Monday, 1 December 2008

Implementation inheritance is like playing Russian roulette

Short version:

Implementation inheritance breaks encapsulation and leads to the Fragile Base Class (FBC) problem.

Lil' longer version:

Here is the list of buzzwords I'm going to use throughout this post: implementation inheritance, method precondition, method postcondition, class invariant, encapsulation, Liskov substitution principle. Let us give them some "loose definitions":

  • implementation inheritance is when you use class inheritance as a mechanism for reusing behaviour. This concept is opposed to "interface inheritance", by which a class just inherits "the contract" it has to comply with.
  • method preconditions are the set of conditions that the caller of a method needs to ensure before actually calling it (e.g. argument 'x' is greater than 0, argument 'parent' is not null, etc).
  • method postconditions are the guaranteed results of a call to a method (either as a return value or as its effect on the state of the object it belongs to).
  • class invariant is the set of constraints that express what it means for an object of that class to be in a consistent state (e.g. the 'width' attribute of an instance of class 'Window' must not be less than zero).
  • encapsulation (a.k.a. information hiding) is the principle by which a class "hides" anything (state/behaviour) that other classes do not really need to know, thus reducing their dependence on that class to a bare minimum. This way, you reduce the probability of having to change the former when the latter is modified.
  • Liskov Substitution Principle (LSP) (closely related to Design by Contract) states that it should be possible to treat an instance of a subclass as if it were a base class object, meaning a subclass must comply with everything that could be expected from its parent.
Now let's ask some questions about these terms so we can get some ideas...

First question: How can the developer "communicate" the preconditions and postconditions of a method?

You can choose among some options to let the reader know the pre/postconditions of a method:
  • Using language facilities: type constraints (e.g. if a parameter must be between 0 and 65535 you can set it to be an unsigned short in C++), assertions, etc.
  • Using comments: method headers usually express in natural language what cannot be expressed using programming language syntax (e.g. /* this class is not thread safe */ ).
  • Not communicating them: either because you don't feel it necessary or because you simply do not know about them (in this case the pre/postconditions still exist, they are just implicit in the method body).
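
As a rough illustration of the first two options, here is a minimal Python sketch. The 'Window' class, its 'grow' method and the exact checks are made up for this example; the point is only to show conditions stated in code (checks and assertions) next to conditions that can only live in a comment.

```python
class Window:
    """Hypothetical example class: a window whose width is never negative."""

    def __init__(self, width: int) -> None:
        # Precondition stated in code: the caller must pass width >= 0.
        if width < 0:
            raise ValueError("width must be >= 0")
        self._width = width

    def _check_invariant(self) -> None:
        # Class invariant: width must not be less than zero.
        assert self._width >= 0, "invariant violated: negative width"

    def grow(self, delta: int) -> int:
        """Widen the window by `delta` and return the new width.

        Precondition:  delta > 0 (enforced below).
        Postcondition: new width == old width + delta (asserted below).
        Note: this class is not thread safe (only a comment can say that).
        """
        if delta <= 0:
            raise ValueError("delta must be > 0")
        old_width = self._width
        self._width += delta
        assert self._width == old_width + delta  # postcondition
        self._check_invariant()
        return self._width
```

Anything not written down here (say, that 'grow' must not be called while the window is being drawn) would fall under the third option: it still exists, but only implicitly.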

Second question: What does LSP actually mean when it says a subclass must comply with everything that could be expected from its parent?

To call a method of an object, you first prepare its arguments as it expects them (preconditions), then perform the call, and after that you expect some sort of result from the invocation (postconditions), either a returned value or a change in the object's state. If such a call were issued to an instance of a subclass, the mentioned preconditions and postconditions should still be perfectly valid.

Third question: So what is the relationship between the pre/postconditions and invariant of a class and its subclasses?
  • Preconditions of a subclass should not be stronger than its parent's.
  • Postconditions of a subclass should not be weaker than its parent's.
  • Invariant of a subclass should not be weaker than its parent's.
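
To make the first rule concrete, here is a small Python sketch of a subclass that strengthens a precondition. All names ('Account', 'CappedAccount', 'pay_rent') and the 100 limit are invented for the example; it is just one possible way of showing the problem.

```python
class Account:
    """Hypothetical base class.

    Contract of withdraw(): precondition 0 < amount <= balance,
    postcondition: balance is reduced by exactly `amount`.
    """

    def __init__(self, balance: int) -> None:
        self.balance = balance

    def withdraw(self, amount: int) -> None:
        if not 0 < amount <= self.balance:
            raise ValueError("amount must be in (0, balance]")
        self.balance -= amount


class CappedAccount(Account):
    """Breaks LSP: adds the extra precondition amount <= 100."""

    def withdraw(self, amount: int) -> None:
        if amount > 100:
            raise ValueError("amount must be <= 100")
        super().withdraw(amount)


def pay_rent(account: Account, rent: int) -> bool:
    """Client code written against Account's contract only."""
    if 0 < rent <= account.balance:  # the client ensures the base precondition
        account.withdraw(rent)       # so, per the contract, this must succeed
        return True
    return False
```

With a plain 'Account(500)', 'pay_rent(account, 200)' returns True. With a 'CappedAccount(500)' the very same call raises ValueError, although the caller did everything the base class asked for.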

Fourth question: But that means a subclass is also responsible for maintaining its parent's pre/postconditions and invariant, isn't it?

Yes, exactly. That's precisely the point. This is the very reason why implementation inheritance is said to break encapsulation, as the subclass has to know details of the implementation of its base class.

Fifth question: I guess such responsibility can be troublesome upon code changes, am I right?

You are quite right. Every change in the superclass's pre/postconditions or invariant, either explicit (due to changes in method signatures) or implicit (due to changes in the code), forces you to re-verify the pre/postconditions and invariant of each subclass. To perform that verification properly, the coder needs to fully understand the intent of the code of both the base class and the derived one, which is not likely to happen (even if the same person wrote both classes); this is why modifications to the base class often break subclasses. This is known as the "Fragile Base Class problem".
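
A small Python sketch of how this can happen. Everything here is hypothetical: 'BagV1' and 'BagV2' stand for the same base class before and after a harmless-looking maintenance change, and 'counting_bag' just builds the same subclass on top of either version.

```python
class BagV1:
    """Base class as originally shipped."""

    def __init__(self) -> None:
        self._items = []

    def add(self, item) -> None:
        self._items.append(item)

    def add_all(self, items) -> None:
        # Undocumented detail: add_all() happens to call add() per item.
        for item in items:
            self.add(item)

    def __len__(self) -> int:
        return len(self._items)


class BagV2:
    """Same public behaviour, but add_all() no longer goes through add()."""

    def __init__(self) -> None:
        self._items = []

    def add(self, item) -> None:
        self._items.append(item)

    def add_all(self, items) -> None:
        self._items.extend(items)  # internal "optimisation"

    def __len__(self) -> int:
        return len(self._items)


def counting_bag(base):
    """Build the same subclass on top of either version of the base."""

    class CountingBag(base):
        """Counts insertions, silently relying on add_all() calling add()."""

        def __init__(self) -> None:
            super().__init__()
            self.added = 0

        def add(self, item) -> None:
            self.added += 1
            super().add(item)

    return CountingBag


old = counting_bag(BagV1)()
old.add_all([1, 2, 3])   # old.added is 3, as the subclass author expected

new = counting_bag(BagV2)()
new.add_all([1, 2, 3])   # new.added stays 0, although len(new) is 3
```

Nothing in the signature of 'add_all' changed and the base class still behaves correctly on its own; only the implicit condition the subclass depended on is gone.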

Sixth question: How does this stuff apply to interface inheritance?

With interface inheritance you also have to comply with the pre/postconditions of the interface you implement. In this case, however, there are no implicit conditions hidden within code (as there is no code at all): everything you have to comply with is expressed either with language constructs or as comments in method headers. When method signatures change, implementors have to change as well (otherwise they won't compile).
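
For comparison, a minimal Python sketch of interface inheritance using the standard 'abc' module. The 'Shape', 'Square' and 'Broken' names are invented for the example. Note that Python is not compiled in the sense used above: an incomplete implementor is rejected when you try to instantiate it, which is the closest equivalent.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Hypothetical interface: no behaviour to inherit, only the contract."""

    @abstractmethod
    def area(self) -> float:
        """Postcondition: the returned value is >= 0."""


class Square(Shape):
    """Implements the whole contract stated by Shape."""

    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


class Broken(Shape):
    """Forgets to implement area(), so it cannot be instantiated."""


def try_to_build_broken() -> bool:
    """Return True if Broken() could be created, False if Python refused."""
    try:
        Broken()
    except TypeError:
        return False
    return True
```

Here the only things 'Square' can depend on are the method signature and the docstring, so there is no inherited code whose internals could change underneath it.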

Seventh question: Your points seem to rely on the correctness of LSP, is it kinda dogma or what?

In Uncle Bob's words: "It is only when derived types are completely substitutable for their base types that functions which use those base types can be reused with impunity, and the derived types can be changed with impunity". Sure, you can violate LSP, but not if you want to achieve the code reuse promised by object-oriented programming.

Summing up....

Implementation inheritance breaks encapsulation because, to comply with LSP, a subclass has to know every precondition and postcondition (explicit or implicit) of its parent, plus its invariant. This knowledge is likely to be flawed, because some pre/postconditions exist without being stated anywhere, and all of them can change during the lifecycle of the project (e.g. in the maintenance phase). The result is the Fragile Base Class problem, in which modifications to the base class break subclasses.

1 comment:

  1. nice post, thx for reminding us that LSP is not about syntax but also (and probably most importantly) about semantics :) I guess preconditions, postconditions and invariants are the one thing I'm missing the most in the OO languages used in the mainstream.