Java Posse, .Equals(), Inheritance, and the Liskov Substitution Principle October 29, 2007 | 06:38 am

Continuing the conversation had by Raganwald, me, and a lot of the rest of the blogosphere:

The Java Posse Responds to My #Equals Issue
(If you don’t know the Java Posse, check out their website/blog/podcast. It’s good stuff.)

The answer they gave was an interesting one, which I’m still sitting on and working in my head. Here is Martin OConnor’s response, for those too lazy to read the (very interesting) thread:

Your class hierarchy is flawed. ThreeDeePoint does not naturally extend TwoDeePoint, because a Three Dimiensional point is not a kind of Two Dimensional point. They represent two very distinct concepts. Essentially, when you extend TwoDeePoint with ThreeDeePoint, you are saying that ThreeDeePoint is a special kind of TwoDeePoint. I would submit, therefore that the problems you are observing are less due to the impelmentation of Object.equals(), and more as a result of your chosen class hierarchy design.
[...]
What I was trying to explain, (although poorly), or more precisely hinting at, is The Liskov Substitution Principle. This is an Object Oriented design principle that states that any reference to a base class should be replaceable with a reference to a derived class without affecting functionality. What my previous post had attempted to say is that since the TwoDeePoint and ThreeDeePoint classes do not adhere to The Liskov Substitution Principle, the design, in this particular case is flawed.

(For those that don’t remember my “TwoDeePoint” and “ThreeDeePoint” example, check my post, “A Java Gotcha”, where I first get into the #equals/#compareTo issue.)

So, is it invalid for a 3DPoint to inherit from a 2DPoint? The general way this is phrased is “is it legitimate to say 3DPoint is-a 2DPoint”? Well, that depends on what your definition of “is-a” is.

After all, the semantics behind a 3DPoint imply it is a 2DPoint, as it has all the same qualities of a 2DPoint: it simply adds the idea of a third dimension to the existing definition. Semantically, a three dimensional point is a two dimensional point residing on a particular point in the three-dimensional axis, and it has all the capabilities of a two dimensional point on that particular plane.

On the other hand, the Liskov Substitution Principle gives a much stricter definition of “is-a”. I’d like to look at a colloquial definition of the principle, and then a formal definition, because there is a huge gulf between these things.

Colloquially, it basically says that Foo should not inherit from Bar unless Foo is-a-specialization-of Bar: that is, Foo is a limitation upon Bar. So the inheritance in that example is invalid: 3DPoint is not taking 2DPoint and limiting it, but it’s taking 2DPoint and extending it.

Adopting this approach to inheritance will substantially limit its use, and is going to lead to some repetitive code — in my example, all the 2DPoint methods would have to be re-implemented as 3DPoint methods. But it does side-step the #equals/#compareTo problem, in that a specialization will have to rely on the same #equals/#compareTo definition as its parent class.

And that’s where the rub is. Using the cite by way of Wikipedia, let’s look at the formal definition of this principle:

Let q(x) be a property provable about objects x of type T. Then q(y) should be true for objects y of type S where S is a subtype of T.

My rub is with the phrase “a property provable about”. Consider a type of a matrix (M) and a diagonally symmetrical matrix (S). Specifically, there exists M#setXY that sets the item at [X,Y] to a vaue. In S#setXY, we override it so that it also sets the item at [Y,X] to the same value as [X,Y] was set to (thereby maintaining symmetry). Now, S is a colloquial specialization of M, but it could be a violation of the substution principle, since it changes the postcondition of M#setXY: is the fact that the rest of the matrix remains unchanged in M#setXY constitute “a provable property”?

Assuming that “a provable property” is basically analogous to the precondition/postcondition properties, I suppose there is no reason that M#setXY has to specify that the rest of the matrix remains unchanged. What we would basically be arguing in this case is a very liberal interpretation of the role of preconditions/postconditions: the postconditions of M#setXY need not specify everything that M#setXY does, and the preconditions of M#setXY might not need be all true.

If this is the case, then calling a method becomes a very, very scary thing. It is now legitimate for me to implement M#setXY to set every item in the matrix to NaN, delete your hard drive, send out 3000 spam mails a second, and make some coffee. As long as M#setXY ends by setting [X,Y] to the given value, it is an acceptable implementation of inheritance.

So, if we deny this liberal interpretation, and then S is not a legitimate case for inheritance of M, because S#setXY does something which M#setXY doesn’t specify. So the specialization we’re talking about by way of the substitution princple is extremely limiting: basically, all you can do is throw a different sub-class of already-declared exceptions, return subclasses of already-declared return values, and add methods which do not intersect/interefere with the provable properties of any other method, including #equals/#compareTo.

This limits inheritance to very few cases. Certainly, the implementations of abstract class are allowed, assuming that the abstract class leaves enough breathing room (e.g. vague enough declared exceptions) for an underlying implementation. This can be seen, for instance, in Java’s AbstractCollection. Anything which tacks on a piece of non-identifying information or non-interfering functionality works just fine, too, although I’m having trouble coming up with a good example for either of those.

Some people older and wiser than me once asserted that no inheritance should be done on anything other than classes explicitly defined for inheritance (i.e. abstract classes). At the time, I thought that was extreme, and so I discounted the position. I’m increasingly starting to understand it, though, and I think I may be signing on.

Tags: ,

  • Brian

    You still hit the problem if you change your implementation so that a 2D point “is a” 3D point. Anything you do with a 3D point you can do with a 2D point, by assuming Z = 0.

    The problem is that there are operations- and compare is one of them, that need information “outside” the class- information that is not defined yet when you define the class. And your design has to be aware of this.

    In other words, don’t use compare, use a Comparator.

  • http://www.linkedin.com/in/robertfischer Robert Fischer

    What’s “the problem” you hit if you treat 2D point as a specialization of a 3D point?

  • Bruno De Fraine

    During my university career, I received an OO course from a professor that was very inspired by the Eiffel language and the design by contract approach. We defined the Liskov Substitution Principle as the property that a subtype should have a contract stronger than (the combination of) the contract(s) of it(s) supertype(s). This is a necessary property to have sound OO polymorphism, i.e. to be able to use a value of the subtype where a supertype is expected without breaking assumptions.

    If you rigorously define contracts for your classes, this will often mean that superclasses need to have a contract that is underspecified (non-deterministic) and/or that foresee the functionality of the subclasses. In your example, the postcondition of M#setXY(x,y,v) could be underspecified and just say: “it holds that M#getXY(x,y) == v and M#getXY(x’,y’) where (x’,y’) (x,y) is undefined”. Or it could foresee the possibility of both a symmetrical and regular matrix by saying: “it holds that M#getXY(x,y) == v, and forall x’,y’ where (x,y) (x’,y’) it holds that M#getXY(x’,y’) == oldM#getXY(x’,y’) except when (x’,y’) == (y,x) where it holds that either M#getXY(x’,y’) == oldM#getXY(x’,y’) or M#getXY(x’,y’) == v”.

    To choose between the two versions is the trade-off between reusability and more guarantees for the clients: the second version of the matrix contract will only allow either a symmetrical matrix or a general matrix as subclasses, and no other, while the first version does, but its clients cannot make any assumptions about the other values of M after calling setXY.

    Note 1: It is assumed that the things that you do not specify in your postcondition keep their old state, so it is not possible for M#setXY to make coffee.
    Note 2: This school of OO tends to allow inheritance only of abstract superclasses, since they are the only ones with a contract that can be further specified.

    I haven’t read your article about the equals() method, but generally there is only a single well-defined type of equality for two objects of exactly the same type. For example, is 3D point (1,2,0) equal to 2D point (1,2)? Only if you define equality as having the same position in the 3D space, with 2D points positioned in the plane Z=0. But there are other equality relations as well. The equals() method forces you to use a single equality relation, but which one that should be is almost a philosophical discussion.

  • http://www.linkedin.com/in/robertfischer Robert Fischer

    The quality relationship between 2D point and 3D point is a good example: for instance, is a 2D point still a point in 3D? It could very well be a line, representing all the points (x,y,z) where (x,y) == the 2D point. This is the equality which would result if you had the 3D point delegate to the 2D point’s equal method when dealing with 2D points.

    I’m increasingly convinced by Brian’s assertion that having a global “#equals(Object)” method is simply a bad idea for anything but the most academic cases. Sometimes you can define that method on subsets of the object hierarchy (numbers, for instance), but that’s not even near the most common case.

    Java whitewashes some of these problems by providing a largely worthless definition of “#equals(Object)” (referential equality) as the default one, but the problem is definitely alive and well.

  • http://www.linkedin.com/in/robertfischer Robert Fischer

    Fixed the links in this post, BTW.

  • Pingback: Enfranchised Mind » JConch 0.3 Released

  • Pingback: Enfranchised Mind » 7 Actually Useful Things You Didn’t Know Static Typing Could Do: An Introduction for the Dynamic Language Enthusiast

  • http://hamletdarcy.blogspot.com Hamlet D’Arcy

    I know you’ve seen this, but for any future readers:
    http://weblog.raganwald.com/2008/04/is-strictly-equivalent-to.html

    The fact that Scott Myers identified 12 different types of inheritance makes it very difficult to talk about inheritance in general while supplying specific equals() implementations.

    I like Brian’s suggestion best… there are many different scenarios where you may use something like 3D and 2D points. Baking equality into the objects commits to a very concrete implementation of what equality is. Best to externalize that decision in a Comparator.

  • Pingback: Upped The Recent Post/Popular Post Widget Count | Enfranchised Mind

  • Pingback: 7 Actually Useful Things You Didn’t Know Static Typing Could Do: An Introduction for the Dynamic Language Enthusiast