Entity and Identity
I've known for quite some time that better minds than mine have gone astray
in the attempt to overcome the object-relational mismatch. Just last week,
I ran across an article that outlines the O-R mapping problem better than I
ever could.
The article is called "The Vietnam of Computer Science", and here's a
pointer:
http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
The article devotes rather too much time to exploring the analogy between
the Vietnam war and ORM attempts. And as the article admits, all analogies
eventually fail. Leaving that aside, I think the survey of problems
encountered in crossing the divide is excellent.
I want to draw particular attention to a heading entitled "Entity Identity
Issues". Reading this section gave me a better understanding of the
disconnect between me and Brian Seltzer over matters concerning entity and
identity. My own view of identity is colored by my own experience. And
that experience includes some practical work with relational databases,
preceded by a little formal learning in that area and 20 years of work as a
programmer. Unfortunately, none of that work included object-oriented
programming.
Anyway, my view of identity (or of identification, if you prefer) is that an
object's state is all we have to go on as the basis for identification. In
particular, an object's location (as specified by a pointer) or its
trajectory (a history of pointers over time) are unavailable for purposes of
identification. This view of identity fits pretty comfortably into the
relational model, but it runs afoul of object-oriented thinking in at least two
important ways. First, if an object can conceal part of its state
(encapsulation), then it necessarily can conceal some of what needs to be
known to identify it. Second, if two objects are identical in state, then
they are the same object, even if they differ in location (at the same point
in time). I'll call this the "Doppelganger effect".
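
To make the distinction concrete, here is a minimal sketch in Python. It is
only an illustration, not anything taken from the article: the Customer class
and its fields are hypothetical. It contrasts comparison by state (==) with
comparison by location (is), which is roughly where the two views part ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    # Hypothetical entity; the field names are illustrative only.
    name: str
    city: str


# Two objects built separately, with identical state.
a = Customer("Ada", "London")
b = Customer("Ada", "London")

# State-based comparison: the generated __eq__ compares field by field.
same_state = (a == b)

# Location-based comparison: are these the very same object in memory?
same_location = (a is b)

# A set keeps one element per distinct value, so the two collapse into one.
distinct_in_set = len({a, b})

print(same_state, same_location, distinct_in_set)
```

Under the state-based view, a and b are one entity, which is how the set
treats them. Under the location-based view they are two, and that is the
"Doppelganger" case.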
When I see an SQL table with two different rows in one table that cannot be
distinguished by their contents, my reaction is that the database designer
made a mistake. Failing that, the database updaters should have been
more careful. Cases where duplication is intentional and carries
significant information strike me as a misuse of SQL, and a misunderstanding
of the relational model.
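
As a rough illustration of the duplicate-row case, the sketch below uses
Python's built-in sqlite3 module with an in-memory database. The table and
column names (loose, strict, name, city) are made up for the example. A table
declared without a key accepts two indistinguishable rows; declaring a key
over the identifying columns makes the database reject the second one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# No key declared: SQL accepts two rows that cannot be told apart.
conn.execute("CREATE TABLE loose (name TEXT, city TEXT)")
conn.execute("INSERT INTO loose VALUES ('Ada', 'London')")
conn.execute("INSERT INTO loose VALUES ('Ada', 'London')")
loose_count = conn.execute("SELECT COUNT(*) FROM loose").fetchone()[0]

# Key declared over the identifying columns: the duplicate is refused.
conn.execute(
    "CREATE TABLE strict (name TEXT, city TEXT, PRIMARY KEY (name, city))"
)
conn.execute("INSERT INTO strict VALUES ('Ada', 'London')")
try:
    conn.execute("INSERT INTO strict VALUES ('Ada', 'London')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
strict_count = conn.execute("SELECT COUNT(*) FROM strict").fetchone()[0]

print(loose_count, duplicate_rejected, strict_count)
conn.close()
```

The first table is the situation described above as a design mistake; the
second shows one way a designer can rule it out, so that careful updating is
not the only safeguard.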
The above doesn't pretend to explain Brian's view. But I think it sheds a
little light on why I see things the way I do.
Again, I recommend the article cited above.
Received on Mon Jul 20 2009 - 17:51:08 CEST