Entity and Identity

From: Walter Mitty <wamitty_at_verizon.net>
Date: Mon, 20 Jul 2009 15:51:08 GMT
Message-ID: <Mn09m.190$646.6_at_nwrddc01.gnilink.net>

I've known for quite some time that better minds than mine have gone astray in the attempt to overcome the object relational mismatch. Just last week, I ran across an article that outlines the O-R mapping problem better than I ever could.

The article is called "The Vietnam of Computer Science", and here's a pointer:


The article devotes rather too much time to exploring the analogy between the Vietnam war and ORM attempts. And as the article admits, all analogies eventually fail. Leaving that aside, I think the survey of problems encountered in crossing the divide is excellent.

I want to draw particular attention to a heading entitled "Entity Identity Issues". Reading this section gave me a better understanding of the disconnect between me and Brian Seltzer over matters concerning entity and identity. My own view of identity is colored by my own experience. And that experience includes some practical work with relational databases, preceded by a little formal learning in that area and 20 years of work as a programmer. Unfortunately, none of that work included object oriented programming.

Anyway, my view of identity (or of identification, if you prefer) is that an object's state is all we have to go on as the basis for identification. In particular, an object's location (as specified by a pointer) or its trajectory (a history of pointers over time) are unavailable for purposes of identification. This view of identity fits pretty comfortably into the relational model, but it runs afoul of object oriented thinking in at least two important ways. First, if an object can conceal part of its state (encapsulation), then it necessarily can conceal some of what needs to be known to identify it. Second, if two objects are identical in state, then they are the same object, even if they differ in location (at the same point in time). I'll call this the "Doppelganger effect".
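
For readers who think better in code, here is a rough sketch of the two notions of identity in Python. The language, the Person class, and its fields are illustrative assumptions, not anything taken from the article. Comparing by state treats the two objects as one; comparing by location treats them as two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # Hypothetical entity; the class and its fields are illustrative only.
    name: str
    birth_year: int


a = Person("Ada", 1815)
b = Person("Ada", 1815)  # a second object with exactly the same state

same_state = (a == b)      # compares field values: the state-based view
same_location = (a is b)   # compares object identity: the location-based view

# A set keeps one member per distinct value, so the two "doppelgangers"
# collapse into a single element, much as a relation holds no duplicate tuples.
distinct_by_state = len({a, b})

print(same_state, same_location, distinct_by_state)
```

Under the state-based view the two Persons are one and the same; an object oriented language still lets the program tell them apart by where they live.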

When I see an SQL table containing two rows that cannot be distinguished by their contents, my reaction is that the database designer made a mistake. Failing that, the people updating the database should have been more careful. Cases where duplication is intentional and carries significant information strike me as a misuse of SQL, and a misunderstanding of the relational model.
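
A minimal sketch of that situation, assuming Python's built-in sqlite3 module and two made-up tables named loose and strict. A table declared without a key happily accepts indistinguishable rows; declaring a key makes the DBMS reject the second copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with no key: SQL accepts two indistinguishable rows.
conn.execute("CREATE TABLE loose (name TEXT, birth_year INTEGER)")
conn.execute("INSERT INTO loose VALUES ('Ada', 1815)")
conn.execute("INSERT INTO loose VALUES ('Ada', 1815)")
loose_count = conn.execute("SELECT COUNT(*) FROM loose").fetchone()[0]

# Same columns, but with a declared key: the duplicate insert is refused.
conn.execute(
    "CREATE TABLE strict ("
    "name TEXT, birth_year INTEGER, PRIMARY KEY (name, birth_year))"
)
conn.execute("INSERT INTO strict VALUES ('Ada', 1815)")
try:
    conn.execute("INSERT INTO strict VALUES ('Ada', 1815)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # Raised by sqlite3 when a key or uniqueness constraint is violated.
    duplicate_rejected = True
strict_count = conn.execute("SELECT COUNT(*) FROM strict").fetchone()[0]

print(loose_count, duplicate_rejected, strict_count)
conn.close()
```

The key here spans every column, which is the bluntest way to say "rows are identified by their contents"; a real design would normally pick a smaller key.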

The above doesn't pretend to explain Brian's view. But I think it sheds a little light on why I see things the way I do.

Again, I recommend the article cited above.

Received on Mon Jul 20 2009 - 17:51:08 CEST
