Re: Another view on analysis and ER

From: Jon Heggland <jon.heggland_at_ntnu.no>
Date: Wed, 05 Dec 2007 13:54:42 +0100
Message-ID: <fj6737$o2p$1_at_orkan.itea.ntnu.no>


Quoth David Cressey:
> Here's a website I stumbled across:
>
> http://www.islandnet.com/~tmc/html/articles/datamodl.htm
>
> Note that, at the start of the introduction, the author says that analysis
> is the most important part of any project. That's rather different from the
> impression I've gotten in response to my topic on "what is analysis".

Well, that depends on what analysis is. It seems this guy thinks it's the same as data modeling, which in turn is the same as developing a graphical representation of the client's needs and processes. Is it? Furthermore, you could interpret Marshall's and my responses as "we don't do analysis, we just start coding", but I don't think that's what we mean. Myself, I'm skeptical of presenting analysis as a separate, distinct kind of activity, defined by the kinds of artifacts it produces, i.e. "pretty pictures", to use Bob's term.

But I digress. This was what I meant to respond to:

> By the way, I don't like the author's dialect of ER. In particular, his
> topic on "resolving many-to-many relationships" is, I believe, extraneous
> to ER. His reification of a "watering" reminds me of the term "association
> entity" that someone wrote in response to me a few days ago.
>
> In analysis, there is nothing to resolve in a many-to-many relationship.
> You only have to resolve it when you are designing relational tables or
> relvars.

Both yes and no. Reifying relationships can be helpful, but /not/ because "Many-to-many relationships cannot be directly converted into database tables and relationships". The point is rather to make it easier to discover their properties: their attributes, mainly, but potentially also other things, e.g. constraints. When I discover a many-to-many relationship, I usually make it a box, with a name, and ask if there is anything else we want to be able to say about this thing. Often, there is. If there isn't, I can demote it to a line again.
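
A minimal sketch of the line-versus-box idea, in Python. The names (`waters`, `Watering`, `demote`) and the sample attributes (`on`, `litres`) are hypothetical, chosen only to echo the "watering" example; they are not taken from the article under discussion.

```python
from dataclasses import dataclass
from datetime import date

# As a "line": the relationship is nothing more than a set of pairs.
waters = {("ann", "fern"), ("ann", "cactus"), ("bob", "fern")}


# As a "box": the relationship gets a name, which invites the question
# "what else do we want to say about this thing?"
@dataclass(frozen=True)
class Watering:
    gardener: str
    plant: str
    on: date        # hypothetical attribute found by asking that question
    litres: float   # hypothetical attribute


waterings = [
    Watering("ann", "fern", date(2007, 12, 1), 0.5),
    Watering("ann", "fern", date(2007, 12, 4), 0.5),
    Watering("bob", "fern", date(2007, 12, 4), 0.2),
]


def demote(ws):
    """Collapse the box back into a line by dropping its own attributes."""
    return {(w.gardener, w.plant) for w in ws}
```

Here the box form records that Ann watered the fern twice, something the bare pair cannot express. If no such attributes had turned up, `demote` shows that nothing is lost by going back to the line.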

This mainly applies to many-to-many relationships, because business rules / attributes / constraints regarding a one-to-many relationship are often better relegated to the entity on the many-side (though not always, of course). It has little to do with the implementation (or design?) of many-to-many relationships in relational databases.

Some might argue that reifying relationships is unnecessary, since relationships in "good" E/R dialects can have attributes. What, then, is the difference between an entity and a relationship? The best answer I can think of is that an entity is identified by itself, while a relationship is identified by its entities. But what if something has more than one way of identification (i.e. multiple keys)? This is where classic E/R breaks down for me. A "relationship" may be identified by its entities, but also by (say) just one of its entities in combination with a subset of its attributes. And/or perhaps a subset of its attributes, disregarding any entities. Is it then a relationship, a weak entity, or an entity?
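
To make the multiple-keys case concrete, here is a small sketch using Python's standard `sqlite3` module. The `enrollment` table and its columns are hypothetical, invented for illustration. It has three candidate keys: both of its entities, one entity plus an attribute, and an attribute alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE enrollment (
        student    TEXT    NOT NULL,  -- entity
        course     TEXT    NOT NULL,  -- entity
        seat_no    INTEGER NOT NULL,  -- attribute
        receipt_no TEXT    NOT NULL,  -- attribute
        UNIQUE (student, course),     -- identified by its entities
        UNIQUE (course, seat_no),     -- by one entity plus an attribute
        UNIQUE (receipt_no)           -- by an attribute alone
    )
""")
conn.execute("INSERT INTO enrollment VALUES ('ann', 'db101', 1, 'R-001')")


def rejected(row):
    """Return True if inserting `row` violates one of the candidate keys."""
    try:
        conn.execute("INSERT INTO enrollment VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError:
        return True
    return False
```

None of the three keys is privileged here, so nothing in the declaration itself says whether `enrollment` is a relationship, a weak entity, or an entity.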

This is turning into a rant against the classic(?) E/R notation, but here goes anyway. I think it's a bad idea that more than one kind of thing can have attributes. I think it's a bad idea that there are two (or more) different ways of indicating how something is identified. Relationship diamonds are required for non-binary relationships, but are just clutter for binary ones, which is another bad idea.

Fortunately, there is (at least) one E/R dialect that resolves all these issues, and in so doing, even makes the distinction between entities and relationships far less important.

Apropos this distinction: As to whether marriage is a relationship or an entity, you said that one should listen to the subject matter experts. I have never had such an expert say to me, "No, that's not a relationship, that's an entity!" or vice versa. Have you?

-- 
Jon
Received on Wed Dec 05 2007 - 13:54:42 CET
