Re: Another view on analysis and ER

From: JOG <>
Date: Wed, 5 Dec 2007 04:24:31 -0800 (PST)
Message-ID: <>

On Dec 5, 6:09 am, Ruud de Koter <> wrote:
> JOG wrote:
> > On Dec 4, 11:42 pm, Ruud de Koter <> wrote:
> >> David Cressey wrote:
> >>> "JOG" <> wrote in message
> >>>

> >>>> Genuine question guys. From an E/R perspective (one of the good
> >>>> variants, that allows relationships to have attributes), if I'm faced
> >>>> with the following data.
> >>>> -- Fred married Wilma in Bedrock.
> >>>> -- Barney and Betty married in Paris.
> >>>> How do I decide whether I am dealing with a marriage entity or a
> >>>> marriage relationship?
> >>>> The literature I'm reading here is telling me that the choice is based
> >>>> on what 'things' are key to the business. If my business is concerned
> >>>> with people (tax collection say), 'marriage' is best modelled as a
> >>>> relationship, whereas if the marriages themselves are my focus
> >>>> (perhaps I run a church) then its probably better as an entity.
> >>>> Have I made the right interpretation here, and is there general
> >>>> agreement here? I am much more comfortable seeing that some variants
> >>>> allow relationships to themselves have attributes, and that there is
> >>>> nothing sacred about choices between using relationships or entities,
> >>>> making it a design decision instead.
> >>>> Thanks in advance, J.
> >>> My answer is that it's subjective. If the subject matter experts all treat
> >>> a marriage as a relationship, follow their lead. If the subject matter
> >>> experts all treat it as an entity, follow their lead, but insist that
> >>> there's a key attroubt that identifies it. (Do the SME's have any such
> >>> thing as a "marriage ID" attribute?
> >>> Once you switch over from analysis to design, here's what happens: the
> >>> attributes that you discovered during analyisis and attached to entities or
> >>> relationships will be carried over from your ER model your design model,
> >>> which I presume will be a relational model.
> >>> The entities and relationships themselves, as such, will all disappear!
> >>> Another thing that will carry over is the keys, used to identify instances
> >>> of entities in the subject matter (or UoD if you prefer). Sometimes the
> >>> keys used by the SME (subject matter experts) are a little too informal and
> >>> require "common sense" to disambiguate. That's a special case, and doesn't
> >>> affect this discussion.
> >>> I'm used to expressing a relational model in terms of tables (relational
> >>> tables), but I presume that what follows could be transliterated into terms
> >>> of relvars without any difficulty.
> >>> Each entity will have a table of its own, with the attributes that pertain
> >>> to that entity. The primary key of the table will be the key attribute of
> >>> the entity.
> >>> Each relationship will have a table of its own, except for a few that can
> >>> be piggy backed onto entity tables. The primary key of a relational table
> >>> will consists of two or more foreign keys, compound. These tables will
> >>> automatically be normalized up to 3NF unless your analysis put an attribute
> >>> on the "wrong entity". I'm not sure about normalization beyond 3NF.
> >>> The above process is so automatic that you can have software that does it
> >>> for you. Indeed, that's what several tools do. They express an ER model
> >>> in terms of metadata, and likewise express the relational model in terms of
> >>> metadata. And they have a programmed process that will create a relational
> >>> model from an ER model. The only tool I know calls the relaional model a
> >>> "physical model" and makes it specific to some product like Oracle or DB2,
> >>> etc. But that's a trivial detail. The software also turns models into
> >>> diagrams and/or create scripts for you.
> >>> So where did the entities and relationships go? They disappeared into the
> >>> ether! However,
> >>> when the application people get around to designing screens and reports,
> >>> they can tie each feature back to an original entity or relationship. That
> >>> can make the resulting system coherent for the users.
> >>> I apologize for this repsonse. It's really a lot more than you asked for.
> >>> But it's actually easier to use than it is to describe. It may also
> >>> oversimplify. The relational model that software constructs may not be the
> >>> best relational model that could be designed to deal with the original
> >>> problem.
> >> Very clear, this answer. One minor point I 'd like to add is that it is
> >> not subjective. Instead, the choices are governed by the goal to be
> >> served with the application (assuming the analysis aims at building an
> >> application).


> > Shared data anyone? Isn't the point that we _don't_ necessarily know
> > all the applications?

> >> As clearly stated, there is a difference in perspective
> >> between tax inspectors and priests (and spouses, for that matter). There
> >> simply is no single authorative model for a marriage, there are several
> >> points of view, depending on the universe of discourse one operates in.
> >> What we, in analysis, can do is to make sure we are aware of these UoDs
> >> , and make a conscious choice. That is something else than being subjective.

> > Why make the choice? Keep the data neutral and its good for both tax
> > inspectors and priests right.

> There are two troublesome points in your reaction. First of all 'keep
> the data neutral' doesn't mean no choices are made. Staying neutral is a
> choice as well. One of the hardest choices I 'd say, because in order to
> stay neutral, a thorough knowledge of the universes of discourse is
> necessary. Also, these universes should not be mutually exclusive.

Thats a fair point. Neutrality is something i've promoted for a long time on cdt, and I understand there are issues for processor cycles. However one can still take a single conceptual view of data in analysis and flatten it out in the logical layer. Take David's breakdown of the marriage example for instance - even though it is translated from a single conceptual view, by the time it is in the logical layer data may be extracted from it via the perspective of marriage as an entity, or marriage as a relationship, with equal ease.


> A second point: we can only keep the data neutral if know all possible
> perspectives. It is only then that we can consciously model the data to
> fit all the universes of discourse. Yet, you rightly observe we don't
> necessarily know all the applications, which amounts to saying we don't
> know all the universes of discourse. So choices can not be avoided. In
> that case I 'd much rather make these conscious choices instead of
> keeping up a pretense of neutrality. At the very least we should be
> aware that the model resulting from analysis may be biased, and is not
> the final word on the world out there.

I think maybe we are referring to a slightly different definition of neutrality. I'm suggesting that a logical model should have no bias as to whether things are relationships or entities, and leave that to be determined by the person generating the queries. Regards, J.


> >> Hope this helps,
> >> Ruud de Koter.
Received on Wed Dec 05 2007 - 13:24:31 CET

Original text of this message