Re: Another view on analysis and ER

From: Jan Hidders <>
Date: Wed, 5 Dec 2007 02:49:50 -0800 (PST)
Message-ID: <>

On 5 dec, 02:10, JOG <> wrote:
> On Dec 4, 8:33 pm, "David Cressey" <> wrote:
> > "JOG" <> wrote in message
> >
> > > Genuine question guys. From an E/R perspective (one of the good
> > > variants, that allows relationships to have attributes), if I'm faced
> > > with the following data.
> > > -- Fred married Wilma in Bedrock.
> > > -- Barney and Betty married in Paris.
> > > How do I decide whether I am dealing with a marriage entity or a
> > > marriage relationship?
> > > The literature I'm reading here is telling me that the choice is based
> > > on what 'things' are key to the business. If my business is concerned
> > > with people (tax collection say), 'marriage' is best modelled as a
> > > relationship, whereas if the marriages themselves are my focus
> > > (perhaps I run a church) then its probably better as an entity.
> > > Have I made the right interpretation here, and is there general
> > > agreement here? I am much more comfortable seeing that some variants
> > > allow relationships to themselves have attributes, and that there is
> > > nothing sacred about choices between using relationships or entities,
> > > making it a design decision instead.
> > > Thanks in advance, J.
> > My answer is that it's subjective. If the subject matter experts all treat
> > a marriage as a relationship, follow their lead. If the subject matter
> > experts all treat it as an entity, follow their lead, but insist that
> > there's a key attroubt that identifies it. (Do the SME's have any such
> > thing as a "marriage ID" attribute?
> > Once you switch over from analysis to design, here's what happens: the
> > attributes that you discovered during analyisis and attached to entities or
> > relationships will be carried over from your ER model your design model,
> > which I presume will be a relational model.
> > The entities and relationships themselves, as such, will all disappear!
> > Another thing that will carry over is the keys, used to identify instances
> > of entities in the subject matter (or UoD if you prefer). Sometimes the
> > keys used by the SME (subject matter experts) are a little too informal and
> > require "common sense" to disambiguate. That's a special case, and doesn't
> > affect this discussion.
> > I'm used to expressing a relational model in terms of tables (relational
> > tables), but I presume that what follows could be transliterated into terms
> > of relvars without any difficulty.
> > Each entity will have a table of its own, with the attributes that pertain
> > to that entity. The primary key of the table will be the key attribute of
> > the entity.
> > Each relationship will have a table of its own, except for a few that can
> > be piggy backed onto entity tables. The primary key of a relational table
> > will consists of two or more foreign keys, compound. These tables will
> > automatically be normalized up to 3NF unless your analysis put an attribute
> > on the "wrong entity". I'm not sure about normalization beyond 3NF.
> > The above process is so automatic that you can have software that does it
> > for you. Indeed, that's what several tools do. They express an ER model
> > in terms of metadata, and likewise express the relational model in terms of
> > metadata. And they have a programmed process that will create a relational
> > model from an ER model. The only tool I know calls the relaional model a
> > "physical model" and makes it specific to some product like Oracle or DB2,
> > etc. But that's a trivial detail. The software also turns models into
> > diagrams and/or create scripts for you.
> > So where did the entities and relationships go? They disappeared into the
> > ether! However,
> > when the application people get around to designing screens and reports,
> > they can tie each feature back to an original entity or relationship. That
> > can make the resulting system coherent for the users.
> Ok so one might summarize the following steps:
> 1) initial analysis of business processes and important concepts.
> 2) Formulation of an initial conceptual model (that is necessarily
> slanted to a certain viewpoint of the UoD).

You might want to add here that sometimes in this step you have to integrate several different UoDs that do not necessarily agree on the overlapping parts. I think that is mainly what Ruud is talking about, plus the fact that at this stage you might want to anticipate a bit on other UoDs that might have to be integrated in the future. You could call that "making it more objective" but I think "making it less subjective" is more precise. ;-)

> 3) Translation into a nicely normalized logical model, that's query
> neutral.

Normalization is really only a very minor issue here IMO. I've not had that much personal practical experience in my life but I did work briefly for two big Dutch companies that both had an organization in charge of maintaining the global company data model that integrated all data models from the applications and databases they had. I worked with the guys that did this, and I remember being completely blown away buy how much variation there was in concepts such as employee and order, even within a single company. I still admire these guys.

> 4) On demand, extract data back out from the neutral logical model,
> shaping it either the original conceptual view, or other conceptual
> views as needs arise from new applications.
> Great. This all makes perfect sense, and is very clear to boot. A
> simple process for creating a thorough yet flexible system. It seems
> obvious even, right?
> So why on earth would /anyone/ want to drop step 3? I'm at a loss as
> to why certain cdt'ers (who are clearly intelligent people) seem to be
> advocating this. An absolute loss I tell you.

I'm not sure that is what Ruud is saying. Anyone else?

  • Jan Hidders
Received on Wed Dec 05 2007 - 11:49:50 CET

Original text of this message