Re: Another view on analysis and ER

From: Ruud de Koter <>
Date: Fri, 07 Dec 2007 22:59:32 +0100
Message-ID: <4759c23c$0$243$>

JOG wrote:

> On Dec 5, 6:09 am, Ruud de Koter <> wrote:
>> JOG wrote:
>>> On Dec 4, 11:42 pm, Ruud de Koter <> wrote:
>>>> David Cressey wrote:

>>>>> "JOG" <> wrote in message
>>>>>> Genuine question guys. From an E/R perspective (one of the good
>>>>>> variants, that allows relationships to have attributes), if I'm faced
>>>>>> with the following data.
>>>>>> -- Fred married Wilma in Bedrock.
>>>>>> -- Barney and Betty married in Paris.
>>>>>> How do I decide whether I am dealing with a marriage entity or a
>>>>>> marriage relationship?
>>>>>> The literature I'm reading here is telling me that the choice is based
>>>>>> on what 'things' are key to the business. If my business is concerned
>>>>>> with people (tax collection say), 'marriage' is best modelled as a
>>>>>> relationship, whereas if the marriages themselves are my focus
>>>>>> (perhaps I run a church) then its probably better as an entity.
>>>>>> Have I made the right interpretation here, and is there general
>>>>>> agreement here? I am much more comfortable seeing that some variants
>>>>>> allow relationships to themselves have attributes, and that there is
>>>>>> nothing sacred about choices between using relationships or entities,
>>>>>> making it a design decision instead.
>>>>>> Thanks in advance, J.

>>>>> My answer is that it's subjective. If the subject matter experts all treat
>>>>> a marriage as a relationship, follow their lead. If the subject matter
>>>>> experts all treat it as an entity, follow their lead, but insist that
>>>>> there's a key attroubt that identifies it. (Do the SME's have any such
>>>>> thing as a "marriage ID" attribute?
>>>>> Once you switch over from analysis to design, here's what happens: the
>>>>> attributes that you discovered during analyisis and attached to entities or
>>>>> relationships will be carried over from your ER model your design model,
>>>>> which I presume will be a relational model.
>>>>> The entities and relationships themselves, as such, will all disappear!
>>>>> Another thing that will carry over is the keys, used to identify instances
>>>>> of entities in the subject matter (or UoD if you prefer). Sometimes the
>>>>> keys used by the SME (subject matter experts) are a little too informal and
>>>>> require "common sense" to disambiguate. That's a special case, and doesn't
>>>>> affect this discussion.
>>>>> I'm used to expressing a relational model in terms of tables (relational
>>>>> tables), but I presume that what follows could be transliterated into terms
>>>>> of relvars without any difficulty.
>>>>> Each entity will have a table of its own, with the attributes that pertain
>>>>> to that entity. The primary key of the table will be the key attribute of
>>>>> the entity.
>>>>> Each relationship will have a table of its own, except for a few that can
>>>>> be piggy backed onto entity tables. The primary key of a relational table
>>>>> will consists of two or more foreign keys, compound. These tables will
>>>>> automatically be normalized up to 3NF unless your analysis put an attribute
>>>>> on the "wrong entity". I'm not sure about normalization beyond 3NF.
>>>>> The above process is so automatic that you can have software that does it
>>>>> for you. Indeed, that's what several tools do. They express an ER model
>>>>> in terms of metadata, and likewise express the relational model in terms of
>>>>> metadata. And they have a programmed process that will create a relational
>>>>> model from an ER model. The only tool I know calls the relaional model a
>>>>> "physical model" and makes it specific to some product like Oracle or DB2,
>>>>> etc. But that's a trivial detail. The software also turns models into
>>>>> diagrams and/or create scripts for you.
>>>>> So where did the entities and relationships go? They disappeared into the
>>>>> ether! However,
>>>>> when the application people get around to designing screens and reports,
>>>>> they can tie each feature back to an original entity or relationship. That
>>>>> can make the resulting system coherent for the users.
>>>>> I apologize for this repsonse. It's really a lot more than you asked for.
>>>>> But it's actually easier to use than it is to describe. It may also
>>>>> oversimplify. The relational model that software constructs may not be the
>>>>> best relational model that could be designed to deal with the original
>>>>> problem.
>>>> Very clear, this answer. One minor point I 'd like to add is that it is
>>>> not subjective. Instead, the choices are governed by the goal to be
>>>> served with the application (assuming the analysis aims at building an
>>>> application).
>>> Shared data anyone? Isn't the point that we _don't_ necessarily know
>>> all the applications?
>>>> As clearly stated, there is a difference in perspective
>>>> between tax inspectors and priests (and spouses, for that matter). There
>>>> simply is no single authorative model for a marriage, there are several
>>>> points of view, depending on the universe of discourse one operates in.
>>>> What we, in analysis, can do is to make sure we are aware of these UoDs
>>>> , and make a conscious choice. That is something else than being subjective.
>>> Why make the choice? Keep the data neutral and its good for both tax
>>> inspectors and priests right.
>> There are two troublesome points in your reaction. First of all 'keep
>> the data neutral' doesn't mean no choices are made. Staying neutral is a
>> choice as well. One of the hardest choices I 'd say, because in order to
>> stay neutral, a thorough knowledge of the universes of discourse is
>> necessary. Also, these universes should not be mutually exclusive.
> Thats a fair point. Neutrality is something i've promoted for a long
> time on cdt, and I understand there are issues for processor cycles.
> However one can still take a single conceptual view of data in
> analysis and flatten it out in the logical layer.  Take David's
> breakdown of the marriage example for instance - even though it is
> translated from a single conceptual view, by the time it is in the
> logical layer data may be extracted from it via the perspective of
> marriage as an entity, or marriage as a relationship, with equal ease.

Re-reading your posts gave me a better understanding. I see how flattening-out may help to make the model more accessible for other UoDs then the one that was used in the original analysis. Nevertheless, I keep thinking this accessibility is only a surface matter: any attributes that were not part of the analysis will not be in the logical model. To stay with the example: the IRS may get at the marriage data, but it will probably want to know some specific legal information that determines how the spouse's taxes will be dealt with. It will not be that hard to add columns in a relational model to represent these attributes (an obvious and classical bonus of the relational approach).

>> A second point: we can only keep the data neutral if know all possible
>> perspectives. It is only then that we can consciously model the data to
>> fit all the universes of discourse. Yet, you rightly observe we don't
>> necessarily know all the applications, which amounts to saying we don't
>> know all the universes of discourse. So choices can not be avoided. In
>> that case I 'd much rather make these conscious choices instead of
>> keeping up a pretense of neutrality. At the very least we should be
>> aware that the model resulting from analysis may be biased, and is not
>> the final word on the world out there.
> I think maybe we are referring to a slightly different definition of
> neutrality. I'm suggesting that a logical model should have no bias as
> to whether things are relationships or entities, and leave that to be
> determined by the person generating the queries. 

Yes, we are clearly using a different definition of neutrality. I must have missed that. Apparently my background as a political scientist still cuts in from time to tome.


Ruud de Koter.

Regards, J.

>>>> Hope this helps,
>>>> Ruud de Koter.
Received on Fri Dec 07 2007 - 22:59:32 CET

Original text of this message