Re: Another view on analysis and ER

From: Brian Selzer <>
Date: Wed, 05 Dec 2007 17:09:37 GMT
Message-ID: <lXA5j.28161$>

"Bob Badour" <> wrote in message news:4756d5e5$0$5261$
> David Cressey wrote:
>> "Jon Heggland" <> wrote in message
>> news:fj6737$o2p$
>>>Quoth David Cressey:
>>>>Here's a website I stumbled across:
>>>>Note that, at the start of the introduction, the author says that analysis
>>>>is the most important part of any project. That's rather different from the
>>>>impression I've gotten in response to my topic on "what is analysis".
>>>Well, that depends on what analysis is. It seems this guy thinks it's
>>>the same as data modeling, which in turn is the same as developing a
>>>graphical representation of the client's needs and processes. Is it?
>>>Furthermore, you could interpret Marshall's and my response as "we don't
>>>do analysis, we just start coding", but I don't think that's what we
>>>mean. Myself, I'm skeptical of presenting analysis as a very separate,
>>>distinct kind of activity, defined by the kinds of artifacts it
>>>produces, i.e. "pretty pictures" to use Bob's term.
>> Not all modeling is analysis. Some of it is design. In particular, I'm
>> going to claim that you discover attributes, but you design relvars. I've
>> already had the second claim confirmed by Bob and others.
>> Bob's distaste for pretty pictures should not obscure the main theme. A
>> model isn't a "pretty picture" as such. Rather, a "pretty picture" is the
>> projection of a model on a flat screen. Other projections have been
>> proposed. A table written on a whiteboard, with some imaginary sample data
>> written into it, proposed by another participant, is another projection of
>> a model on a flat screen.
>> Whether a pretty picture was worth the cost of making it depends on what
>> happens next.
> Pretty pictures have subtle pitfalls and limiting characteristics.
> Learning to think without them and to communicate without them improves
> both thought and communication.

A picture is worth a thousand words. Only an idiot would try to use a screwdriver to drive a nail. Only a lunatic would choose to use a screwdriver when there is a perfectly good hammer available.

>>>But I digress. This was what I meant to respond to:
>>>>By the way, I don't like the author's dialect of ER. In particular, his
>>>>topic on "resolving many-to-many relationships" is, I believe, extraneous
>>>>to ER. His reification of a "watering" reminds me of the term "association
>>>>entity" that someone wrote in response to me a few days ago.
>>>>In analysis, there is nothing to resolve in a many-to-many relationship.
>>>>You only have to resolve it when you are designing relational tables or
>>>Both yes and no. Reifying relationships can be helpful, but /not/
>>>because "Many-to-many relationships cannot be directly converted into
>>>database tables and relationships". The point is rather to make it
>>>easier to discover their properties---their attributes, mainly, but
>>>potentially also other things, e.g. constraints. When I discover a
>>>many-to-many-relationship, I usually make it a box, with a name, and ask
>>>if there is anything else we want to be able to say about this thing.
>>>Often, there is. If there isn't, I can demote it to a line again.
>>>This mainly applies to many-to-many-relationships, because business
>>>rules / attributes / constraints regarding a one-to-many-relationship
>>>are often better relegated to the entity on the many-side (though not
>>>always, of course). It has little to do with the implementation (or
>>>design?) of many-to-many-relationships in relational databases.
>>>Some might argue that reifying relationships is unnecessary, since
>>>relationships in "good" E/R dialects can have attributes. What, then, is
>>>the difference between an entity and a relationship?
>> If you look at the metadata in the implemented database, none.
> Where do you have to look to find any difference? (Other than one is drawn
> as a box and the other as a line or diamond.)
>>>The best answer I
>>>can think of is that an entity is identified by itself, while a
>>>relationship is identified by its entities. But what if something has
>>>more than one way of identification (i.e. multiple keys)? This is where
>>>classic E/R breaks down for me. A "relationship" may be identified by
>>>its entities, but also by (say) just one of its entities in combination
>>>with a subset of its attributes. And/or perhaps a subset of its
>>>attributes, disregarding any entities. Is it then a relationship, a weak
>>>entity, or an entity?
>>>This is turning into a rant against the classic(?) E/R notation, but
>>>here goes anyway. I think it's a bad idea that more than one kind of
>>>thing can have attributes. I think it's a bad idea that there are two
>>>(or more) different ways of indicating how something is identified.
>>>Relationship diamonds are required for non-binary relationships, but are
>>>just clutter for binary ones---bad idea.
>>>Fortunately, there is (at least) one E/R dialect that resolves all these
>>>issues, and in so doing, even makes the distinction between entities and
>>>relationships far less important.
>>>Apropos this distinction: As to whether marriage is a relationship or an
>>>entity, you said that one should listen to the subject matter experts. I
>>>have never had such an expert say to me, "No, that's not a relationship,
>>>that's an entity!" or vice versa. Have you?
>> Not in so many words. But they have said things like "a reservation for a
>> certain car, on a certain date, by a certain customer has a way of
>> identifying it. We call it a 'reservation number'." What you have now
>> learned is that the UofD people think of a reservation as a thing in and of
>> itself and not just an association between a customer and a car on some
>> future date.
>> This tells you something you need to know about the problem statement: The
>> database has to store reservation numbers.
>> It also tells you something you need to know about database design: you
>> have two candidate keys for identifying a relationship, and eventually, a
>> relvar. One is reservation number. The other is customer ID, car type,
>> and date. If you declare primary keys in your database, you need to pick
>> one of these.
> Protecting the integrity of data is a primary goal of data management. If
> one wants to manage one's data, one must declare all candidate keys.
> Whether one needs to pick one to designate as primary is secondary to
> this.
>> This could have consequences for performance, ease of
>> programming, "natural joins" etc. etc.
> Performance is independent of choices at the logical level of discourse
> where one identifies candidate keys or designates primary keys.
> Performance is only affected at the physical level of discourse.
>> You also need to anticipate that
>> the application programmers are going to want to be able to find a
>> reservation, or the absence of a reservation (CWA), based on the
>> reservation number, based on a slip of paper the customer hands the clerk,
>> or based on the customer, the car type, and the date.
>> In some cases, the business rules will make the design decision for you. In
>> other cases, the business rules are silent on this score.
> I disagree. First, what the application programmers want is irrelevant.
> They are paid to meet the needs of the organization not their own whim.
> Second, business rules are essentially synonymous with what the
> organization needs.
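
To make the quoted point about declaring all candidate keys concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table and column names (reservation, reservation_number, customer_id, car_type, reserved_date) and the helper functions are hypothetical, chosen only to mirror the car-reservation example above; none of them come from the thread. It assumes the reservation number is designated as the primary key and the (customer, car type, date) combination is declared as a second key.

```python
import sqlite3


def make_db():
    """Create an in-memory database with both candidate keys declared."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE reservation (
            reservation_number INTEGER NOT NULL,
            customer_id        INTEGER NOT NULL,
            car_type           TEXT    NOT NULL,
            reserved_date      TEXT    NOT NULL,
            -- Candidate key 1, designated primary (an arbitrary choice here).
            PRIMARY KEY (reservation_number),
            -- Candidate key 2, declared so the DBMS enforces it as well.
            UNIQUE (customer_id, car_type, reserved_date)
        )
        """
    )
    return conn


def try_insert(conn, row):
    """Insert one row; return False if either declared key would be violated."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO reservation VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


def find_by_number(conn, number):
    """Look up a reservation by number; None means no such reservation."""
    cur = conn.execute(
        "SELECT * FROM reservation WHERE reservation_number = ?", (number,)
    )
    return cur.fetchone()


def find_by_details(conn, customer_id, car_type, reserved_date):
    """Look up a reservation by the other candidate key."""
    cur = conn.execute(
        "SELECT * FROM reservation "
        "WHERE customer_id = ? AND car_type = ? AND reserved_date = ?",
        (customer_id, car_type, reserved_date),
    )
    return cur.fetchone()
```

With both keys declared, a second row that reuses either the reservation number or the same customer, car type, and date is rejected, and a lookup by either key returns at most one row. Which key is marked PRIMARY makes no difference to that guarantee in this sketch.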
Received on Wed Dec 05 2007 - 18:09:37 CET
