Re: Another view on analysis and ER

From: David BL <davidbl_at_iinet.net.au>
Date: Sat, 8 Dec 2007 03:28:03 -0800 (PST)
Message-ID: <d8931aae-f5e0-40c7-aa41-73c195216a2b_at_s19g2000prg.googlegroups.com>


On Dec 7, 9:11 pm, JOG <j..._at_cs.nott.ac.uk> wrote:
> On Dec 7, 12:41 am, David BL <davi..._at_iinet.net.au> wrote:
>
> > On Dec 7, 5:36 am, JOG <j..._at_cs.nott.ac.uk> wrote:
> > > On Dec 6, 7:49 pm, David BL <davi..._at_iinet.net.au> wrote:
> > [snip]
> > > , but I'm not sure I agree. Intensional definitions only
> > > refer to rules concerning valid values for predicate variables, not
> > > valid entities. If I've missed a trick there, perhaps it's worth an
> > > example?
>
> > An intensional definition should uniquely define a corresponding
> > extension. For example
>
> > predicate:
> > album(N)
>
> > intension:
> > String N is the name of a studio album
> > released before 2003, of the band Garbage
>
> > extension:
> > {
> > {N=Garbage},
> > {N=Version 2.0},
> > {N=beautifulgarbage}
> > }
>
> > An instantiation of the intensional definition is
>
> > String "Version 2.0" is the name of a studio
> > album released before 2003, of the band Garbage
>
> > This natural language sentence refers to a studio album entity in the
> > real world. The name value is distinct from the entity. The
> > existence of the entity is implied by the intensional definition.
>
> Yup, we did have crossed wires - I thought you were referring to
> integrity constraints (which of course are very much part of a
> relation's intension too).
>
> > > > [ker-shhhnip]
> > > > > Summarizing the above I gathered that you are saying that:
> > > > > RELATIONSHIP: married(Husband, Wife, Location)
> > > > > ENTITY: married(MarriageId, Husband, Wife, Location)
> > > > > So the only difference is that the entity has a marriageID? I am not
> > > > > clear why you think the addition of this surrogate would change a
> > > > > relationship into an entity!
>
> > > > In the first case the marriage is a relationship because it is only
> > > > identified indirectly by the entities that are involved. In the
> > > > second case the marriage has been directly identified.
>
> > > Ok, sorry if I'm being dense, but are you saying that the second case
> > > is an entity because it can be identified without reference to another
> > > entity? And would a logical consequence be therefore that no
> > > identifying attributes of an entity may be entities themselves?
>
> > Yes.
>
> O.k., so the next natural question is back to our team entity...
>
> Team(goalkeeper:Jim, defender:David, Midfielder:Jon, Attacker:Bob)
>
> But by the definition you are positing, there is no team entity here
> at all right? Because the identifying attributes are entities
> themselves (assuming people are entities of course)? Hmmm....

Do you think the distinction between directly and indirectly identifying a team is an important consideration when designing the predicates?

According to the characterisation, a "team" under your predicate above is only a relationship, not an entity. The rationale is that it isn't considered appropriate to identify the team directly, because it has a fleeting existence compared to the entities that comprise it. Note, however, that without a direct team identifier we have assumed that every time we change a player we have a *different* team. See how that fits in with the notion of a fleeting existence?

This relates back to the paradox of change that you recently raised with Reinier. There is nothing wrong with the above definition of a team unless it doesn't match our conceptual model. More specifically, it can be very useful to allow a team to keep its identity despite players coming and going over time. It's fuzzy, but humans choose to identify entities despite the paradox of change. I think this is the real purpose behind entities in this discussion.

This consideration is very important to the DB schema design. I think the entity-relationship terminology is useful for highlighting the importance of the whole issue.
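
The two ways of identifying a team can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the team identifier "T1" and the replacement player "Alice" are hypothetical and do not come from the thread. In the first design a team is identified indirectly by its players, so swapping a player yields a different team. In the second design the team is identified directly, so it keeps its identity when a player changes.

```python
from dataclasses import dataclass, field, replace


# Design 1 (relationship): the team is identified only by the
# entities involved, so equality is over all four players.
@dataclass(frozen=True)
class TeamRelationship:
    goalkeeper: str
    defender: str
    midfielder: str
    attacker: str


# Design 2 (entity): the team has its own identifier (hypothetical
# team_id). The players are excluded from the equality comparison.
@dataclass
class TeamEntity:
    team_id: str
    players: dict = field(default_factory=dict, compare=False)


# Indirect identification: changing a player gives a *different* team.
before = TeamRelationship("Jim", "David", "Jon", "Bob")
after = replace(before, attacker="Alice")  # "Alice" is a made-up player
assert before != after

# Direct identification: the team survives a change of player.
team = TeamEntity("T1", {"attacker": "Bob"})
same_team = TeamEntity("T1", {"attacker": "Alice"})
assert team == same_team
```

Which design is right depends on whether the conceptual model needs a team to persist while its players come and go.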

> > X is an entity in the context of the model if there exists set A of
> > attributes + values that identify X and there doesn't exist a subset
> > of A that identifies a different entity Y.
>
> > Actually this attempted definition isn't quite right. For example you
> > could determine that a team is an entity except for the fact that you
> > end up identifying the team captain as well. Perhaps that problem
> > can be fixed by talking about maximal entity types corresponding to
> > cartesian products of domains and noting that team identifiers aren't
> > suitable for identifying players more generally.
>
> Maximal entity types... er.... sounds like things are getting more
> convoluted. Which brings me to the real question. What's the point of
> all this? Why not give up on trying to make what I still feel is an
> artificial split between entities and relationships? What is gained by
> that split that makes it worth the effort?

Did I answer that question with the team example?

Perhaps you agree that choosing whether to identify a team directly or indirectly is significant in the logical design, but don't like to interpret it as an entity/relationship distinction. That may be fair enough. I like Marshall's Occam's razor comment (just write the damn tables).

> As ERM has evolved, relationships are looking more and more like
> entities anyhow. Now they are shapes not lines, now they can have
> attributes too... One more step, make them boxes instead of diamonds,
> and the job is done. What would we have lost then? Might I invoke
> occam's razor and say that a system with one type is preferable to
> two, unless making a split has any real benefits?

I suspect the decision of whether to directly identify a thing in the real world (despite the paradox of change) is so important that it's convenient to reserve "entity" for when that happens.

I agree this would be much more palatable if we had a more formal definition of what it means to directly identify a thing.

> > > > That seems
> > > > like a significant difference in the logical layer. It shows up in
> > > > the intensional definitions where a marriage takes on a "role" as an
> > > > entity.
>
> > > > In the second case there must be some underlying reason to introduce a
> > > > marriage identifier. That reason points at a significant difference
> > > > in requirements. Don't assume the marriage identifier is a surrogate
> > > > id! Instead assume this is a well conceived design, and it's a
> > > > natural identifier.
>
> > > Well hey, I don't think a surrogate means a poorly conceived design.
> > > But I take your point, the Marriage ID in this case is coming from
> > > some external source, and hasn't been instigated by the designer of
> > > this modeller.
>
> Jon seems to have covered a lot of comments I might make, so I won't
> replicate his post. Instead I shall drink tea.

You both ask lots of good (and tricky) questions. Cheers.

Received on Sat Dec 08 2007 - 12:28:03 CET
