Re: two nasty schemata, union types and surrogate keys

From: Brian <brian_at_selzer-software.com>
Date: Mon, 5 Oct 2009 14:43:32 -0700 (PDT)
Message-ID: <4c6a3220-88be-4154-afd3-b4908819027b_at_j28g2000vbl.googlegroups.com>


On Oct 5, 1:45 pm, rp..._at_pcwin518.campus.tue.nl (rpost) wrote:
> Brian wrote:
> >> Normalization has very little to do with the semantics of the data.  If
> >> you'll look carefully at all the examples of converting from one normal form
> >> to the next higher form, you'll see that the schema before conversion and
> >> the one after conversion are equivalent in their capacity to express facts.
> >> That's important.
>
> >And it's wrong!  Normalization has everything to do with the semantics
> >of the data.  [...]
> [...]
> >In the 3NF scheme, it is possible for the fact that a supplier has a
> >particular phone number to be expressed even if the supplier doesn't
> >at present supply any parts.  Every fact that can be expressed in the
> >2NF schema can also be expressed in the 3NF scheme, but not every fact
> >in the 3NF scheme can be expressed in the 2NF schema.
>
> This is because you're dropping an inclusion dependency
> when going to the 3NF version.

The 3NF scheme has an inclusion dependency. It is the only one that is a logical consequence of the transitive functional dependency. The inclusion dependency ensures that the transitive functional dependency still holds in the join of the 3NF relations.

> More generally, the reason you're disagreeing is that people seem to
> disagree what exactly it means to normalize or denormalize: they agree
> on what happens to the table structure, but not on what happens to
> constraints.  E.g. the Wikipedia articles on this subject are
> sorely lacking on this matter.
>

I can definitely agree with that.

> >> Contrary to popular belief, adding a surrogate key to each table does not
> >> change the level of normalization of a schema.  Candidate keys remain
> >> candidate keys after a surrogate has been added and declared the primary
> >> key.  I'm not sure, but I think you're asserting the opposite above.
>
> [...]
>
> >> How do you know that there are two sets of cars owned,  with each set owned
> >> by a different person,  instead of one set consisting of two cars and with
> >> one owner?  Is that fact expressed in the database?  How?
>
> >If you add a surrogate, then the fact that the surrogates are
> >different is sufficient to determine that there is more than one
> >person.
>
> Tuples don't represent objects, but statements of fact.

Tuples don't /directly/ represent objects, but given a ground atom, Pabc, what are the symbols a, b and c? They're individual constants, and under an interpetation they represent objects. Moreover, since every relation has a key, the ground atoms represented really must be of a form similar to, Q(f(a,b),c), where f is a function that maps combinations of objects into the universe of discourse. Under an interpretation, the function application, f(a,b) becomes the object mapped so that the formula Q(f(a,b),c) states information about that particular object. Keys, therefore, render your assertion oversimplistic, since every instance of a key maps to an object in the universe of discourse. Even in a relation that has more than one key, there is always a superkey that includes all prime attributes and represents the aggregate of the objects represented by the key instances, which is also an object.

> Different statements about persons may need to be recorded in situations
> where it is not known whether they pertain to the same person.
> E.g. in some kind of observation database (say, incident reporting
> on traffic) this may be necessary.  Doing so doesn't violate
> the closed world assumption.
>

I don't know what you're getting at here.

> --
> Reinier
Received on Mon Oct 05 2009 - 23:43:32 CEST

Original text of this message