Re: two nasty schemata, union types and surrogate keys

From: Brian <brian_at_selzer-software.com>
Date: Sat, 26 Sep 2009 15:18:18 -0700 (PDT)
Message-ID: <af254400-eef5-4fb1-83d6-bd5fa601a951_at_g23g2000vbr.googlegroups.com>



On Sep 26, 11:55 am, "Walter Mitty" <wami..._at_verizon.net> wrote:
> "Sampo Syreeni" <de..._at_iki.fi> wrote in message
>
> news:49c52f50-acaf-44e6-b8f7-a88574e0a7d6_at_a6g2000vbp.googlegroups.com...
>
> > Still, to return to my original point, what do y'all think about the
> > encoding of facts such as "yes, there are two separate persons called
> > John Smith, and no, we don't have any more information about them as
> > persons, yet given their sets of cars owned, the one with a Ferrari
> > and the one with a Lamborghini are two different persons"?
>
> > I mean, normalization often calls for encoding the persons and their
> > ownership of cars in two separate relations. The first of which
> > seemingly cannot have a natural key. As such, a surrogate key would be
> > required, and no real alternative would be possible.
>
> Normalization has very little to do with the semantics of the data.  If
> you'll look carefully at all the examples of converting from one normal form
> to the next higher form, you'll see that the schema before conversion and
> the one after conversion are equivalent in their capacity to express facts.
> That's important.

And it's wrong! Normalization has everything to do with the semantics of the data. If you look carefully at all of the examples of converting from one normal form to the next higher form, you'll see that the schema before conversion and the one after conversion are NOT equivalent in their capacity to express facts. Instead, the schema after conversion has at least the same capacity to express facts, but not exactly the same capacity. For example, a 2NF schema that is not in 3NF,

SupplierParts {Part Number, Supplier, Supplier Phone Number}

    KEY(Part Number)

that satisfies the transitive functional depencency,

Part Number --> Supplier --> Supplier Phone Number

Does not have the same capacity to express facts as the third normal form database scheme

Suppliers {Supplier, Supplier Phone Number}

    KEY(Supplier),
Parts {Part Number, Supplier}

    KEY(Part Number),
Parts[Supplier] IN Suppliers[Supplier]

In the 3NF scheme, it is possible for the fact that a supplier has a particular phone number to be expressed even if the supplier doesn't at present supply any parts. Every fact that can be expressed in the 2NF schema can also be expressed in the 3NF scheme, but not every fact in the 3NF scheme can be expressed in the 2NF schema.

>
> Contrary to popular belief, adding a surrogate key to each table does not
> change the level of normalization of a schema.  Candidate keys remain
> candidate keys after a surrogate has been added and declared the primary
> key.  I'm not sure, but I think you're asserting the opposite above.

I don't think this rings true either. While it is true that candidate keys remain candidate keys after a surrogate has been added, I think the level of normalization is affected. There may not be a name for a normal form in which no relation that has more than one key has any dependent attributes, but it would certainly be beneficial for a number of reasons. First of all, there could be no nontrivial transitive dependencies, where a trivial transitive dependency is one in which the determinant is transitively dependent on itself. Also, there can only ever be at most one irreducible determinant for each non-prime attribute. Moreover, by isolating the interrelationships between keys to their own relations, it makes it possible for just one of those keys to be used for referential integrity, simplifying the graph of inclusion dependencies.

>
> How do you know that there are two sets of cars owned,  with each set owned
> by a different person,  instead of one set consisting of two cars and with
> one owner?  Is that fact expressed in the database?  How?

If you add a surrogate, then the fact that the surrogates are different is sufficient to determine that there is more than one person. Received on Sat Sep 26 2009 - 17:18:18 CDT

Original text of this message