Re: two nasty schemata, union types and surrogate keys

From: Sampo Syreeni <decoy_at_iki.fi>
Date: Fri, 25 Sep 2009 16:33:24 -0700 (PDT)
Message-ID: <49c52f50-acaf-44e6-b8f7-a88574e0a7d6_at_a6g2000vbp.googlegroups.com>


Still, to return to my original point, what do y'all think about the encoding of facts such as "yes, there are two separate persons called John Smith, and no, we don't have any more information about them as persons, yet given their sets of cars owned, the one with a Ferrari and the one with a Lamborghini are two different persons"?

I mean, normalization often calls for encoding the persons and their ownership of cars in two separate relations, the first of which seemingly cannot have a natural key. As such, a surrogate key would be required, with no real alternative available.
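To make the two-relation encoding concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (person, owns, person_id, car) and the helper functions are made up for illustration; the only point is that the name column cannot serve as a key, so an otherwise meaningless person_id has to carry the identity.

```python
import sqlite3

def build_db():
    """Create an in-memory database holding the two John Smiths."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,  -- surrogate: carries no meaning
            name      TEXT NOT NULL         -- not unique on purpose
        );
        CREATE TABLE owns (
            person_id INTEGER NOT NULL REFERENCES person(person_id),
            car       TEXT NOT NULL,
            PRIMARY KEY (person_id, car)
        );
    """)
    for car in ("Ferrari", "Lamborghini"):
        # Each insert gets a fresh surrogate, even though the name repeats.
        cur = conn.execute("INSERT INTO person (name) VALUES (?)", ("John Smith",))
        conn.execute("INSERT INTO owns VALUES (?, ?)", (cur.lastrowid, car))
    return conn

def owners_by_car(conn):
    """Return (car, person_id, name) rows, ordered by car."""
    return conn.execute("""
        SELECT o.car, p.person_id, p.name
        FROM owns AS o
        JOIN person AS p ON o.person_id = p.person_id
        ORDER BY o.car
    """).fetchall()
```

Calling owners_by_car(build_db()) gives two rows with the same name but different person_id values, which is exactly the "two different persons, nothing else known" fact. Without the surrogate column, the person relation would collapse to a single tuple.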

Personally I hate surrogates, though I use them for far less pressing purposes (i.e. performance). Still, I think that's mostly because we don't really have the kind of true surrogates Codd suggested, where the actual, underlying value is completely hidden. Nor do we have a proper semantics -- Codd's or anyone else's -- for what a database with surrogates is supposed to mean, or how to make it safe from update anomalies and the like.
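As a rough illustration of what a hidden surrogate could look like, here is a small Python sketch in which the surrogate is an opaque token that supports nothing but an identity test. The names Surrogate, make_people and same_person are hypothetical, and Python cannot truly hide object identity (id() still works), so treat this as a convention rather than an enforced guarantee.

```python
class Surrogate:
    """Opaque identity token: no exposed value, equality by identity only."""
    __slots__ = ()

    def __repr__(self):
        # Deliberately uninformative: nothing about the token is displayed.
        return "<surrogate>"

def make_people():
    """Build the two-John-Smiths facts as (predicate, subject, value) tuples."""
    a, b = Surrogate(), Surrogate()
    facts = {
        ("name", a, "John Smith"),
        ("name", b, "John Smith"),
        ("owns", a, "Ferrari"),
        ("owns", b, "Lamborghini"),
    }
    return a, b, facts

def same_person(x, y):
    """The only question a surrogate answers: are these the same individual?"""
    return x is y
```

The two tokens print identically and expose no value to compare or sort on, yet all four facts stay distinct because the tokens themselves differ.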

The RDF data model is one rare example where we do have anonymous surrogates, in the form of "blank nodes". And we have a formal semantics as well. But otherwise, sadly, we're then once again into the quagmire that is EAV; or as I'd like to call it, gratuitous reification.

Received on Sat Sep 26 2009 - 01:33:24 CEST
