Re: Little design mistakes that can be easily avoided (2): Listenning to CELKO (and CELKO alikes)

From: Matthias Klaey <mpky_at_hotmail.com>
Date: Tue, 29 May 2007 17:09:48 +0200
Message-ID: <aago531m0t3r3bohtcibpi2fs1hnmivakn_at_4ax.com>


"Brian Selzer" <brian_at_selzer-software.com> wrote:

>
>"Matthias Klaey" <mpky_at_hotmail.com> wrote in message
>news:rd6l53tm7fp7d1aju5v335gcgaebji8r0e_at_4ax.com...
>> "Brian Selzer" <brian_at_selzer-software.com> wrote:
>>
>>>
>>>"Matthias Klaey" <mpky_at_hotmail.com> wrote in message
>>>news:5f2k53pq5srphk68s1op58udei9h60stuo_at_4ax.com...
>>>> "Brian Selzer" <brian_at_selzer-software.com> wrote:
>>>>
>>>>>
>>>>>"Matthias Klaey" <mpky_at_hotmail.com> wrote in message
>>>>>news:bipi53pjf51gluul6b3b8uv86kbq7hanse_at_4ax.com...
>>>>>> Cimode <cimode_at_hotmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>Matthias Klaey a écrit :
>>>>>>>> Cimode <cimode_at_hotmail.com> wrote:
>>>>>>>>
>>>>>>>> >On 26 mai, 18:18, Matthias Klaey <m..._at_hotmail.com> wrote:
>>>>>>>> >> Cimode <cim..._at_hotmail.com> wrote:
>>>>>>>> >> >Hi,
>>>>>>
>>>>>> [...]
>>>>>>
>> [...]
>>
>>>>>>
>>>>>
>>>>>If you've studied it, they you know that Celko's failed to mention the
>>>>>last
>>>>>paragraph of section 4, which is in direct contradiction to his rant.
>>>>>I've
>>>>>pointed this and other things out to him before, but he still dumps the
>>>>>same
>>>>>flawed argument on people all the time. In other words, he *knowingly*
>>>>>mischaracterizes Codd's position to advance his own agenda. (He wants
>>>>>people to buy his books!)
>>>>
>> [...]
>>
>>>> Or this one? Paper 2, Section 4 last paragraph
>>>>
>>>> "They will have to remember, however, that it is now the surrogate
>>>> that is the primary key and provides truly permanent identification
>>>> of each entity. The capability of making equi-joins on surrogates
>>>> implies that users see the headings of such columns but not the
>>>> specific values in those columns."
>>>>
>>>> And how, specifically, does any of these paragraphs contradict Celkos
>>>> "rant" (sic!)? Does it contradict the whole of it, or just part of it,
>>>> and if the latter is the case, which part?
>>>>
>>>
>>>This is the one. It directly contradicts the following paragraph in
>>>Celko's
>>>["rant" is a bit more polite than "spew" It certainly doesn't rise to the
>>>level of an argument. The fallacies and deliberate attempt to mislead
>>>place
>>>it more in the category of propaganda than anything else.]:
>>>
>>><<
>>>This means that a surrogate ought to act like an index; created by the
>>>user, managed by the system and NEVER seen by a user. That means
>>>never used in queries, DRI or anything else that a user does.
>>
>>>
>>>Which sums up his entire argument against surrogates. He then uses the
>>>term
>>>"authority" as if in the article Codd was arguing against surrogates, when
>>>the section he quoted (or deliberately misquoted) is a clear argument in
>>>favor of surrogates! Now, while I'm not sure I agree with Codd when it
>>>comes to hiding surrogate key values, I certainly wouldn't twist his words
>>>completely around in order to support my own agenda!
>>
>> I don't think that Celko argues *against* surrogates, in sharp
>> contrast to "exposed physical locators". He clearly makes the
>> distinction between the two types.
>>
>
>If I understand him correctly, Celko states that the values for a surrogate
>key are a function of the other values in a row. Clearly, this does not
>fulfill the requirement that the values provide permanent identification for
>an object, since that value would necessarily change along with the rest of
>the row.
>
>> The only issue I see is about the visibility of surrogate key
>> *headings* in equi-joins. The values have no meaning whatsoever to the
>> user, as s/he cannot use them in Inserts, Updates, or Deletes.
>>
>
>A surrogate key value does indeed have a meaning. It is a symbol that
>represents in the abstract the essence of an object in the universe of
>discourse, that is, that which makes one object different from all other
>objects that have ever existed or will ever exist. The E-domian from RM/T
>encapsulates this metaphysical notion in one place by enumerating
>symbolically the essence of each and every object that can ever be discussed
>(be referenced in a database extension). This means that there is a 1:1
>mapping *independent of time* between the objects under discussion and the
>values in the E-domain. As a result, a value drawn from the E-domain
>identifies an object not only throughout a single extension of the database
>(which is all that a candidate key can guarantee), but throughout all
>possible extensions. Contrast this with the set of values for a composite
>natural key which may only distinguish an object from all others during a
>particular interval in time: if you update a row in a table where the entire
>heading is the key, are you selecting a different object in the universe of
>discourse, or is the new set of values the current state of affairs for the
>same object? If there is a surrogate key, then there is no question: if the
>surrogate key value is different, then a different object has been selected,
>otherwise, the new set of values represents the current state of affairs for
>the same object.
>
>Now, I agree that an end-user may not be able to make sense of this--from
>the content of this and other discussions on this and other newsgroups, it
>is clear that even so-called experts aren't able to grasp the concept, even
>though Codd wrote about it back in 1979. In addition, since surrogate key
>values are symbols, how such symbols are represented is completely
>arbitrary. All that is required is that the representative values be
>distinct. Presenting such abstract values to people who don't have a clue
>what they represent is dangerous. So if "the user" means "the end user,"
>then I agree with Codd that the values should be hidden. On the other hand,
>if "the user" means "the application developer" or "the database designer,"
>then I don't agree. A permanent identifier for each object makes optimistic
>concurrency control possible, ensures consistency when using disconnected
>recordsets and datasets, and simplifies the representation of temporal data
>and the definition of temporal constraints, including transition
>constraints.
>
>A simple solution is to provide the end-users with a set of updatable views
>that do not include surrogate keys in their headings, and through security,
>prevent ad-hoc access to the base tables. It's the best of both worlds,
>since the end-users never see the surrogate key values, whereas the
>applications and constraints that expect a key value to identify the same
>object at different points in time do.

Ok, thank you for this clarification. This makes a lot of sense to me. So Celkos description of surrogate keys seems to be at least incomplete.

Greetings
Matthias Kläy

-- 
www.kcc.ch
Received on Tue May 29 2007 - 17:09:48 CEST

Original text of this message