Re: Little design mistakes that can be easily avoided (2): Listenning to CELKO (and CELKO alikes)

From: Matthias Klaey <mpky_at_hotmail.com>
Date: Sun, 27 May 2007 00:06:55 +0200
Message-ID: <8r3h53lbeanv0strvtkmhhj5s6csi7e6b0_at_4ax.com>


Cimode <cimode_at_hotmail.com> wrote:

>On 26 mai, 18:18, Matthias Klaey <m..._at_hotmail.com> wrote:
>> Cimode <cim..._at_hotmail.com> wrote:
>> >Hi,
>>
>> >Chapter 2 of the little things that can be done at little cost to
>> >avoid misconceptions about RM design.
>>
>> >Lately, I opened a thread that gave opportunity for CELKO to post the
>> >exact essence of what should *not* be done in design: let
>> >subjectiveness regulate design instead of logic. I take the
>> >opportunity of his thread to point out how and why it is wiser to
>> >ignore such ideas to build better databases. For that I will take all
>> >the posted comments and try to make sense out of them.
>>
>> >Regard...
>>
>> [...]
>>
>> Hmm. Is this just the usual Celko-bashing in this newsgroup? Did you
>> intend to write a parody on how to misread and misinterpert other
>> peoples texts? You don't mean this seriously, do you?

>I let people who know how to read english make their mind for
>themselves.

Ad hominem argument. But, as you like.

Celkos post was a direct answer to your post "Guidelines to a decent support of surrogate key implementation", where you wrote:

>LINE1> People who think that surrogate key should be an internal
>physical mechanism to the dbms and invisible to designer who could
>focus on the logical selection of primary keys
>LINE2> People who think that a trdbms should allow the designer to
>have some control
>LINE3> People who think there is nothing wrong with current surrogate
>key implementations
>
>I personally belong to first category.
>
>What do you think are the Pro/Cons of each approach?

Celko often gives "canned" answers. I have seen his text before at

http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=9&messageid=219520

Perhaps this is not very polite of him, but he gives a clear answer to your question: Yes, the same as you do, he argues that surrogate keys should be an internal physical mechanism to the dbms and invisible to the designer.
Furthermore, he not only just says "yes I agree with you", he puts your question in context, discussing different types of keys. His taxonomy of course is phenomenological, not mathematical, but still very useful. Your question *does* have a practical content, namely the "... implementation of surrogate keys" in *real* rdbms.

I do take exception to a series of your statements, as I try to detail below.

Cimode <cimode_at_hotmail.com> wrote:

><<I have a taxonomy of key types. We have a lot of problems with
>terminology on this one, so let me get that out of the way.>>
>The solution starts with saying we have a problem. Indeed, that's
>promising for the rest of the post.

You are very condescending.

><<There is no such thing as a "universal, one-size-fits-all" key.
>Just as no two sets of entities are the same>>
>or how to use a pointless analogy to prove a arbitrary meaningless
>idea.

You insinuate things that are not there: What exactly is meaningless about the statement that there is not a single, definitive answer to the problem of surrogate keys? His analogy may not be well chosen, but I nowhere see anything like a theorem that would need a proof.

><<, the attributes that make them unique have to be found in the
>reality of the data. >>
>What is reality of data ? Does it mean that some data is real and some
>is not?

Gratuitous misinterpretation:
"in the reality of things" clearly is not the same as "there are things that are real and there are things that are not real".

><<Here is my classification of types of keys (needs a monofont):>>
>Given what I read until know, I am already brassing for impact.
>
> natural artificial exposed surrogate
>==================================================================
>Constructed from reality |
>of the data model | Y N N Y
> |
>verifiable in reality | Y N N N
> |
>verifiable in itself | Y Y N N
> |
>visible to the user | Y Y Y N
>
><<1) A natural key is a subset of attributes which occur in a table
>and act as a unique identifier. They are seen by the user.>>>>

>And that which subset of attribute that would be ?

Condescending: I all fairness it can be assumed that people in a group called "comp.database.theory" *do* know at least something about the concept of a key.

>Let me rephrase that : *because they would be seen by the user, a
>subset of attribute would be a natural key*. First example on how to
>introduce subjectivity: define concept of natural key according to
>human perception.

Uncalled for misinterpretation of two independent sentences into a logical implication that is nowhere to be seen. Please show me the "because" in Celkos text.

><<You can go to the external reality and verify them.>>
>Can somebody define what is *external reality*? Is there an *internal
>reality* ? I would have sweared there was only but one reality.
>Second example on how to introduce subjectivity: considering there are
>several realities based on perception.

By no means all people agree with you that there is only one reality, but this is a digression. It is, at least to me, obvious in the context that he means the difference "within the dbms" versus "outside of the dbms under discussion". There *are* data that do not live within a single, specific dbms.

><<you would also like to have some validation rule. Example: UPC
>codes on consumer goods (read the package barcode) and validate them
>with a check digit or a
>manufacturer's website, geographical co-ordinates (get a GPS).>>
>So here is the final example using the *observe method*. While
>defining a natural key, it is funny to observe that we finally get to
>UPC code that may be itself a non natural key. What is natural key
>again ?

You are twisting and turning around a very good example. For every dbms that happens to be *not* the UPC dbms, the UPC number belongs to the nature out there, to the reality of data.

>To summarize that first attempt by CELKO at defining a natural key:
>
>--> A natural key is a subset of attribute. Question: Which one ?
>--> That undefined subset of attributes is indeed a natural key as
>soon as the user sees it.
>--> There are several realities. In one of these realities called
>*external reality*, the same user that saw the subset will be able to
>verify them.
>--> That verification can for instance be done on a UPC code. A UPC
>code is then a natural key (If I got this right).
>
>I let the above definition to the evaluation of people. Now let's see
>the next element of the *taxonomy* exposed for classifying of primary
>keys.

Cheap rhetorical trick of the lowest kind: You present *your* summary, which is *your* (mis-)interpretation of Celkos text, nothing more and nothing less, and then pretend that this *is* the original text. Caveat lector.

><<2) An artificial key is an extra attribute added to the table which
>is seen by the user. >>
>So an artificial key is an extra attribute added to *some* table which
>*de facto* makes it an artificial key. Let's see the consequence of
>this intriguing statement: Take a table ADD an attribute and TA TA
>you got an *artificial key*. It reminds of the old magic tricks, you
>know the one where a magician pulls a rabit from his hat. Third
>example of how to introduce subjectivity in key selection: believing
>in magic.

<irony on>
You might be closer to understanding that you know yourself. </irony off>

>
><It does not exist in the external reality, >>
>Again this *external reality* . Can somebody please explain what is
>*external reality* again (this is the second time I ask).

See above.

><<but can be verified for syntax>>
>Ah, the verification process. That, my feeble mind begins to
>understand ... BUT WAIT...NO: it should be verified for syntax. I was
>positive I understood but then my mind does not grasp anymore: what is
>verifying for syntax? What syntax ? How do I verify syntax ?

Failed attempt to make a joke, I must assume.

><< or check digits inside itself.>>
>I give up on understanding. Can somebody explain show me exactly how
>one *checks digits inside themselves*. This sentence is a masterpiece
>of things apparently only CELKO understands.

Deliberate misunderstanding: "itself" clearly refers to "artificial key", not to "check digits". IIRC, I learned that there is a distinction between singular an plural, such as "bit - bits", "key - keys", and "itself - themselves".

><<It is up to the DBA to maintain a trusted source for them.>>
>I thought we were defining an artificial key. Nw we are defining a
>DBA's role

You are thinking correctly. Part of the rôles of a DBA (and nobody else's) is to secure the integrity of the artificial key.

><<Example: the open codes in the UPC scheme which a user can assign to
>his own stuff. The check digits still work, but you have to verify them
>inside your own enterprise.>>
>Again that UPC code. But I thought the UPC code was supposed to be a
>natural key from the previous definition of natural key.

What is wrong with the fact that one and the same attribute plays different rôles in different contexts, hence is differently classified in the taxonomy?

><<If you have to construct a key yourself, it takes time to deisgn
>them, to invent a validation rule, set up audit trails, etc.>>
>What key again? The artificial ? the natural ? THe one I can verify
>in *external* reality ? By syntax? by checking digits inside itself ?
>The one that DBA maintains ? Help I am lost!

The artificial key. We are still in section 2) of Celkos text, aren't we? The artificial key that you construct yourself, that *cannot* be verified outside of the dbms, that is verified syntactically by the rules governing the construction of the key, and whose integrity is maintained by the DBA. HTH.

>To summarize that first attempt by CELKO at defining an artificial key
>(pay attention!):
>
>--> An artificial key (not a natural key) is an extra attribute added
>to *some* table
>--> An artificial key does not exists in *external reality*.
>--> An artificial key must be verified for *some* syntax ?
>--> An artificial key should have it digits checked inside themselves
>(I should have known!)

Deliberate insinuation: nowhere in Celkos text is any hint of the phrase "have it digits checked inside themselves".

>--> It's a DBA's role to maintain ALL the above source from a thrusted
>source ? What source ?

I'm sorry, I am getting too tired to go through all of the rest up to the bitter end. Just one more question:

><<Codd also wrote the following:
>There are three difficulties in employing user-controlled keys as
>permanent surrogates for entities.>>
>Again that Codd ? Who's Codd anyway?

Surely you must be joking?

Greetings
Matthias Kläy

-- 
www.kcc.ch
Received on Sun May 27 2007 - 00:06:55 CEST

Original text of this message