Re: Little design mistakes that can be easily avoided (2): Listenning to CELKO (and CELKO alikes)

From: daveb <bestglide_at_gmail.com>
Date: Tue, 29 May 2007 18:03:42 -0700
Message-ID: <rNudnT77U7DyUMHbnZ2dnUVZ_i2dnZ2d_at_comcast.com>


"Brian Selzer" <brian_at_selzer-software.com> wrote in message news:g_G6i.12426$RX.451_at_newssvr11.news.prodigy.net...
>
> "Matthias Klaey" <mpky_at_hotmail.com> wrote in message
> news:rd6l53tm7fp7d1aju5v335gcgaebji8r0e_at_4ax.com...
> > "Brian Selzer" <brian_at_selzer-software.com> wrote:
> >
> >>
> >>"Matthias Klaey" <mpky_at_hotmail.com> wrote in message
> >>news:5f2k53pq5srphk68s1op58udei9h60stuo_at_4ax.com...
> >>> "Brian Selzer" <brian_at_selzer-software.com> wrote:
> >>>
> >>>>
> >>>>"Matthias Klaey" <mpky_at_hotmail.com> wrote in message
> >>>>news:bipi53pjf51gluul6b3b8uv86kbq7hanse_at_4ax.com...
> >>>>> Cimode <cimode_at_hotmail.com> wrote:
> >>>>>
> >>>>>>
> >>>>>>Matthias Klaey a écrit :
> >>>>>>> Cimode <cimode_at_hotmail.com> wrote:
> >>>>>>>
> >>>>>>> >On 26 mai, 18:18, Matthias Klaey <m..._at_hotmail.com> wrote:
> >>>>>>> >> Cimode <cim..._at_hotmail.com> wrote:
> >>>>>>> >> >Hi,
> >>>>>
> >>>>> [...]
> >>>>>
> > [...]
> >
> >>>>>
> >>>>
> >>>>If you've studied it, they you know that Celko's failed to mention the
> >>>>last
> >>>>paragraph of section 4, which is in direct contradiction to his rant.
> >>>>I've
> >>>>pointed this and other things out to him before, but he still dumps
the
> >>>>same
> >>>>flawed argument on people all the time. In other words, he
*knowingly*
> >>>>mischaracterizes Codd's position to advance his own agenda. (He wants
> >>>>people to buy his books!)
> >>>
> > [...]
> >
> >>> Or this one? Paper 2, Section 4 last paragraph
> >>>
> >>> "They will have to remember, however, that it is now the surrogate
> >>> that is the primary key and provides truly permanent identification
> >>> of each entity. The capability of making equi-joins on surrogates
> >>> implies that users see the headings of such columns but not the
> >>> specific values in those columns."
> >>>
> >>> And how, specifically, does any of these paragraphs contradict Celkos
> >>> "rant" (sic!)? Does it contradict the whole of it, or just part of it,
> >>> and if the latter is the case, which part?
> >>>
> >>
> >>This is the one. It directly contradicts the following paragraph in
> >>Celko's
> >>["rant" is a bit more polite than "spew" It certainly doesn't rise to
the
> >>level of an argument. The fallacies and deliberate attempt to mislead
> >>place
> >>it more in the category of propaganda than anything else.]:
> >>
> >><<
> >>This means that a surrogate ought to act like an index; created by the
> >>user, managed by the system and NEVER seen by a user. That means
> >>never used in queries, DRI or anything else that a user does.
> >
> >>
> >>Which sums up his entire argument against surrogates. He then uses the
> >>term
> >>"authority" as if in the article Codd was arguing against surrogates,
when
> >>the section he quoted (or deliberately misquoted) is a clear argument in
> >>favor of surrogates! Now, while I'm not sure I agree with Codd when it
> >>comes to hiding surrogate key values, I certainly wouldn't twist his
words
> >>completely around in order to support my own agenda!
> >
> > I don't think that Celko argues *against* surrogates, in sharp
> > contrast to "exposed physical locators". He clearly makes the
> > distinction between the two types.
> >
>
> If I understand him correctly, Celko states that the values for a
surrogate
> key are a function of the other values in a row. Clearly, this does not
> fulfill the requirement that the values provide permanent identification
for
> an object, since that value would necessarily change along with the rest
of
> the row.

I disagree. While I don't have the original reference at hand, I interpret "the other values in a row" as refering to key attributes, since including non-key attributes leads to a less useful reading.

If an identifying attribute of an external object has changed, then the corresponding key value must of course be changed in the database, so that it still refers to the same object. Since the surrogate still refers to that same object, why would its value need to change?

>
> > The only issue I see is about the visibility of surrogate key
> > *headings* in equi-joins. The values have no meaning whatsoever to the
> > user, as s/he cannot use them in Inserts, Updates, or Deletes.
> >
>
> A surrogate key value does indeed have a meaning. It is a symbol that
> represents in the abstract the essence of an object in the universe of
> discourse, that is, that which makes one object different from all other
> objects that have ever existed or will ever exist. The E-domian from RM/T
> encapsulates this metaphysical notion in one place by enumerating
> symbolically the essence of each and every object that can ever be
discussed
> (be referenced in a database extension). This means that there is a 1:1
> mapping *independent of time* between the objects under discussion and the
> values in the E-domain. As a result, a value drawn from the E-domain
> identifies an object not only throughout a single extension of the
database
> (which is all that a candidate key can guarantee), but throughout all
> possible extensions. Contrast this with the set of values for a composite
> natural key which may only distinguish an object from all others during a
> particular interval in time: if you update a row in a table where the
entire
> heading is the key, are you selecting a different object in the universe
of
> discourse, or is the new set of values the current state of affairs for
the
> same object? If there is a surrogate key, then there is no question: if
the
> surrogate key value is different, then a different object has been
selected,
> otherwise, the new set of values represents the current state of affairs
for
> the same object.

There is no notion of time in a database except that which we model. You impute a time dimension to the concept of a universe of discourse, but this is not so. Many domains have no such requirement.

If we need to refer to the values of an object at different times, then we must model a time dimension, which introduces various artifacts: e.g. non-key attributes are typically grouped into another relation keyed by the original key + timestamp. If key attributes can change, then it's typically more complicated; for example, the key could include a time interval for which it applies, along with a separate relation recording the transition from one value to another. Received on Wed May 30 2007 - 03:03:42 CEST

Original text of this message