Re: Little design mistakes that can be easily avoided (2): Listenning to CELKO (and CELKO alikes)

From: Brian Selzer <brian_at_selzer-software.com>
Date: Wed, 30 May 2007 06:50:44 GMT
Message-ID: <8997i.6665$5j1.5888_at_newssvr21.news.prodigy.net>


"daveb" <bestglide_at_gmail.com> wrote in message news:rNudnT77U7DyUMHbnZ2dnUVZ_i2dnZ2d_at_comcast.com...
>

[big snip]
>
> I disagree. While I don't have the original reference at hand, I
> interpret
> "the other values in a row" as refering to key attributes, since including
> non-key attributes leads to a less useful reading.
>
> If an identifying attribute of an external object has changed, then the
> corresponding key value must of course be changed in the database, so that
> it still refers to the same object. Since the surrogate still refers to
> that
> same object, why would its value need to change?
>

Because another object could at some time in the future have the identifying attribute values that existed before the change, and thus you would have a collision.

[snip]

>
> There is no notion of time in a database except that which we model. You
> impute a time dimension to the concept of a universe of discourse, but
> this
> is not so. Many domains have no such requirement.
>

I disagree. The intension of a database describes the set of all possible extensions. The extension of a database is the realization of one and only one of the possible extensions since only one possible extension can be realized without producing a contradiction. So if there is more than zero possible extensions, then there must be at least one instance of the reality being modeled that is represented by each possible extension, and there must also be a mechanism to determine which instance is in effect. Now I grant that the instances need not map to intervals on the time continuum, but the need to identify which instance is in effect implies currency, which is a temporal concept.

> If we need to refer to the values of an object at different times, then we
> must model a time dimension, which introduces various artifacts: e.g.
> non-key attributes are typically grouped into another relation keyed by
> the
> original key + timestamp. If key attributes can change, then it's
> typically
> more complicated; for example, the key could include a time interval for
> which it applies, along with a separate relation recording the transition
> from one value to another.
>

I disagree. If a tuple means the same thing at different points in time, then there is no need to model a time dimension. Also, the key change log approach is not a good one because it increases query complexity due to the fact that the key change log is in effect a linked list. A better approach would be to duplicate the key attributes in the relation schema so that the initial key values can be recorded.. That way the complexity of queries is not affected. Received on Wed May 30 2007 - 08:50:44 CEST

Original text of this message