Re: Newbie question about db normalization theory: redundant keys OK?

From: David Portas <REMOVE_BEFORE_REPLYING_dportas_at_acm.org>
Date: Thu, 27 Dec 2007 11:34:08 -0000
Message-ID: <GqudnUfPv5UnEO7anZ2dnUVZ8u2dnZ2d_at_giganews.com>


"Brian Selzer" <brian_at_selzer-software.com> wrote in message news:pAIcj.788$pA7.550_at_newssvr25.news.prodigy.net...
>
> "David Cressey" <cressey73_at_verizon.net> wrote in message
> news:iI7aj.7472$1X.1681_at_trndny07...
>>
>> "Brian Selzer" <brian_at_selzer-software.com> wrote in message
>> news:542aj.25103$4V6.21112_at_newssvr14.news.prodigy.net...
>>
>>>
>>> You're right about DELETE and INSERT, but not about UPDATE. If you look
>> at
>>> UPDATE in TTM (pps. 112-113), you can see what happens:
>>>
>>> UPDATE r ( Ai := X, Aj := Y)
>>>
>>> where i <> j is supposedly equivalent to
>>>
>>> ( ( EXTEND r ADD ( X AS Bi, Y AS Bj ) ) { ALL BUT Ai, Aj } )
>>> RENAME ( Bi AS Bk, Bj AS Aj, Bk AS Ai )
>>>
>>> where Bi, Bj, and Bk are arbitrary distinct attribute names that do not
>>> appear in r.
>>>
>>> Now if you look at the result of
>>>
>>> EXTEND r ADD ( X AS Bi, Y AS Bj ),
>>>
>>> you'll notice that each tuple has both the old values and the new values
>> for
>>> each affected attribute. Clearly if Ai or Aj is prime, then that
>>> information that ties each tuple in the result to its corresponding
>>> tuple
>> in
>>> r is projected away by "{ ALL BUT Ai, Aj }". Information is lost. The
>>> result may be the same, but how that result was arrived at is lost in
>>> translation, and therefore cannot be verified.
>>>
>>
>> It seems to me that, in many discussions over the past few months, you
>> have
>> been emphasizing the difference between DELETE followed by INSERT on the
>> one
>> hand, and UPDATE on the other, even when the final result of both is a
>> database in the same state.
>>
>> The question arises whether the database communicates any data other than
>> the data contained in its state. I'm going to pull a quote from Codd's
>> 1970
>> paper. Even though this quote is about consistency, rather than
>> identity,
>> I want to draw your attention to the wording.
>>
>> "It is important to note that consistency as defined above
>> is a property of the instantaneous state of a data bank, and
>> is independent of how that state came about. Thus, in
>> particular, there is no distinction made on the basis of
>> whether a user generated an inconsistency due to an act of
>> omission or an act of commission. "
>>
>> It seems to me that all of us who have been building or working with
>> relational databases, regardless of whether we use SQL as an interface,
>> or
>> some tool that might be superior to SQL, have been treating database
>> state
>> as independent of history in exactly the above sense. Not only when we
>> deal
>> with consistency, but also with regard to entity identity. I don't know
>> of
>> any revision made by D&D or others to that perspective.
>>
>> The languages described in this newsgroup differ from SQL in at least one
>> important respect. SQL distinguishes between actions and transactions.
>> For
>> reasons that they themselves have outlined, D&D have come up with a
>> language in which every transaction can be expressed as a single action.
>> I
>> can see the advantages of such a pattern, even without any prectical
>> experience in such a language. But even with this difference, the
>> difference between history and state is just about identical between SQL
>> and
>> the relational language.
>>
>> It seems to me that, if you are going to assert that the history of a
>> database is part of the interpretation of its content, it's going to be
>> up
>> to you to explain how a database can serve up any relevant portions of
>> its
>> own history as data.
>>
>> If it isn't data, we don't know it.
>>
>> I think that until this matter gets resolved somehow, there is going to
>> be
>> a continuing disconnect between you and a lot of the regulars in c.d.t.
>>
>
> It's not about history, it's about continuity. Like it or not, from a
> logical standpoint, during a successful modification there are two
> separate database values which represent what has been the case and what
> is the case. They cannot both /be/ the case, due to the closed world
> assumption. Also, key values were always intended to map to individuals
> in the universe of discourse, so since each tuple contains at least one
> key value, each tuple maps to at least one individual. The domain closure,
> unique name and closed world assumptions make it possible to establish
> both existence and identity independent of any specific interpretation, so
> it should be possible to track an individual throughout a modification,
> even if its identifying features differ before and after. The problem is
> that a key value that maps to a particular individual before the
> modification may map to a different individual after the modification, and
> consequently, any comparison that involves values from before and after
> requires greater scrutiny.
>
> Here is the problem with treating assignment as the only primitive
> operation. With assignment, all you have available is the before and
> after images of the data, but there may be many different transformations
> that could have produced the after image from the before image, and
> there's no way to tell which transformation actually occurred.

Yes there is. You say you would solve this using a "for each row trigger" mechanism. But a trigger is just a procedural construct for carrying out other relational assignments (if not then I'd like to know what else it does). Presumably the purpose of those other assigments is to record the information that was lost from the transformation represented by some hypothetical UPDATE operator. So why not record that information EXPLICITLY as part of the same operation that performed the original update - ie. an assignment?

Instead, you seem to prefer an UPDATE operation of a kind that you think is flawed - and then propose using a trigger as a crutch to protect the user from leaving out essential information!!

RM is fundamentally a type system. If you cannot agree that assignment is a primitive operation then I think you must sacrifice that very foundation. You are not describing an RDBMS as I know it.

-- 
David Portas
Received on Thu Dec 27 2007 - 12:34:08 CET

Original text of this message