Re: Newbie question about db normalization theory: redundant keys OK?

From: Brian Selzer <brian_at_selzer-software.com>
Date: Thu, 27 Dec 2007 07:34:45 GMT
Message-ID: <pAIcj.788$pA7.550_at_newssvr25.news.prodigy.net>


"David Cressey" <cressey73_at_verizon.net> wrote in message news:iI7aj.7472$1X.1681_at_trndny07...
>
> "Brian Selzer" <brian_at_selzer-software.com> wrote in message
> news:542aj.25103$4V6.21112_at_newssvr14.news.prodigy.net...
>
>>
>> You're right about DELETE and INSERT, but not about UPDATE. If you look
> at
>> UPDATE in TTM (pps. 112-113), you can see what happens:
>>
>> UPDATE r ( Ai := X, Aj := Y)
>>
>> where i <> j is supposedly equivalent to
>>
>> ( ( EXTEND r ADD ( X AS Bi, Y AS Bj ) ) { ALL BUT Ai, Aj } )
>> RENAME ( Bi AS Bk, Bj AS Aj, Bk AS Ai )
>>
>> where Bi, Bj, and Bk are arbitrary distinct attribute names that do not
>> appear in r.
>>
>> Now if you look at the result of
>>
>> EXTEND r ADD ( X AS Bi, Y AS Bj ),
>>
>> you'll notice that each tuple has both the old values and the new values
> for
>> each affected attribute. Clearly if Ai or Aj is prime, then that
>> information that ties each tuple in the result to its corresponding tuple
> in
>> r is projected away by "{ ALL BUT Ai, Aj }". Information is lost. The
>> result may be the same, but how that result was arrived at is lost in
>> translation, and therefore cannot be verified.
>>
>
> It seems to me that, in many discussions over the past few months, you
> have
> been emphasizing the difference between DELETE followed by INSERT on the
> one
> hand, and UPDATE on the other, even when the final result of both is a
> database in the same state.
>
> The question arises whether the database communicates any data other than
> the data contained in its state. I'm going to pull a quote from Codd's
> 1970
> paper. Even though this quote is about consistency, rather than
> identity,
> I want to draw your attention to the wording.
>
> "It is important to note that consistency as defined above
> is a property of the instantaneous state of a data bank, and
> is independent of how that state came about. Thus, in
> particular, there is no distinction made on the basis of
> whether a user generated an inconsistency due to an act of
> omission or an act of commission. "
>
> It seems to me that all of us who have been building or working with
> relational databases, regardless of whether we use SQL as an interface,
> or
> some tool that might be superior to SQL, have been treating database state
> as independent of history in exactly the above sense. Not only when we
> deal
> with consistency, but also with regard to entity identity. I don't know
> of
> any revision made by D&D or others to that perspective.
>
> The languages described in this newsgroup differ from SQL in at least one
> important respect. SQL distinguishes between actions and transactions.
> For
> reasons that they themselves have outlined, D&D have come up with a
> language in which every transaction can be expressed as a single action.
> I
> can see the advantages of such a pattern, even without any prectical
> experience in such a language. But even with this difference, the
> difference between history and state is just about identical between SQL
> and
> the relational language.
>
> It seems to me that, if you are going to assert that the history of a
> database is part of the interpretation of its content, it's going to be
> up
> to you to explain how a database can serve up any relevant portions of its
> own history as data.
>
> If it isn't data, we don't know it.
>
> I think that until this matter gets resolved somehow, there is going to
> be
> a continuing disconnect between you and a lot of the regulars in c.d.t.
>

It's not about history, it's about continuity. Like it or not, from a logical standpoint, during a successful modification there are two separate database values which represent what has been the case and what is the case. They cannot both /be/ the case, due to the closed world assumption. Also, key values were always intended to map to individuals in the universe of discourse, so since each tuple contains at least one key value, each tuple maps to at least one individual. The domain closure, unique name and closed world assumptions make it possible to establish both existence and identity independent of any specific interpretation, so it should be possible to track an individual throughout a modification, even if its identifying features differ before and after. The problem is that a key value that maps to a particular individual before the modification may map to a different individual after the modification, and consequently, any comparison that involves values from before and after requires greater scrutiny.

Here is the problem with treating assignment as the only primitive operation. With assignment, all you have available is the before and after images of the data, but there may be many different transformations that could have produced the after image from the before image, and there's no way to tell which transformation actually occurred. This means that whether or not a comparison can be considered valid depends upon an arbitrary choice made by the designer of the database. To me that seems insane. Especially since when insert, update and delete are treated as primitive, key stability no longer has an impact on whether or not a comparison can be considered valid: there can be no comparison for inserted or deleted information, and for updated information, the old values are automatically correlated with the new values, so a tuple-by-tuple comparison can be performed. Received on Thu Dec 27 2007 - 08:34:45 CET

Original text of this message