Re: algebra equations for Reference and FD constraints

From: Brian Selzer <brian_at_selzer-software.com>
Date: Mon, 29 Dec 2008 08:41:36 -0500
Message-ID: <ls46l.11674$YU2.11416_at_nlpi066.nbdc.sbc.com>


"paul c" <toledobythesea_at_oohay.ac> wrote in message news:1JK5l.12656$701.981_at_newsfe12.iad...
> Brian Selzer wrote:

>> "paul c" <toledobythesea_at_oohay.ac> wrote in message 
>> news:lts4l.4362$hr3.935_at_newsfe01.iad...
>>> Brian Selzer wrote:
>>> ...
>>>> Second, this does not dispel the claim that there are some 'model' 
>>>> concepts that can't be expressed with the algebra or calculus.  In 
>>>> particular, database updates cannot be expressed.  To be sure, a value 
>>>> that is to be assigned can, but the update itself--the actual 
>>>> assignment--cannot.  Nor should it.
>>> "update" is not a relational model concept nor is "assignment".  They 
>>> are both programming language concepts and are not necessarily present 
>>> depending on the language, eg., some languages don't need assignment. 
>>> Same goes for variables, aka pointers.  Imputing any of these concepts 
>>> to the relational model is making the same illogical mistake as 
>>> criticizing the RM because of flaws in the SQL language.  The mistake 
>>> originates with the false assumption that a dbms implementation that may 
>>> have been partly inspired by Codd's original model can somehow introduce 
>>> or retro-fit concepts to that model, that he never ascribed to it.  The 
>>> mistake is mysticism at its finest.  Whereas I would say that if one 
>>> can't express a concept with either an algebra or calculus, then the 
>>> concept is not a 'model' concept in the first place.
>>>
>>
>> The Relational Model as Codd defined it involves what he called 
>> 'time-varying relations.'  These are not the static relations that are 
>> the result of algebraic expressions--though their values at any 
>> particular point in time are.  Your statement that neither 'update' nor 
>> 'assignment' are relational model concepts is patently absurd, for how 
>> can a relation vary with time unless there is some form of update or 
>> assignment involved.
>>
>> <snipped irrelevant references to SQL>
>>
>>
>

> I think this was one of the rare times when Codd was being sloppy. Taking
> the phrase at face value can only logically mean that time values are
> involved in such a relation (some people use the term but that's when they
> are talking about temporal db's). Assuming such a thing is possible
> without time attributes is a symptom of one or more of: 1) playing with
> words, 2) being willfully shallow, 3) being afflicted with dyslexia.
>

Codd wasn't being sloppy. It does not logically mean that time values are involved: it means that the instantaneous state of the data bank at one arbitrary point in time can be a different collection of relations than that for the instantaneous state of the data bank at another arbitrary point in time. There is no need to record the time value because it is irrelevant and in some cases impossible to replicate: what is the case now is likely not what was the case yesterday and is probably not what will be the case tomorrow. If the database is supposed to reflect what is the case, then the collection of relations that represents what is the case must be different from the collection of relations that represented what was the case yesterday and also from the collection of relations that will be the case tomorrow. In order to house a representation of what is actually the case, there must be a means for asserting which possible value for the database is now the actual value--that is, which collection of relations represents what is the case. That is the essence of database updates, which are definitely outside the scope of the algebra and the calculus.

Suppose you have a database that is supposed to record the moves in a game of chess. There are 20 possible initial moves for each player, and the set of all possible subsequent moves depends upon the moves that have already been played. For example, an initial e4 permits moves not only from the pawns and knights, but also from the king's bishop and the queen. If black responded to e4 with e5, then the move e4-e5 is no longer a possibility. Now, at the start of the game, no moves have been played, so the relation for recording moves starts out empty. Once a move is played, how can the database reflect that fact unless there is some means to assert that fact? Database updates are indeed a relational model concept even though neither the algebra nor the calculus is sufficient to express them.
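
To make the distinction concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: relation values are modelled as immutable sets of tuples, and the names Relation, union, DataBank and assert_actual are hypothetical, not anything defined by Codd or Date. It shows that an algebraic expression can produce the value that is to become actual, while the act of making it actual is a separate step that the expression does not perform.

```python
# Illustrative sketch only: Relation, union, DataBank and assert_actual are
# hypothetical names chosen for this example.
from typing import FrozenSet, Tuple

Move = Tuple[int, str, str]      # (move number, side, move in algebraic notation)
Relation = FrozenSet[Move]       # a relation value: an immutable set of tuples


def union(r: Relation, s: Relation) -> Relation:
    """Algebraic operator: builds a new relation value and changes nothing."""
    return r | s


class DataBank:
    """Holds whichever possible relation value is currently asserted to be actual."""

    def __init__(self) -> None:
        self.moves: Relation = frozenset()   # start of the game: no moves played

    def assert_actual(self, value: Relation) -> None:
        # The update itself: a change of state, not an algebraic expression.
        self.moves = value


bank = DataBank()
before = bank.moves

# The algebra can express the value that is to become actual...
candidate = union(bank.moves, frozenset({(1, "white", "e4")}))
unchanged = bank.moves == before   # evaluating the expression altered nothing

# ...but only the separate act of assertion makes the data bank reflect the move.
bank.assert_actual(candidate)
```

Evaluating union leaves the data bank exactly as it was; the recorded moves change only when assert_actual is called.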

> Here is what Date had to say about the phrase some years ago:

>

> (quote)
>

> Codd then goes on to define a "data bank" (which we would now more usually
> call a database, of course) to be "a collection of time-varying relations
> ... of assorted degrees," and states that "each [such] relation may be
> subject to insertion of additional n-tuples, deletion of existing ones,
> and alteration of components of any of its existing n-tuples." Here,
> unfortunately, we run smack into the historical confusion between relation
> values and relation variables. In mathematics (and indeed in Codd's own
> definition), a relation is simply a value, and there's just no way it can
> vary over time; there's no such thing as a "time-varying relation." But we
> can certainly have variables -- relation variables, that is -- whose
> values are relations (different values at different times), and that's
> really what Codd's "time-varying relations" are.
>

> A failure to distinguish adequately between these two distinct concepts
> has been another rich source of subsequent confusion. For this reason, I
> would have preferred to couch the discussions in the remainder of this
> series of columns in terms of relation values and variables explicitly,
> rather than in terms of just relations -- time-varying or otherwise.
> Unfortunately, however, this type of approach turned out to involve too
> much rewriting and (worse) restructuring of the material I needed to quote
> and examine from Codd's own papers, so I reluctantly decided to drop the
> idea. I seriously hope no further confusions arise from that decision!
>

> (end quote)
>

> I suspect SQL is quite relevant because a shallow reading of Codd is
> certainly a plausible cause of some of its errors.

I don't agree with Date's characterization, nor with his conception that a database is a collection of relvars. There are several reasons, not the least of which is that its intended interpretation relies upon the false assumption that key values rigidly designate individuals in the universe of discourse. Most importantly, Date asserts that relational assignment is a primitive operation when it is clear that information is lost when UPDATE is translated into assignment. Using the primitives insert, update and delete, it is possible to describe completely--down to the attribute value--what is different between successive databases. Relational assignment relies upon the false assumption that a key value in one database can be used to identify an individual represented by the same key value in another database. The assumption is false, and it is relevant to this discussion because each possible database corresponds to a proposition but only one of those propositions is supposed to be true at any instant. When a different possible database is asserted to be the actual database--that is, when the database is updated--the proposition that corresponds to the database that has been the actual database is supposed to be false while the proposition that corresponds to the database that is becoming the actual database is supposed to be true. So if a proposition is supposed to be false, then how can it be relied upon? It is therefore not logical to assume that a tuple with a particular set of attribute values that appears in both a possible database and the actual database maps to the same individuals in the universe of discourse.

Received on Mon Dec 29 2008 - 14:41:36 CET
