Re: cdt glossary 0.1.1 [Transaction]

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sat, 21 Apr 2007 04:52:39 GMT
Message-ID: <rMgWh.12187$Kd3.11974_at_newssvr27.news.prodigy.net>

"paul c" <toledobythesea_at_oohay.ac> wrote in message news:eubWh.101559$aG1.76003_at_pd7urf3no...
> Brian Selzer wrote:

>> "paul c" <toledobythesea_at_oohay.ac> wrote in message 
>> news:383Wh.100591$DE1.26701_at_pd7urf2no...
>>
>>>Brian Selzer wrote:
>> ..
>> Codd (1970) defined consistency using the terms "state" and the phrase 
>> "instantaneous value." And later emphasized it: "It is important to note 
>> that consistency as defined above is a property of the instantaneous 
>> state of a data bank..."
>> ...
>

> I don't know why he used the adjective "instantaneous" to describe a
> value. Perhaps he chose to do that to appeal to the lingo of his expected
> audience. But I have no idea what an instantaneous value versus a
> non-instantaneous value could possibly be unless domains that may appear
> and disappear, or grow and shrink are allowed (which I wouldn't mind
> considering, but I don't think Codd had that in mind).
>

I think he wanted to emphasize that the information in a database is in constant flux. To use his words,
"The totality of data in a data bank may be viewed as a collection of time-varying relations. These relations are of assorted degrees. As time progresses, each n-ary relation may be subject to insertion of additional n-tuples, deletion of existing ones, and alteration of components of any of its existing n-tuples."

> For database purposes, when Codd's information principle and the
> closed-world assumption are followed I also fail to see the need for a
> word like "state". It only encourages people to waste time imagining some
> difference between state and value when there is none.
>

Because "state" adds a temporal connotation to "value," it emphasizes the fact that each database carries with it an implicit temporal component. The closed-world assumption and the information principle do not alter the fact that each attribute value is a component of a single database.

Why is this important? Because logical identity cannot be used to determine whether or not a tuple in one database is the same as a tuple in another unless both databases are in existence and current at the same point in time--that is, unless the databases are identical.

If you have two relations A and B in the same database with the same heading, and each contains a tuple with the same attribute values, then it can be said that the tuples are logically identical because both sets are in existence and current at the same point in time. The tuple is a member of both relation A and relation B. Because each database exists during a half-closed interval bounded by the point-in-time that it became current and the point-in-time that another database becomes current, a tuple from the current database cannot be logically identical to a tuple in the next because before the transformation, the next database doesn't exist, and after the transformation, the original database no longer exists!

From another perspective--one that doesn't rely on the implicit temporal component: logical identity cannot be used to determine whether or not a tuple in one database is the same as a tuple in a database that is the result of a transformation from the original unless the resulting database is identical to the original database. Envision a directed graph with two nodes O and R connected by a single edge T from O to R. Let O be the original database, R be the resulting database, and T be the transformation. A tuple in O cannot be logically identical to a tuple in R because every tuple in R belongs to the database R which is the result of T. Since O is not the result of T (unless of course T is null--that is, a loop where O and R are the same node), a tuple in O cannot be logically identical to a tuple in R even if they have the same attribute values because each tuple in R carries with it an additional property: belonging to a database that is the result of T.

"State" is more precise in both cases than "value." In the former, the temporal or situational connotation applies; in the latter, the sense of being in a particular condition or having transitioned to a particular condition applies.
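
The graph argument above can be sketched in a few lines of Python. This is only an illustration under assumed names: `Database`, `apply`, and `located` are hypothetical and come from neither Codd's paper nor any DBMS. The sketch compares a tuple's attribute values separately from the database state the tuple belongs to.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# Hypothetical representation: a tuple is a sequence of (attribute, value) pairs.
Row = Tuple[Tuple[str, int], ...]


@dataclass(frozen=True)
class Database:
    """One database state: a node (O or R) in the graph."""
    rows: FrozenSet[Row]
    produced_by: Optional[str] = None  # label of the transformation T, None for the origin


def apply(db: Database, label: str, new_rows: FrozenSet[Row]) -> Database:
    """Edge T from O to R. A null T is a loop: it returns the same node."""
    if new_rows == db.rows:
        return db
    return Database(new_rows, produced_by=label)


def located(db: Database, row: Row) -> Tuple[Database, Row]:
    """A tuple taken together with the database state it belongs to."""
    assert row in db.rows
    return (db, row)


row = (("k", 22), ("x", 1))
other = (("k", 23), ("x", 7))

O = Database(frozenset({row}))
R = apply(O, "T", frozenset({row, other}))

# Same attribute values in O and in R ...
same_values = located(O, row)[1] == located(R, row)[1]
# ... but not the same once "belongs to the result of T" is part of the comparison.
same_tuple_in_state = located(O, row) == located(R, row)

print(same_values, same_tuple_in_state)
```

Whether "belonging to the result of T" ought to count as a property of the tuple is exactly the point in dispute in this thread; the sketch only shows what follows if it does.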

>
>>>
>>>>I think that if a transaction contains more than one operation, then the 
>>>>order in which each operation is evaluated is critical. 2 + 3 * 5 = 17, 
>>>>not 25.
>>>>
>>>>After the following transaction,
>>>>
>>>>UPDATE r SET x = x + 5 WHERE k = 22,
>>>>UPDATE r SET x = x * 4 WHERE k = 22
>>>>...
>>>
>>>If all I wanted to do was to add 5 to x and then multiply by 4, I would 
>>>expect my programming environment to give a single statement to the dbms, 
>>>not two.
>>>
>>
>>
>> Perhaps, but the updates above are pretty simple. How would you deal with 
>> a transaction consisting of multiple updates where the tuples targeted by 
>> each overlap?
>> ...
>

> I don't see how a transaction can involve more than one "update". As Jim
> Gray might have said, one can't marry two people at once.
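
The ordering question raised by the two UPDATE statements quoted above can be made concrete with a small Python sketch. The function names and the starting value are hypothetical. The "simultaneous" function models only the reading this thread attributes to multiple assignment (every right-hand side reads the original value and the last write wins); it is an assumption for illustration, not a statement of how any particular language defines it.

```python
def run_sequential(x, ops):
    """Each operation sees the result of the one before it."""
    for op in ops:
        x = op(x)
    return x


def run_simultaneous(x, ops):
    """Assumed reading: every operation reads the original x, last write wins."""
    results = [op(x) for op in ops]
    return results[-1] if results else x


def add5(x):
    return x + 5


def mul4(x):
    return x * 4


x0 = 3  # hypothetical starting value of x in the row where k = 22

print(run_sequential(x0, [add5, mul4]))    # (x + 5) * 4
print(run_sequential(x0, [mul4, add5]))    # (x * 4) + 5
print(run_simultaneous(x0, [add5, mul4]))  # x * 4
```

The three calls give three different answers for the same pair of operations, which is why the evaluation rule has to be stated rather than assumed.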

>
>>
>>>>Is the result (x + 5) * 4 or (x * 4) + 5?  Or is it x * 4, which is what 
>>>>D&D's multiple assignment would produce?
>>>>...
>>>
>>>I'm not in favour of encouraging the complexity that multiple assignment 
>>>requires a programmer to be aware of.  (I'm not even in favour of 
>>>assignment to mutable variables.  I realize most programmers are used to 
>>>them and expect them to be supported, but I don't care.)
>>>
>>
>>
>> I'm not sure what you mean by "mutable variables."
>> ...
>

> Maybe that's an unconventional term. I mean a variable that can be
> assigned to twice in whatever programming unit we define to constitute a
> transaction.

>
>
>
>>
>>>>Do you limit a transaction so that only one transformation can occur per 
>>>>relation? Per tuple? Per attribute value?
>>>>...
>>>
>>>Any concept of a database that has two different values at the same time 
>>>is beyond me.
>>>
>>
>>
>> Who said anything about the database having two different values at the 
>> same time?  The question involves a group of modifications to a database. 
>> The database wouldn't take on the new value until the entire transaction 
>> completed.
>> ...
>

> You did, when you said (above) that "one transformation" is a "limit".

>
>>
>>>>Should all constraints be checked after each operation?  Only some?  Or 
>>>>should they all be deferred until the end?  I mention this because the 
>>>>result of one operation may leave the database in an inconsistent state, 
>>>>making any subsequent operations suspect.
>>>>...
>>>
>>>Any dbms that allows a programmer to introduce an inconsistency should be 
>>>recalled.
>>>
>>
>>
>> Transitory inconsistencies can occur, so long as they're resolved by the 
>> time that a transaction completes.  A transaction executes in isolation, 
>> but the results of each operation in a transaction could be made visible 
>> to subsequent operations within the same transaction.
>> ...
>

> I'll repeat - any dbms that allows a programmer to introduce an
> inconsistency, transitory or otherwise, should be recalled. I'm sure that
> given the physical speed of the consumer machines that superseded the
> mainframes of his day, had they been available in 1969, Codd would not
> have suggested that the correction of inconsistencies could be postponed.
>
> p
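
The difference between checking constraints after each operation and deferring them to the end of the transaction can be sketched as follows. Everything here is hypothetical: the two-attribute "database", the `balanced` constraint, and the `run_transaction` helper are assumptions made for illustration and do not describe any real DBMS.

```python
import copy


class ConstraintViolation(Exception):
    pass


def run_transaction(db, ops, constraint, deferred):
    """Apply ops to a private copy; return the new state or raise."""
    work = copy.deepcopy(db)  # isolation: nobody else ever sees `work`
    for op in ops:
        op(work)
        # Immediate mode checks the constraint after every operation.
        if not deferred and not constraint(work):
            raise ConstraintViolation("violated mid-transaction")
    # Both modes check once more before the result becomes current.
    if not constraint(work):
        raise ConstraintViolation("violated at end of transaction")
    return work


def balanced(d):
    return d["a"] + d["b"] == 100


def debit(d):
    d["a"] -= 10


def credit(d):
    d["b"] += 10


db = {"a": 60, "b": 40}

# Deferred: the transitory imbalance after `debit` is never checked.
committed = run_transaction(db, [debit, credit], balanced, deferred=True)

# Immediate: the same two operations are rejected after the first one.
try:
    run_transaction(db, [debit, credit], balanced, deferred=False)
    immediate_ok = True
except ConstraintViolation:
    immediate_ok = False

print(committed, immediate_ok)
```

Under the position quoted above, the debit and credit would instead be submitted as a single statement, so there would be no intermediate state to check in the first place.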
Received on Sat Apr 21 2007 - 06:52:39 CEST
