Re: Relation Schemata vs. Relation Variables

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sat, 02 Sep 2006 19:34:13 GMT
Message-ID: <V0lKg.675$MF1.553_at_newssvr25.news.prodigy.net>

"Jon Heggland" <jon.heggland_at_idi.ntnu.no> wrote in message news:edc6eq$507$1_at_orkan.itea.ntnu.no...

> Brian Selzer wrote:

>> "Jon Heggland" <jon.heggland_at_idi.ntnu.no> wrote in message
>> news:ed9fsd$n9q$1_at_orkan.itea.ntnu.no...

>>> vc wrote:
>>>> My point is that if 'update' is shorthand for the respective relational
>>>> assignment,  then by the same token the update constraint must be
>>>> shorthand for the respective relvar constraint: IS_EMPTY(R RENAME
>>>> (Salary AS NewSalary) JOIN R' RENAME (Salary AS
>>>> OldSalary) WHERE NewSalary < OldSalary).
>>>
>>> In that case, the shorthand is insufficient, because it doesn't specify
>>> what you should join on. You could say it should join on the key by
>>> default, but what if there's more than one key? What if you want keys to
>>> be able to change?
>>

>> I think that's my whole point. UPDATE cannot be a shorthand for
>> relational
>> assignment unless keys cannot change. If keys can change, then tuples
>> from
>> successive database states cannot be paired for comparison by joining R
>> to
>> R'.

>
> Huh? On the contrary, relvar-based TCs make that no problem at all. It
> is only (what I call) tuple-based TCs that suffer from this problem. I
> think you are confused by thinking in terms of entities instead of
> propositions.
>

No. I'm thinking in terms of sets. You can pair the elements of two tuples from successive database states for comparison because those elements are named values, and the names don't change. You can't pair the elements of two relations from successive database states for comparison because the elements of a relation are NOT named--at least not with a name that *always* survives a change. (You CAN use tuple identifiers to work around this, however.) The definition of a candidate key by itself does not enable a value conforming to the key definition to identify a tuple beyond a single database state. Within a single database state, a candidate key value can be used as the "name" of the tuple so that interrelational dependencies can be defined, such as foreign key constraints and other inclusion dependencies. The problem is that keys can change. Codd alluded to it in the definitive 1970 paper--I think he used the phrase, "key updates." In 1979, he acknowledged the need for surrogates, for the simple reason that keys CAN change. Furthermore, even when using surrogates, if a relation has multiple candidate keys which are made up of combinations of surrogates, then it's possible for one or more of those keys to be changed while another remains constant. (I guess that in those cases you could add a tuple identifier....)

From the standpoint of propositions. If ANY attribute value is different between tuples, then the tuples are not identical. You can't go any further than that without additional information. R JOIN R' assumes that the meaning of a candidate key value will remain the same throughout the change. In a closed world, if you can't prove that it's true, then it's not. So, how do you prove it? Either by defining a tuple identifier that is guaranteed to remain constant throughout an update, or by requiring the user to state how each tuple correlates during an update. Obviously tuple identifiers are BAD, but the alternative requires acknowledging that UPDATE is not a shorthand for assignment.

>> So we're stuck with either implementation-specific row-level transition
>> constraints such as a FOR EACH ROW triggers or Dataphor's implementation,
>> or
>> introducing tuple identifiers or sets of object identifiers to guarantee
>> that at least one key won't change. A better solution is needed.
>> Instead
>> of exposing only the proposed state, that is, R', the transition itself
>> could also be exposed as a set of relations, Rd, Ru, Ri, representing
>> respectively the tuples that will be removed, the tuples that will be
>> different and how, and the tuples that are being added.

>
> This is nonsense. Within the confines of an atomic statement, there is
> no logical difference between "changing a tuple" and "removing a tuple
> and adding another". All that matters is that some propositions cease to
> be true, while others begin.

There most certainly is! If you define a transition constraint that states: you can't delete a tuple if quantity is greater than 0, then splitting such an update into delete...insert must necessarily fail if any tuple has quantity greater than zero! In order to choose which transition constraints to enforce, the system must know what kind of transition is occuring. Consider a line of people waiting to be served at the bank, and a transition constraint that says that people can't jump in line. So, if you start out with (Bob, Brian, Lynn) in line, and then after a transition, you have (Bob, Brian, Sue, Lynn), do you reject the transition? Or did Lynn leave the line and then reenter after Sue? By only using R and R', you can't tell. But with Rd, Ru, Ri, you can. For example, Given the schema

R {PERSON, POSITION} then Rd, Ru, Ri are as follows

Rd {PERSON, POSITION}
Ru {PERSON, POSITION, PERSON', POSITION'}
Ri {PERSON', POSITION'}

With the assignment,
R :=
  (R WHERE NOT (PERSON = "Lynn"))
  UNION RELATION {
    TUPLE {PERSON PERSON("Sue"), POSITION POSITION(3)},     TUPLE {PERSON PERSON("Lynn"), POSITION POSITION(4)}}

you cannot tell whether or not to reject the transition, but using transitions, you can:

if the value for Rd is
  {PERSON:Lynn, POSITION:3}
and Ru is empty
and Ri is
  {{PERSON':Sue, POSITION':3},
    {PERSON':Lynn, POSITION':4}},

then it's obvious that the transition should be allowed, on the other hand, if the value for Rd is empty
and Ru is
{{PERSON:Lynn, POSITION:3, PERSON':Lynn, POSITION':4}} and Ri is
{{PERSON':Sue, POSITION':3}},

then it's clear that Sue jumped in line and consequently the transition should not be allowed.

The point of this is that more information can be provided to the system with UPDATE than can be provided only by using relational assignment. It's obvious that the first transition consists of a delete and an insert, whereas the second consists of an update and an insert, but with the relational assignment, you can't determine what exactly happened, so you can't define a constraint to control it.

> --
> Jon Received on Sat Sep 02 2006 - 21:34:13 CEST

This message: [ Message body ]
Next message: Jon Heggland: "Re: Relation Schemata vs. Relation Variables"
Previous message: Jon Heggland: "Re: Relation Schemata vs. Relation Variables"
Maybe in reply to: vc: "Re: Relation Schemata vs. Relation Variables"
In reply to Brian Selzer: "Re: Relation Schemata vs. Relation Variables"
Next in thread: Jon Heggland: "Re: Relation Schemata vs. Relation Variables"
Reply: Jon Heggland: "Re: Relation Schemata vs. Relation Variables"
Reply: Brian Selzer: "Re: Relation Schemata vs. Relation Variables"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message