Re: Relation Schemata vs. Relation Variables

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sun, 03 Sep 2006 17:38:01 GMT
Message-ID: <ZpEKg.15050$%j7.148_at_newssvr29.news.prodigy.net>

"Jon Heggland" <jon.heggland_at_idi.ntnu.no> wrote in message news:edecf7$diu$1_at_orkan.itea.ntnu.no...
> Brian Selzer wrote:
>> "Jon Heggland" <jon.heggland_at_idi.ntnu.no> wrote in message
>> news:edc6eq$507$1_at_orkan.itea.ntnu.no...
>>> Huh? On the contrary, relvar-based TCs make that no problem at all. It
>>> is only (what I call) tuple-based TCs that suffer from this problem. I
>>> think you are confused by thinking in terms of entities instead of
>>> propositions.
>>
>> No. I'm thinking in terms of sets. You can pair the elements of two
>> tuples
>> from successive database states for comparison because those elements are
>> named values, and the names don't change. You can't pair the elements of
>> two relations from successive database states for comparison because the
>> elements of a relation are NOT named--at least not with a name that
>> *always*
>> survives a change. (You CAN use tuple identifiers to work around this,
>> however.)
>
> You can pair them in any way that makes sense for the constraint at
> hand, by specifying how the pairing is to be performed. Tuple identity
> is a red herring; it is neither needed nor wanted.
>

This is a contradiction. A transition constraint must either be able to pair the elements of two relations for comparison or be given a set of pairs to compare. In order for the transition constraint to pair tuples, there must exist at least one set of identifying attributes (a key) that is *guaranteed* to remain constant throughout an update--any update. One way to do that would be to specify which key will remain constant during a particular change, but without tuple identifiers it's possible for all keys to change at the same time. Do you treat such an occurance as a delete...insert? Just as there may be constraints that apply only to tuples that are about to be removed, there may be constraints that apply only to tuples that are about to be added. How can the system know whether or not a particular transition constraint applies? Or do you just allow the change based on the assumption that the user knows what he's doing?

>> From the standpoint of propositions. If ANY attribute value is different
>> between tuples, then the tuples are not identical. You can't go any
>> further
>> than that without additional information.
>
> And there's no need to.
>
>> R JOIN R' assumes that the
>> meaning of a candidate key value will remain the same throughout the
>> change.
>> In a closed world, if you can't prove that it's true, then it's not.
>
> I don't know what you mean by this. Firstly: I cannot see that
> relvar-based TCs make any assumptions at all. Secondly: a (candidate
> key) value by itself has no intrinsic meaning. It is only when it occurs
> as part of a tuple in a relvar that it makes sense to talk about
> "meaning".
>

That's the meaning I was referring to. In a relational database, all values belong to tuples in relations and are associated with two names, an attribute name and the name of the relation schema that contains the attribute value. A candididate key embodies one set of functional dependencies that exist for a relation schema. Since functional dependencies are semantic constructs, the intrinsic value of a candidate key is augmented not only by the names of the attributes to which its elements conform, but also by all of those functional dependencies. That's why it's possible to decompose relations into sets of projections that mean the same thing. What I'm saying is that using a candidate key value beyond a single database state, such as in R JOIN R', implies that that value carries with it additional meaning beyond what is imparted to it by its definition. When you specify a candidate key, you're attaching significance to a particular set of attributes, but the significance of a particular set of attribute values is limited to a single database instance by the definition of a candidate key. Therefore, using a set of attribute values from one state to identify a tuple in another implies that that significance exceeds the bounds of a single database state. I'm not saying that that is wrong, but the model must be able to support *all* situations, and "key updates" and R JOIN R' just don't mix.

>>> This is nonsense. Within the confines of an atomic statement, there is
>>> no logical difference between "changing a tuple" and "removing a tuple
>>> and adding another". All that matters is that some propositions cease to
>>> be true, while others begin.
>>
>> There most certainly is! If you define a transition constraint that
>> states:
>> you can't delete a tuple if quantity is greater than 0, then splitting
>> such
>> an update into delete...insert must necessarily fail if any tuple has
>> quantity greater than zero!
>
> Only if you check the constraint *between* performing the delete and the
> insert. And that would be a logical mistake.
>

Only if the system knows that the particular modification is an update. With assignment, it can't be determined what kind of modification was intended.

>> In order to choose which transition constraints
>> to enforce, the system must know what kind of transition is occuring.
>> Consider a line of people waiting to be served at the bank, and a
>> transition
>> constraint that says that people can't jump in line. So, if you start
>> out
>> with (Bob, Brian, Lynn) in line, and then after a transition, you have
>> (Bob,
>> Brian, Sue, Lynn), do you reject the transition? Or did Lynn leave the
>> line
>> and then reenter after Sue? By only using R and R', you can't tell. But
>> with Rd, Ru, Ri, you can. For example, Given the schema [...]
>
> The database cannot enforce truth, only consistency.
>
> The transition should be rejected (of course depending on the actual TC
> formulation, which you unfortunately don't provide). If you want the
> database to know that Lynn left the line and then reentered, you must
> tell it so by performing two updates: a DELETE first, and an INSERT
> afterwards (or the equivalent relational assignments, of course). The
> hypothetical TCs would permit this.
>
> You do have a small point: that there may be multiple "paths" between
> two database states, and some of these paths may violate TCs and others
> not. Thus, you may need to tell the database which path corresponds to
> reality (or whatever it is you're representing). But as I show above,
> that is possible---even simple---without resorting to your transition
> sets. Your scheme adds complexity, but no power.

It is possible only if you assume that keys cannot change. In order to reach the level of stability required to define transition constraints based on R and R' that are *always* enforcible, you must in many cases resort to using object identifiers. Take the example of the line at the bank. For POSITION, there's no need for an object identifier, because the value represented is simply a natural number and it's meaning is intrinsic. For PERSON, however, do you use customer number? What if the customer number changes because the user that added the customer keyed in the wrong information and is now correcting a mistake? How do you account for that kind of change in a transition constraint that depends only on R and R'? If keys can change, then a transition constraint defined in terms of R and R' cannot always be enforced. My "transition sets" do not depend upon key stability. They provide enough information to define constraints that can be enforced even when prime attributes are affected by a change: it is up to the user to specify how tuples correlate by issuing the appropriate type of modification or set of modifications, that is, INSERT, UDPATE, DELETE, or assignment. (Here assignment *always* indicates a complete replacement of the contents of the relvar, and consequently, Ru would be empty.) Because key stability has no bearing on whether or not a constraint can be enforced, immutable, system-generated surrogate keys, object identifiers, and tuple identifiers are relegated into the physical implementation where they belong.

> --
> Jon
Received on Sun Sep 03 2006 - 19:38:01 CEST

This message: [ Message body ]
Next message: Johan Sjöström: "Re: Order details table reference live data"
Previous message: Alexandr Savinov: "Re: Order details table reference live data"
Maybe in reply to: vc: "Re: Relation Schemata vs. Relation Variables"
In reply to Brian Selzer: "Re: Relation Schemata vs. Relation Variables"
Next in thread: Jon Heggland: "Re: Relation Schemata vs. Relation Variables"
Reply: Jon Heggland: "Re: Relation Schemata vs. Relation Variables"
Reply: Brian Selzer: "Re: Relation Schemata vs. Relation Variables"
Reply: JOG: "Re: Relation Schemata vs. Relation Variables"
Reply: Jan Hidders: "Re: Relation Schemata vs. Relation Variables"
Reply: Sampo Syreeni: "Re: Relation Schemata vs. Relation Variables"
Reply: Brian Selzer: "Re: Relation Schemata vs. Relation Variables"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message