Re: Relation Schemata vs. Relation Variables
Date: Mon, 04 Sep 2006 13:03:42 GMT
Message-ID: <OuVKg.8075$q63.5940_at_newssvr13.news.prodigy.com>
"Jon Heggland" <jon.heggland_at_idi.ntnu.no> wrote in message
news:edfcr6$uvq$1_at_orkan.itea.ntnu.no...
> Brian Selzer wrote:
>> "Jon Heggland" <jon.heggland_at_idi.ntnu.no> wrote in message
>> news:edecf7$diu$1_at_orkan.itea.ntnu.no...
>>> You can pair them in any way that makes sense for the constraint at
>>> hand, by specifying how the pairing is to be performed. Tuple identity
>>> is a red herring; it is neither needed nor wanted.
>>
>> This is a contradiction. A transition constraint must either be able to
>> pair the elements of two relations for comparison or be given a set of
>> pairs
>> to compare.
>
> No, it just needs two relation values to compare. There's no
> contradiction. What is the problem?
>
How does it compare them? JOIN? JOIN pairs tuples based on a candidate key. If you're comparing aggregates, then I guess you could get away without pairing tuples.
>> particular change, but without tuple identifiers it's possible for all
>> keys
>> to change at the same time. Do you treat such an occurance as a
>> delete...insert?
>
> It is equivalent to to a delete-insert performed as a multiple
> assignment (i.e. without constraint checking in the middle), yes.
>
>> Just as there may be constraints that apply only to tuples
>> that are about to be removed, there may be constraints that apply only to
>> tuples that are about to be added. How can the system know whether or
>> not a
>> particular transition constraint applies?
>
> That is of course given by how the TC is formulated. With relvar-based
> constraints there isn't any classification into insert, update and
> delete constraints. There is just a condition (usually) involving an old
> relation value, and a new one. It really isn't so complicated.
>
>> What I'm saying is that using a candidate key value beyond a single
>> database state, such as in R JOIN R', implies that that value carries
>> with
>> it additional meaning beyond what is imparted to it by its definition.
>> When
>> you specify a candidate key, you're attaching significance to a
>> particular
>> set of attributes, but the significance of a particular set of attribute
>> values is limited to a single database instance by the definition of a
>> candidate key. Therefore, using a set of attribute values from one state
>> to
>> identify a tuple in another implies that that significance exceeds the
>> bounds of a single database state.
>
> I have a hard time making sense of the above. Are you saying that a
> given tuple may not mean the same thing in successive database states?
>
Not necessarily. That's certainly a possibility. How a tuple is interpreted is up to the user, not the model. On the other hand, the values determined by a candidate key value in one database state can certainly be different than the values determined by the same candidate key value in the next. So, in one sense, the "meaning" of a candidate key value can be different in successive database states. What complicates matters is that keys can change. If you define a constraint using R and R', and the key changes, then a tuple in R has the old key value and a tuple in R' has the new key value, and R JOIN R' will not include the change. Worse, if two tuples are affected by a change and the key values are swapped, then R JOIN R' will contain both, but not as might have been intended.
>> I'm not saying that that is wrong, but
>> the model must be able to support *all* situations, and "key updates" and
>> R
>> JOIN R' just don't mix.
>
> Why not? What is the problem? Earlier, I gave an example that showed how
> relvar-based TCs handle key updates *better* than tuple-based TCs. How
> does your scheme fare in that example? Have you given any concrete
> examples of how your TCs are formulated?
>
That's a sticky one. It's clear from the update statement that salary
wasn't targeted by the update, but the overall effect of the update caused
employee #1's salary to decrease. That begs the question: should an
indirect change (in one sense a side-effect) cause an update to fail? Since
additional information wasn't provided in this case, it can be safely
assumed that the update in question is just a "name" change, that is, ID
with value 0 in the current state refers to the same employee that ID with
value 1 in the proposed state does, so
IS_EMPTY(Ru WHERE Salary' < Salary)
suffices. However, if ID also participates in a foreign key to some
employee table, such as
S {ID, LAST, FIRST}
IS_EMPTY(Ru SEMIJOIN Su WHERE ID' <> ID AND Salary' < Salary)
AND IS_EMPTY((Ru SEMIMINUS Su) {ID, Salary}
JOIN ((Ru SEMIMINUS Su) {ID', Salary'} RENAME ID' AS ID) WHERE Salary'
< Salary)
SEMIMINUS Su WHERE Salary' < Salary)
AND IS_EMPTY((Rd JOIN EXTEND Ri ADD (ID' AS ID))
SEMIJOIN Su WHERE ID' <> ID AND Salary' < Salary)
I think the above is a correct formulation. The first IS_EMPTY verifies
that Salary hasn't been lowered along with a "name" change. The second
verifies that Salary hasn't been lowered as a side effect of switching IDs.
The third verifies that Salary hasn't been lowered as the result of an
assignment that targets only R. The fourth verifies that Salary hasn't been
lowered as a result of an assignment to R along with a "name" change.
The above constraint may appear complicated, but it accomplishes much more
than one that only joins R and R'. In your solution it appears that you
expect that a value for ID means the same thing in successive database
states, that is, it represents the same employee. (At least that's what I
gathered from your comments.) Because "key updates" are allowed, that just
AND IS_EMPTY((Rd JOIN EXTEND Ri ADD (ID' AS ID))
>>>> There most certainly is! If you define a transition constraint that
>>>> states:
>>>> you can't delete a tuple if quantity is greater than 0, then splitting
>>>> such
>>>> an update into delete...insert must necessarily fail if any tuple has
>>>> quantity greater than zero!
>>> Only if you check the constraint *between* performing the delete and the
>>> insert. And that would be a logical mistake.
>>
>> Only if the system knows that the particular modification is an update.
>> With assignment, it can't be determined what kind of modification was
>> intended.
>
> Please. If I perform an assignment, and the DBMS implements it as a
> delete followed by an insert, and checks my TCs in the middle, it is
> clearly faulty. It should check my TCs after the entire statement has
> been performed. This is rather elementary.
>
>>> The transition should be rejected (of course depending on the actual TC
>>> formulation, which you unfortunately don't provide). If you want the
>>> database to know that Lynn left the line and then reentered, you must
>>> tell it so by performing two updates: a DELETE first, and an INSERT
>>> afterwards (or the equivalent relational assignments, of course). The
>>> hypothetical TCs would permit this.
>>>
>>> You do have a small point: that there may be multiple "paths" between
>>> two database states, and some of these paths may violate TCs and others
>>> not. Thus, you may need to tell the database which path corresponds to
>>> reality (or whatever it is you're representing). But as I show above,
>>> that is possible---even simple---without resorting to your transition
>>> sets. Your scheme adds complexity, but no power.
>>
>> It is possible only if you assume that keys cannot change.
>
> I assume no such thing. What, precisely, is the problem with TTM-style
> TCs and relational assignment in your example? Surely you brought it up
> to prove a point?
>
>> In order to
>> reach the level of stability required to define transition constraints
>> based
>> on R and R' that are *always* enforcible, you must in many cases resort
>> to
>> using object identifiers. Take the example of the line at the bank. For
>> POSITION, there's no need for an object identifier, because the value
>> represented is simply a natural number and it's meaning is intrinsic.
>> For
>> PERSON, however, do you use customer number? What if the customer number
>> changes because the user that added the customer keyed in the wrong
>> information and is now correcting a mistake?
>
> Like I said, the database cannot enforce truth, only consistency. The TC
> in this case would say that *customer numbers* may not jump the line,
> and the DBMS would be quite right in rejecting an update of the customer
> number of a person in a line. It cannot know that the two customer
> numbers represent the same person unless you tell it so.
>
But you can tell it so when you issue an update, but that information is lost in the translation into R'. Consider the following translation of an UPDATE taken from CHAP05.pdf on the TTM web site, Page 5.24-5.25:
UPDATE r ( Ai := X, Aj := Y )
is equivalent to
( ( EXTEND r ADD ( X AS Bi, Y AS Bj ) ) { ALL BUT Ai, Aj } )
RENAME ( Bi AS Bk,Bj AS Aj, Bk AS Ai )
If either Ai or Aj is a prime attribute, then before the projection, the information required to identify tuples in both R and R' exists, but that information is lost once the projection is evaluated.
> You seem to be hung up on the idea that each tuple in a relvar
> corresponds to some real-world object or entity, and that while every
> attribute in the tuple may change, still this mystical connection to the
> entity remains. It is perhaps natural---and even convenient, I'll admit,
> if you want the database to handle TCs about "persons" without
> formalising exactly what a "person" is, though I'll have to see exactly
> how your transition set TCs actually look and work before I pass final
> judgement on that---but it is OID thinking.
>
Key updates are needed so that typos can be corrected. It's as simple as that. If your constraint joins R and R', and keys have changed, then some updates may be ignored by the constraint when a key changes along with what is constrained, and some updates that only correct typos may violate the constraint. So unless you use object identifiers, or tuple identifiers, or immutable, system-generated surrogates instead of keys that may at some point need to be updated, R and R' are not sufficient for defining transition constraints.
Each tuple in a relation represets a fact about something in the domain of discourse. If that's what you mean by "corresponds to some real-world object or entity," then, yes, I guess I am hung up on that idea. The reason is simple: values do not change, but things can; therefore, a transition constraint must be concerned with how things can change, not just how two sets of values are different.
>> Because
>> key stability has no bearing on whether or not a constraint can be
>> enforced,
>> immutable, system-generated surrogate keys, object identifiers, and tuple
>> identifiers are relegated into the physical implementation where they
>> belong.
>
> It is a matter of perspective, but I'd say that this schemes *depends*
> on OIDs, since it seems you want a given tuple to represent the same
> (say) person even when every attribute is changed. You avoid putting the
> OIDs explicitly in the logical model, but they must be present in the
> mind of the user.
>
> Note, though, that my point is not disparaging your scheme---it seems
> workable, though a bit complex. I just believe your criticism of
> relational assignment and relvar-based TCs is unfounded.
> --
> Jon
Received on Mon Sep 04 2006 - 15:03:42 CEST