Re: organizing tables?

From: Paul Vernon <paul.vernon_at_ukk.ibmm.comm>
Date: Tue, 22 Jul 2003 12:33:53 +0100
Message-ID: <bfj7js$gta$>

"Bob Badour" <> wrote in message news:FJKRa.969$ [snip]
> > OK, my 'solution' (assuming there is a problem here) is to throw the
> > problem to the DBMS 'safety system'. This would trap any user request to
> > "know that a given update will 'work' or not" that would (or does) break a
> > limit, and report back to the user that "sorry, can't actually calculate
> > in reasonable time".
> You have just given your safety system the task of solving the halting
> problem.

Almost. I've given the safety system the task of dealing with the halting problem, not of solving it.

> > > Disallowing updates to any relvar that might have a chance of imposing a
> > > hidden constraint removes the possibility of giving users adequate
> > > independence.
> >
> > You know or are you just guessing?
> I know. While not the exact situation, I believe the following reference
> deals with a similar problem:

That's a nice paper, thanks. (and has some nice references)

Switching our argument for a moment to the task of deciding whether a given update can be guaranteed to work, this paper would help make such a decision for, say, an INSERT INTO .. SELECT FROM update where the target table has a unique constraint.
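To make that concrete, here is a small sketch (using Python's sqlite3 module; the table and column names are invented for illustration) of the kind of decision in question: before running an INSERT INTO .. SELECT, check a sufficient condition under which the unique constraint on the target cannot be violated.

```python
import sqlite3

# Illustrative schema only: 'target' and 'source' are invented names.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE target (k INTEGER UNIQUE);
    CREATE TABLE source (k INTEGER);
    INSERT INTO target VALUES (1), (2);
    INSERT INTO source VALUES (2), (3);
""")

# Sufficient condition for "INSERT INTO target SELECT k FROM source"
# to be guaranteed to work: the source keys are duplicate-free and
# disjoint from the keys already in the target.
overlap = con.execute(
    "SELECT COUNT(*) FROM source WHERE k IN (SELECT k FROM target)"
).fetchone()[0]
dups = con.execute(
    "SELECT COUNT(*) FROM"
    " (SELECT k FROM source GROUP BY k HAVING COUNT(*) > 1)"
).fetchone()[0]

guaranteed = (overlap == 0 and dups == 0)
print(guaranteed)  # False: source key 2 already exists in target
```

Of course, the point of the Safety System discussion is that such a test is itself a query, and it may cost more to evaluate than the budget allows.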

Now, as their necessary and sufficient condition cannot always be tested efficiently, there will be some update statements that would always work, but the Safety System will not let us spend the CPU to find that out (even if just one more cycle over the limit would have been enough to make that decision; that is the halting problem).

The user then has a choice: he has a set of updates that have been guaranteed to work, and a set that have not (and maybe a further set that are guaranteed to fail). He can, if he so wishes, run with updates from both sets and take the risk. If the risk is unacceptable, then he should only use the first, proven set.

If an upgrade to the DBMS optimization algorithms extends the set of updates that can be guaranteed, then the user might want to revisit the set of updates he uses, but there is no change in the logical behaviour of the dbms. His set of updates might switch in status from 'unproven' to 'proven', if he is lucky, but there would be no behaviour change. (Except, that is, for the speed of update: if an update is guaranteed to work, the dbms can accept it and return control immediately to the user, making the physical update in the background. If the update might fail, control can only be returned at the end of the physical update.)
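The three sets above can be sketched as a toy classifier (every name here is invented; a real prover would attempt a constraint proof within a CPU budget, whereas this one just compares a pretend "cost" against the budget):

```python
from enum import Enum, auto

class Verdict(Enum):
    PROVEN = auto()      # guaranteed to satisfy all constraints
    UNPROVEN = auto()    # the Safety System gave up within its budget
    IMPOSSIBLE = auto()  # guaranteed to violate a constraint

def classify(update, prove, budget):
    """Toy safety system: `prove` returns True/False if it can decide
    within `budget` cycles, or None if it runs out of budget."""
    verdict = prove(update, budget)
    if verdict is True:
        return Verdict.PROVEN
    if verdict is False:
        return Verdict.IMPOSSIBLE
    return Verdict.UNPROVEN

# Toy prover: an update is a (proof_cost, would_succeed) pair, and it
# is decidable only if its proof cost fits within the budget.
def prove(update, budget):
    cost, ok = update
    return ok if cost <= budget else None

updates = [(5, True), (5, False), (50, True)]
print([classify(u, prove, budget=10) for u in updates])
```

The last update really would succeed, but with a budget of 10 it stays UNPROVEN; a dbms upgrade (a smarter prover, in this toy a larger budget) can move it to PROVEN without changing which updates logically succeed.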

> One may come up with algorithms that transform some constraints such that
> they no longer mention hidden relvars, but the general case is a hard
> problem. In fact, I suspect it is fairly easy to prove that one cannot
> describe some simple constraints without explicit reference to a specific
> relvar.

Quite possibly. Maybe the question is: if I ban such database views (subsets), does this restrict users too much? Alternatively, do users value being able to treat their database as truly a variable more than they value flexibility in the database views that can be constructed? However, I suspect that the real question is: can we even have a self-consistent model unless every database view/subset IS a variable?

> Even if the security function prevents one from spelling out the
> constraint to a user, the dbms can still enforce it.

I know, but I just don't think we should allow such database views. At least not until we're convinced that we absolutely have to.

> I have thought about it. The example involves both hidden updates and hidden
> constraints. If the manager violated a transition constraint that references
> the relvars he manipulates and the relvars hidden from view, the dbms would
> have to balk even though the security function prevents the dbms from
> mentioning the hidden relvars.
> The triggered procedures might force every transition to match the
> constraint, or they might not. In general, the dbms won't be able to decide
> whether this is the case; although, it might be able to decide some or even
> most of the time.

So if it can't decide (in some reasonable amount of time), then it should reject the proposed database view.

Implementations can then compete on the complexity of cases where they can prove that no constraints are hidden in a database view. Simple.
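That gating rule can be sketched in a few lines (again a toy, with invented names; a real checker would try to prove that a view definition hides no constraints, within some budget):

```python
# Toy checker: proving a view constraint-free takes effort proportional
# to its "complexity"; beyond the budget the checker returns None
# ("don't know") rather than an answer.
def checker(view, budget):
    complexity, clean = view
    return clean if complexity <= budget else None

def allow_view(view, budget):
    # 'Don't know' counts as rejection: the dbms balks at creating a
    # view that might hide a constraint, rather than accepting it.
    return checker(view, budget) is True

views = [(3, True), (3, False), (30, True)]
print([allow_view(v, budget=10) for v in views])  # prints [True, False, False]
```

Note that the third view is in fact constraint-free; a better implementation (a larger budget here) may admit it later, but that changes only which views can be created, not the logical behaviour of views already admitted, which is the point made below about upgrades.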

> Consider a situation where the dbms is upgraded and where the upgrade
> includes an additional algorithm that would determine whether a specific
> triggered procedure will guarantee a constraint.
> On the basis of this algorithm, the dbms might conclude it does not need to
> enforce a specific constraint because it now knows the data can never
> violate it. (i.e. the dbms has a proof that the constraint always evaluates
> to true for a specific update.) The dbms might run faster, but there is no
> change in logical behaviour because the update never could violate the
> constraint even when the dbms tested for it during the update.
> Consider the situation with your 'safety system'. Prior to the upgrade, the
> dbms lacks the ability to know whether to enforce the constraint and might
> conclude it is too expensive to decide the matter. (ie. the dbms balks on a
> specific update) After the upgrade, the dbms allows the update. The addition
> of a new optimization algorithm should not change the logical behaviour of
> the dbms.
> The 'safety system' approach you propose above will cause the logical
> behaviour to change even though the logic remains the same.

Nice try. However, it's not updates that the dbms prevents; it's the creation of the database view in the first place. After the upgrade, the dbms might now allow the creation of database views that it did not allow before, when it couldn't prove that no constraints were hidden.

Paul Vernon
Business Intelligence, IBM Global Services

Received on Tue Jul 22 2003 - 13:33:53 CEST
