Re: organizing tables?

From: Bob Badour <bbadour_at_golden.net>
Date: Tue, 22 Jul 2003 10:36:46 -0400
Message-ID: <8YcTa.1150$K16.195515439_at_mantis.golden.net>


"Paul Vernon" <paul.vernon_at_ukk.ibmm.comm> wrote in message news:bfj7js$gta$1_at_gazette.almaden.ibm.com...
> "Bob Badour" <bbadour_at_golden.net> wrote in message
> news:FJKRa.969$za.162044036_at_mantis.golden.net...
> [snip]
> > > OK, my 'solution' (assuming there is a problem here) is to throw the
> problem
> > > to the DBMS 'safety system'. This would trap any user request "know
> a-priori
> > > that a given update will 'work' or not" that would (or does) break a
> safety
> > > limit, and report back to the user that "sorry, can't actually
calculate
> that
> > > in reasonable time".
> >
> > You have just given your safety system the task of solving the halting
> > problem.
>
> Almost. I've given the safety system the task of dealing with it, not with
> solving it.
>
> [snip]
> > > > Disallowing updates to any relvar that might have a chance of
imposing a
> > > > hidden constraint removes the possibility of giving users adequate
> logical
> > > > independence.
> > >
> > > You know or are you just guessing?
> >
> > I know. While not the exact situation, I believe the following reference
> > deals with a similar problem:
> > http://citeseer.nj.nec.com/paulley94exploiting.html
>
> That's a nice paper, thanks. (and has some nice references)

Thank Jan. He tossed it at me a while ago. I suspect there are even more appropriate citations, but I am not sufficiently familiar with them to put my hands on them quickly.

> Switching our argument for a moment to the task of deciding wheter a given
> update can be gurenteed to work, this paper would help make such a
decision
> for say a INSERT INTO .. SELECT FROM update, where the target table has a
> uniuqe constraint.
>
> Now as thier necessary and sufficent condition cannot always be tested
> effiecntly, there will be some update statements will always work, but the
> Saftey System will not let us spend the CPU to find that out (even if just
1
> more cycle over the limit would have been enougth to make that decsion. -
> that is the halting problem).

Except that in the next release, due to an improved physical implementation, the dbms might reduce the CPU cost and suddenly the 'safety system' will change the logical outcome. The physical implementation must not affect the logical outcome.

> The user then has a choose, he has a set of updates that have been
guaranteed
> to work, and a set that have not (and maybe a further set that are
guaranteed
> to fail). He can, if he so wishes, run with updates from both sets and
take
> the risk. If the risk is unacceptable then he should only use the first,
> proven set.
>
> If an update to the DBMS optimization algorithms extends the set of
updates
> that can be guaranteed, then the user might want to revisit the set of
updates
> he uses, but there is no change in the logical behaviour of the dbms.

Huh? A user must prepare to have any update fail and to react appropriately to the failure, because a logical change to the database might change an update guaranteed to succeed into an update with the potential to fail. Rejecting updates that violate the predicate of the database is inherent to the model.

If your 'safety system' allows the user to proceed with 'unsafe' queries, I do not see what function it performs. Perhaps you are suggesting the dbms include a tool like 'lint' for C. I have no objection to such a dba tool, but it has nothing to do with namespaces or with the logical data model.

> His set
> of updates might switch in status from 'unproven' to 'proven', if he is
lucky,
> but there would be no behaviour change (except, that is, for the speed of
> update, if an update is guaranteed to work, the dbms can accept it and
return
> control immediately to the user, making the physical update in the
background.
> If the update might fail control can only be returned at the end of the
> physical update).

What would it do other than return control to the user?

> > One may come up with algorithms that transform some constraints such
that
> > they no longer mention hidden relvars, but the general case is a hard
> > problem. In fact, I suspect it is fairly easy to prove that one cannot
> > describe some simple constraints without explicit reference to a
specific
> > relvar.
>
> Quite possibly. Maybe the question is, if I ban such database views
(subsets),
> does this restrict users too much?

I strongly suspect that identifying such views is also a hard problem.

> Alternatively, do users value being able to
> treat their database as absolutely a variable more than they value
flexibility
> in the database views that can be constructed?

Plenty of users would ask us to remove all those inconvenient constraints to variability, but that doesn't mean we should remove them.

> However, I suspect that the question really is: can we even have a self
> consistent model unless every database view/subset IS a variable?

Since every database view is a variable, I do not understand the importance of the question.

> >Even if the security function prevents one from spelling out the
> > constraint to a user, the dbms can still enforce it.
>
> I know, but I just don't think we should allow such database views. At
least
> not until we're convinced that we absolutely have to.

I am already convinced and will rely on the judgement of a designer to make the final decision. I think the onus is on you to demonstrate that your proposal solves a real problem and causes no harm.

> [snip]
> > I have thought about it. The example involves both hidden updates and
hidden
> > constraints. If the manager violated a transition constraint that
references
> > the relvars he manipulates and the relvars hidden from view, the dbms
would
> > have to balk even though the security function prevents the dbms from
> > mentioning the hidden relvars.
> >
> > The triggered procedures might force every transition to match the
> > constraint, or they might not. In general, the dbms won't be able to
decide
> > whether this is the case; although, it might be able to decide some or
even
> > most of the time.
>
> So if it can't decide (in some reasonable amount of time), then it should
> reject the proposed database view.

No, it must accept it. Rejecting it means a change in the implementation of the dbms will alter the logical outcome, and I reject any such proposal on principle: the principle of physical independence, to be exact.

> Implementations can then compete on the complexity of cases where they can
> prove that no constraints are hidden in a database view. Simple.

Physical implementations must not alter the logical outcomes. They can compete on performance. Your proposal sucks by any rational measure.

> > Consider a situation where the dbms is upgraded and where the upgrade
> > includes an additional algorithm that would determine whether a specific
> > triggered procedure will guarantee a constraint.
> >
> > On the basis of this algorithm, the dbms might conclude it does not need
to
> > enforce a specific constraint because it now knows the data can never
> > violate it. (i.e. the dbms has a proof that the constraint always
evaluates
> > to true for a specific update.) The dbms might run faster, but there is
no
> > change in logical behaviour because the update never could violate the
> > constraint even when the dbms tested for it during the update.
> >
> > Consider the situation with your 'safety system'. Prior to the upgrade,
the
> > dbms lacks the ability to know whether to enforce the constraint and
might
> > conclude it is too expensive to decide the matter. (ie. the dbms balks
on a
> > specific update) After the upgrade, the dbms allows the update. The
addition
> > of a new optimization algorithm should not change the logical behaviour
of
> > the dbms.
> >
> > The 'safety system' approach you propose above will cause the logical
> > behaviour to change even though the logic remains the same.
> >
>
> Nice try. However it's not updates that the dbms prevents, it's the
creation
> of
> the database view in the first place.

Same difference. A change in the physical implementation of the dbms will change the logical outcome. In fact, this is even worse because a change in the physical implementation of the dbms will change what one can even express logically.

> After the upgrade, the dbms might now
> allow the creation of database views that it did not allow before when it
> couldn't
> prove that no constraints were hidden.

In other words, a change in the physical implementation changes what one can express a the logical level. Again, I reject that on principle, and I am surprised you would propose it in the first place. Received on Tue Jul 22 2003 - 16:36:46 CEST

Original text of this message