Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> comp.databases.theory -> Re: organizing tables?

Re: organizing tables?

From: Bob Badour <bbadour_at_golden.net>
Date: Tue, 22 Jul 2003 19:02:33 -0400
Message-ID: <wmkTa.1162$PW6.197750826@mantis.golden.net>


"Paul Vernon" <paul.vernon_at_ukk.ibmm.comm> wrote in message news:bfk1g8$l30$1_at_gazette.almaden.ibm.com...
> "Bob Badour" <bbadour_at_golden.net> wrote in message
> news:8YcTa.1150$K16.195515439_at_mantis.golden.net...
> > > Switching our argument for a moment to the task of deciding wheter a
given
> > > update can be gurenteed to work, this paper would help make such a
> decision
> > > for say a INSERT INTO .. SELECT FROM update, where the target table
has a
> > > uniuqe constraint.
> > >
> > > Now as thier necessary and sufficent condition cannot always be tested
> > > effiecntly, there will be some update statements will always work, but
the
> > > Saftey System will not let us spend the CPU to find that out (even if
just
> 1
> > > more cycle over the limit would have been enougth to make that
decsion. -
> > > that is the halting problem).
> >
> > Except that in the next release, due to an improved physical
implementation,
> > the dbms might reduce the CPU cost and suddenly the 'safety system' will
> > change the logical outcome. The physical implementation must not affect
the
> > logical outcome.
>
> Right, so in V4 of your dbms, a query always times out after 1 hour.

If I told V4 to evaluate the query, why should it timeout at all? First, you are assuming it will take a long time to evaluate the query, which is not necessarily true. Second, you are ignoring what I said previously that your proposed safety system would have to decide quickly whether to abort if it is to take the decision out of my hands.

> In V5
> your query now always returns in under the timeout limit.

Since I do not anticipate any timeout, my query returns regardless of the version--it's just a matter of when.

> In V4 the query estimator part of the 'safety system' tells you your query
> will definitely breach the timeout limit

Since I never gave the query a timeout limit, I have no interest in the dbms telling me this. I expect it to evaluate my query regardless.

> > Huh? A user must prepare to have any update fail and to react
appropriately
> > to the failure, because a logical change to the database might change an
> > update guaranteed to succeed into an update with the potential to fail.
>
> Logical changes (i.e. schema alterations) is for schema evolution to
handle.

Yes, of course. Logical changes might affect logical outcomes. Physical changes must not affect logical outcomes.

> Obviously this assumes that all 'repeatable' updates (and queries) are
known
> the the database so that any proposed alteration of the schema can be
> verified.

While automatic verification is always a benefit, I do not require it. In fact, I will accept a system that allows a change to the logical design to have effects on logical outcomes. As a designer, I might decide to support only partially an old version of an application. If any verification utility exists, it should warn me about behavioral changes, but it must not prevent me from instituting them.

> Obviously if there is no constraint to the effect that all schema
> changes cannot break an existing users query, then he has to accept that
> potential for fail. However a good practice is to give users advance
notice of
> such changes...

Giving users advance notice is irrelevant to a discussion of namespaces and logical database partitions. Whether advance notice is given will depend on the urgency of the change and a host of other business requirements.

> > Rejecting updates that violate the predicate of the database is inherent
to
> > the model.
>
> Only for individual updates against the current value of the database and
only
> after (or during) the physical update.

Huh? Constraints are logical not physical.

> I want to be able to classify the following updates into "Guaranteed to
Work",
> "Guaranteed to Fail" or "Can't Say"

To what purpose? The general cost of doing so is prohibitive; although, the dbms might optimize away many constraint evaluations as unecessary.

> A fixed (non-paramertised) update against *any* database value of a
given
> database schema
> A parameterised update against *any* database value of a given
database
> schema (for *any* parameter values)
>
> and maybe
> A parameterised update against the current database value (for *any*
> parameter values)
>
> plus maybe other conditions (e.g. take consideration of known database
values
> that can only be changed by the current user - i.e. locks...)

As I said previously, this is a hard problem. Ignoring the theory that says it is a hard problem won't help any.

> > If your 'safety system' allows the user to proceed with 'unsafe'
queries, I
> > do not see what function it performs.
>
> It allows you to go ahead with "Guaranteed to Work" and "Can't Say"
> queries/updates.

In other words, it does nothing. Once a query fails, it falls into the 'guaranteed to fail' category and before that it falls into the 'might succeed' category. If the dbms can detect that it falls into the 'guaranteed to fail' category at compilation, so much the better.

> The dbms does not allow subset database views where hidden constraints
exist
> that are not "Guaranteed to be Satisifed" for all possible updates for
users
> of the subset database view.

As my earlier example illustrated, this restriction is too severe, and ultimately, this decision belongs in the hands of a knowledgeable designer. You haven't addressed anything I have said.

> If a user can see a constraint on a relvar, then it's OK (in general) for
them
> to accept that an update might fail.
> If a user can't see any constraint on a relvar, then it's not OK for them
to
> accept that an update might fail IMO.

Accepted or not, they must prepare to deal with the eventuality. As I indicated earlier, rejected updates are part of the model. One ignores this fact at one's own peril.

> > Perhaps you are suggesting the dbms
> > include a tool like 'lint' for C. I have no objection to such a dba
tool,
> > but it has nothing to do with namespaces or with the logical data model.
>
> Remind me what lint does.

It is a C syntax checker in black studded leather with a bullwhip. It usually spits out several warnings per line of C code. http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=lint+C

A similar tool in a dbms might warn about every statement that might fail some constraint.

> > > > One may come up with algorithms that transform some constraints such
> that
> > > > they no longer mention hidden relvars, but the general case is a
hard
> > > > problem. In fact, I suspect it is fairly easy to prove that one
cannot
> > > > describe some simple constraints without explicit reference to a
> specific
> > > > relvar.
> > >
> > > Quite possibly. Maybe the question is, if I ban such database views
> > (subsets),
> > > does this restrict users too much?
> >
> > I strongly suspect that identifying such views is also a hard problem.
>
> Agreed. Does not mess my scheme up though. Just limits the number of
database
> views that we can guarantee are valid.

Banning anything on the basis of this messes things up.

> > > Alternatively, do users value being able to
> > > treat their database as absolutely a variable more than they value
> flexibility
> > > in the database views that can be constructed?
> >
> > Plenty of users would ask us to remove all those inconvenient
constraints to
> > variability, but that doesn't mean we should remove them.
>
> They may do. Hard to say from this distance, but agreed; users don't
always
> know what is best for them (even when they are willing to pay cash).
>
> > > However, I suspect that the question really is: can we even have a
self
> > > consistent model unless every database view/subset IS a variable?
> >
> > Since every database view is a variable, I do not understand the
importance
> > of the question.
>
> I have what seems to be a controversial definition of what a variable is:
>
> A variable is something that when set to a new value, that value is the
one
> it gets set to.

That's a gross oversimplification. In any typed system, setting a circle variable to a rectangular value should fail.

> A variable is something that can be set to any value that is not
constrained
> from not having.

I agree.

> A variable is something that if set to a value that it is not
constrained
> from not having, then that value is the one it gets set to.

I agree.

> A database that exhibits side-effects, or is a database view with hidden
> constraints is not a variable by my definition.

I disagree. Side-effects and hidden constraints simply constrain the values a variable can become.

> > > >Even if the security function prevents one from spelling out the
> > > > constraint to a user, the dbms can still enforce it.
> > >
> > > I know, but I just don't think we should allow such database views. At
> least
> > > not until we're convinced that we absolutely have to.
> >
> > I am already convinced and will rely on the judgement of a designer to
make
> > the final decision. I think the onus is on you to demonstrate that your
> > proposal solves a real problem and causes no harm.
>
> Humm, yes what problem was I solving again? ;-) Lets see: Logical data
> indpendence, view updating, the idea of a database, users views of a
database,
> security via data hidding, schema evolution, database program proving. You
> name it, this is (at least) a part of the solution.

I don't see how your proposal plausibly contributes to a solution of any of the above. It strikes me as an arbitrary constraint devised to ignore or to avoid all of the above without solution.

> [snip]
> > > So if it can't decide (in some reasonable amount of time), then it
should
> > > reject the proposed database view.
> >
> > No, it must accept it. Rejecting it means a change in the implementation
of
> > the dbms will alter the logical outcome, and I reject any such proposal
on
> > principle: the principle of physical independence, to be exact.
>
> I agree with physical independence (at least to the degree allowed by my
first
> comments in this post).
>
> The change in the outcome is
> before: we could not decide if the view was safe - view not
created
> after: we now have the reourses/expertised to decide that the
view
> is safe - view is created
>
> That no more breaks physical independence

That does break physical independence. If I develop and test a design on V5 software and we move to a more powerful computing platform where V4 has been ported but V5 has not yet been released, I don't want my design to fail for no valid logical reason. We are talking about the very essence of physical independence.

> > > Implementations can then compete on the complexity of cases where they
can
> > > prove that no constraints are hidden in a database view. Simple.
> >
> > Physical implementations must not alter the logical outcomes. They can
> > compete on performance. Your proposal sucks by any rational measure.
>
> How does it harm a user that after an upgrade they can now define a
database
> subset that they were not allowed to do before the upgrade?

Because the user might have to downgrade. It happens all the time.

> Remember I'm not saying that 'normal' database updates start working that
used
> to fail (short of those now coming in under resource limits whereas they
> before got canned), only database view definitions.

Without realising it, you are saying that I won't be able to evolve the database schema in ways that would cause me to write a view that violates an arbitrary, meaningless and physically dependent constraint. That would be intolerable.

> I see no problems.

With all due respect, you are not looking very hard.

> You seem to be denying that TIME exists (or at least that
> we're not allowed time limits because that breaks physical
independence!!).

I am not denying anything about time. If I need an answer, I need an answer. I might prefer to have it sooner than later, but if I need it, I need it regardless. If I need to secure a relvar, I need to secure a relvar. If I need to introduce a side-effect, I need to introduce a side-effect. Only a fully informed designer is in a position to decide whether a specific compromise is acceptable.

> OTTH, if you are still not convinced, then why not treat a dbms upgrade
like a
> schema change. I.e. revalidate all existing queries/updates to check for
any
> now made invalid. I can't see why we should need that here. Can you give
me an
> example where the positive change in ability to create a database view
would
> impact an application?

Reverse the change and every positive change becomes a negative change. Can you give me a justification for preventing me from supporting a version of an application when doing so would require a hidden constraint?

> > > After the upgrade, the dbms might now
> > > allow the creation of database views that it did not allow before when
it
> > > couldn't
> > > prove that no constraints were hidden.
> >
> > In other words, a change in the physical implementation changes what one
can
> > express a the logical level. Again, I reject that on principle, and I am
> > surprised you would propose it in the first place.
>
> I am just being 'logically practical'.

I disagree. You are being logically arbitrary. Until the statement violates a logical constraint, the dbms has no logical reason to reject it.

> Yours is a position that is only valid
> in the abstract world of maths where *time* can be ignored.

I am not the one pretending we can ignore computability theory. I am the one saying that, because we cannot ignore time, we must accept the non-ideal situation where a user won't know whether an update will work until the user actually tries to execute it. Before the user tries to execute the specific update, the cost of predicting success or failure is prohibitive, and after the user tries to execute it, it becomes necessary to evaluate success or failure regardless. Received on Tue Jul 22 2003 - 18:02:33 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US