Re: Coul constraint *enforcement be done at compile time vs runtime ?
Date: Wed, 17 Jun 2009 04:54:52 -0700 (PDT)
On 16 juin, 14:47, "Brian Selzer" <br..._at_selzer-software.com> wrote:
> "Cimode" <cim..._at_hotmail.com> wrote in message
> > As the subject says...
> > The traditional direct image system approach consists of having
> > relations constraints implemented at the time when the relation is
> > updated meaning during UPDATE/DELETE/INSERT operations. As a part of
> > a research effort to understand better what may be a relational system
> > implementation, I have discovered that it is actually possible to
> > perform constraint enforcement at constraint declaration time.
> Can you be more specific? Are you saying that /all/ constraints can be
> enforced at compile time? When a user submits an illogical update, for
> example, a delete that targets a tuple that is referenced in an inclusion
> dependency, what do you call the process of determining that it is an
> illogical update, if not constraint enforcement? Moreover, while domain
> definitions partition the set of all particulars in the Universe into a set
> of domains and while state constraints shape the Universe into a set of
> possible worlds, transition constraints specify which possible worlds are
> directly accessible from the distinguished actual world, which is
> represented by the current database value (the redundancy in the terms,
> 'the' and 'current'.and in the terms 'database' and 'value' being
> intentional for emphasis). As a consequence, since the enforcement of
> transition constraints depend on the current database value, how can one
> determine at constraint declaration time whether a particular update
> violates a transition constraint?
I realize I have not been explicit enough by what I mean by *enforcement*. I apologize but it is difficult for me to summarize a decade effort of design into a single thread.
To answer your question, let me use a significant example that will underline the difference in approach between the way a direct image system handles an INSERT constraint check and the way a system implementing independence between the logical and physical layer would work. I won't get into too much detail since that would require very lengthy explanations onto how the relation representation scheme works but I will attempt to provide some explanations to clarify my point.
In a typical direct image system such as ORACLE, DB2 or SQL Server, the typical INSERT constraint checking is handled the following fashion (there may be some variations but the general algorhythmics remains the same):
Considering the 3 following relations SCHOOL, STUDENT where one SCHOOL may have 0 or more STUDENTS. Let's suppose I want to INSERT 3 new STUDENTS while checking that they do belong to a SCHOOL that belongs to SCHOOL body. The traditional direct image system works in the following fashion
--> A transaction context is created for the INSERT when the
instruction is submitted to the engine.
--> The *to be inserted* new STUDENT tuples in are matched against
SCHOOL to enforce constraint checking. The operation involves an *in between* (before COMMIT) issue/execution of SELECT statements that check the physical existence of matching SCHOOLS in SCHOOL relation.
--> If SOME or ALL tuples are matched, a COMMIT OK flag is issued to
allow the transaction to continue. Oppositely, when SOME or ALL of the tuples are NOT matched in SCHOOL, a COMMIT KO is issued to indicate to the system that there would be a constraint violation if the current operation would actually be completed. The COMMIT KO, an internal data sublanguage instruction, then usually ends up with a ROLLBACK external language instruction.
By the above mechanism, a traditional direct image system, the system
analyzes the *intent* at updating the body of relation and expresses
internally whether or not that intent would actually result in a
constraint violation. Among the many drawbacks that come from such
--> The necessary *analysis of intent* requires multiple SELECT
statements to be issued. These SELECT statement will occur everytime a set of tuples are to update on a specific relation and are in fact orthogonal overhead to the INSERT transaction context. For instance, that even if one issues an INSERT statement for a similar value to be inserted, SELECT statements on that value will occur as many times as the INSERT statement is issued.
--> By binding the analysis of intent solely to the transactional
context of a single INSERT transaction, the system will actually lock any other transactions from updating the content of both the STUDENT and the SCHOOL transaction. STUDENT will be locked for allowing the INSERT and SCHOOL will be locked to allow the SELECT statement (there are wayss around that but they require additional ressources to be consumed; I am thinking about ORACLE Read consistency and SQL Server SNAPSHOT mode). All this comes at a great cost as far as concurrency is concerned and resoource consumption.
There's in fact a fundamental logical problem hidding behind the above physical problem. The non distinguishability between the two contexts, the context of the operation and the context of analysis of intent is a direct consequence onto how the internal relation is to be operated. In enforcing constraints and performing a UNION, direct image systems *internally* assume the following elementary relational premises:
--> Premice 1: An user INSERT/UNION user operation between STUDENTS
and NEW_STUDENTS is in fact 2 relational internal operations PROJECTION (SELECT for analysis of intent)-UNION(COMMIT INSERT). That is a direct consequence of the fact that when operated the two relations are not of the same type so the analysis of intent imposes a necessary *conversion* process to allow the external UNION to be performed.
--> Premice2: The UNION operation occurs according to the following
STUDENTS UNION NEW_STUDENTS = STUDENTS
<--> NEW_STUDENTS = EMPTY no matter what!!!
Premice 1 is a consequence of the fact that NEW_STUDENTS internal representation has no mechanism that would represent it as STUDENT to allow the analysis of intent to be implicit. Premice 2 OTOH is a direct consequence of flawed relational theory. In fact, a more reasonnable approach is to state that
STUDENTS UNION NEW_STUDENTS = STUDENTS2 where STUDENTS, NEW_STUDENTS,, and STUDENTS2 are relations sharing the same header. As a consequence, STUDENTS2 in fact equates to STUDENTS *if and only if* NEW_STUDENTS is empty OR if NEW_STUDENTS would provoque a constraint violation. This is in fact a fundamental difference from traditional relational approach with important consequences:
--> The argument that indicate that a relational operation output such
as a UNION *always* ends up with one of the relations involved in the operation is a fallacy. That fact is a direct consequence of underestimating the relation *naming* process to the profit of headers and bodies.
--> A truly relational system can *only* perform operations between
relations of the same type. A system that does not guarantee representations can not be perform relational operations neither it can do analysis of intent efficiently.
--> A UNION operation between two relation always produces a third
--> To operate relations, a relational operation engine handler must
have an *isotyping* mechanism (same type) that would necessarily represent the relation variables as being of the same type. The isotyping mechanism is directly bound to the operation.
--> The isotyping mechanism would allow to internally solve
STUDENTS UNION NEW_STUDENTS = STUDENTS2 and consider a constraint
violation *if and only if* STUDENTS2 = STUDENTS. I can not present
how my engine does it in detail, because it would impose an entire
book to write, but what I can say is that the relation representation
scheme used allows a relation level projection *not* a tuple level.
The representation scheme allows for instance to equate two un-ary
relations without resorting to elementary tuple based projections.
--> Since the isotyping mechanism is bound to the operation, and
inherently allows the analysis of intent, each time a relational operation is compiled, the analysis of intent is performed, leaving only the UNION operation to be performed at run time
What I can say is that this engine is currently working (still fixing issues though especially with virtual relation types) but it performs basic relational operations according to the above principles. The advantages are numerous:
--> A system that operates relations not values
--> A more economic system since no overhead is done in constraint
checking. Only the demanded operation is actually executed.
--> The system implements the same mechanism between any pair of N-ary
relations. Since these relations can share the same header it greatly simplifies for instance dupplicate checking.
--> I am still working on the system language compiler and fixing some
issues and hope to have a prototype in the next 2 years...
I hope this clarified... Received on Wed Jun 17 2009 - 13:54:52 CEST