Declaring Unenforced Constraints

From: Laconic2 <laconic2_at_comcast.net>
Date: Fri, 5 Nov 2004 10:33:09 -0500
Message-ID: <ibSdndOo5dOgABbcRVn-2g_at_comcast.com>



There's a feature I would like to see in today's RDBMSes. It's the ability to declare a constraint to be true, but to tell the DBMS not to bother enforcing it. I'm talking here about SQLRDBMSes, and probably TRDBMSes as well.

So what's the point?

Well, there are a lot of data architects who don't want the DBMS to enforce a constraint "for performance reasons". The merits of that argument vary widely from case to case, but there's some merit to it. It's particularly true if the data can be guaranteed to be "cleansed" by the transactions that write the data.

Is it possible to do such a thing? sure. Present day RDBMSes have a feature that declares an index "inactive", meaning that it still exists, and it's definition is still there, but updates don't have to update the index, at least not right now. You could do the same thing with constraints.

Yes, but why declare a constraint if the DBMS isn't going to enforce it?

There are several agents or persons that might benefit.

Let me start with the optimizer. There are many situations where "SELECT DISTINCT" and "SELECT ALL" can be proven to yield the same results, but one of them costs a lot more than the other. It would be useful to be able to let the user do a "SELECT DISTINCT", but let the optimizer pick a cheaper "SELECT ALL" strategy, if it's going to give the right answer.

Very often, the proof that "SELECT ALL" and "SELECT DISTINCT" are equivalent for a given query depends on the constraints placed on the data. It doesn't matter whether the constraints are enforced by the DBMS or the transactions that write the data. As long as we can be guaranteed that the constraint holds, the optimizer can do a better job knowing the constraints than not knowing them.

Second, the engineers who build the RDBMS can write a "constraint validator" that simply goes through all the unenforced constraints, and looks for exceptions in the data. Thus if any transaction slips up, and writes data that violates the constraint, we will find out about it, although not right away.

In fact, this is useful at present. In Oracle RDBMS, there's an option under SQL*Loader to bypass constraint checking. It makes SQL*Loader run faster. When you finish doing that, it's real useful to be able to know that the data just loaded in fact conforms to the constraints.

There may be implications of this for Kenneth Down's "system of record for metadata", but I'm not at all sure about this.

Your turn. Received on Fri Nov 05 2004 - 16:33:09 CET

Original text of this message