Re: Declaring Unenforced Constraints

From: Tony Andrews <andrewst_at_onetel.com>
Date: 18 Nov 2004 06:57:56 -0800
Message-ID: <1100789876.678277.215670_at_c13g2000cwb.googlegroups.com>


Kenneth Downs wrote:
> So, to get back to the original thread. The solution was to add what
we
> called "OK TO FAIL" to all constraints. What this meant was that the
build
> would not abort on a constraint create failure, it would write it as
a
> warning and keep going. This allowed us to get to 100% by degrees,
instead
> of in a big lurch.
>
> They eventually got fixed, mostly. At this shop the CEO would
ocassionally
> show up with somebody and say, "this is my friend Johann Schmitt,
perhaps
> you have a spot for him?" What do you know! We did! Eventually one
of
> these FOC's (friend of CEO) actually had some SQL skills, and we put
her on
> to running SELECTs to find offending data and working with the client
to
> manually purge out the violations. Then we'd run an upgrade pass,
which
> would notice the constraints were not there and would try to put them
back
> in, and would finally succeed.
>
> In the end, the bigger clients all had their legacy constraints in
place and
> new improvements came in with the constraints in place from day 1.

This sounds very similar to the position my client is currently in. They are a software house with a product that has many implementations at their customer sites. The Oracle database design pre-dates the ability to declare constraints in the database, and although primary and foreign keys have been added there are few check constraints. One core table represents about 15 subtypes of an entity, each with its own "rules" that are embedded in the application code. Frequently, new code is written by developers that violates one or more of these rules, but in subtle ways that are not caught in system testing and only cause problems to some or all customers some months later. By this time, the data error is so entrenched that it is hard to write a data fix script to put things right (rather like trying to turn a cake back into its ingredients to make a different cake). Also, the data fix scripts are quite capable of introducing yet more errors of a similar nature!

I am advocating adding constraints to enforce the rules in the database, so that bad code gets caught immediately and data fixes are not required. However, I have to recognise that this is a non-trivial undertaking, because:

  1. probably every customer has some data that violates the constraint, even if it is historic data that will never be processed again. So the "ENABLE NOVALIDATE" option will need to be used in most cases (unless it is considered worth while fixing the data at every customer site at the same time as implementing the constraint).
  2. extensive regression testing will be required, because it is highly likely that various parts of the application will be found to violate the constraint under some circumstances. Even then, we must be prepared that some customer will hit a constraint violation that out regression testing missed.

It is early days yet, and I am wondering whether to suggest we go for a "big bang" approach where we implement tons of constraints at once, regression test like mad, and then wait for the screams; or a drip-feed where we implement a smaller number of constraints with each upgrade, regression test like mad, and hopefully get less issues, spread over a longer time frame. Received on Thu Nov 18 2004 - 15:57:56 CET

Original text of this message