Re: A Question on Integrety

From: Brian Inglis <>
Date: Tue, 16 Dec 2003 11:02:16 GMT
Message-ID: <>

On Thu, 11 Dec 2003 22:47:52 -0000 in comp.databases.theory, "Tobin Harris" <> wrote:

>Hi there,
>I have a client who's business is to collect and improve data, wich is
>eventually published in a telephone directory. Initially, the data is
>allowed to be in a poor state: certain job roles require only finding
>minimal data and throwing it in. Later down the line, someone else has to
>improve the quality of this data so that it can be printed in the phone
>directory, perhaps by adding additional information or refining that already
>there. There are also many other stages where the rules governing the data
>are contextually sensitive to the current stage of the data in it's
>For me, this raises an interesting question about data integrety. How do I
>design a database for entities that are subject to different business rules
>throughout their lifecycle!? I could easily set up the relational integrety
>for any one "stage", but this would not be applicable at all times. For
>example, if data is ready for print, it has to conform to many rules.
>However, if it is undergoing quality control, then there are far less rules.
>I ask this mainly becuase it's not something I've come across when learnign
>about logical or physical modelling, and I was just wondering if anyone else

The stages could be defined explicitly by a stage column or implicitly by the presence of some column(s).
One common implicit method is to have a set of, for example, created by, checked by, verified by, approved by, userid and date / time columns for signoffs at each stage, with increasingly restrictive data requirements, and fewer users with application roles allowing them to sign off at each stage.

You could either add additional tables with the additional columns and rules required for each stage of improvement, or write more complex rules qualified by the (explicit or implicit) stage of the data. In either case, it would probably be useful to write predicate functions, for example, is_x_data_good_for_stage_2( key ), to determine whether the data meets all the criteria for being considered to be at the next stage of improvement.

I'd make the decision to split into different tables based on normal logical and physical design principles. For example, if you get a lot of rows in some preparatory stage, but a much smaller number in the succeeding stage, there might be a good physical case for keeping the subsequent columns in a different table. You could then drive selection of good data by a join on the latter table, rather than selecting based on column values or predicate function results.

Thanks. Take care, Brian Inglis 	Calgary, Alberta, Canada 	(Brian dot Inglis at SystematicSw dot ab dot ca)
    fake address		use address above to reply
Received on Tue Dec 16 2003 - 12:02:16 CET

Original text of this message