Re: How to ensure data consistency?

From: Felix E. Klee <felix.klee_at_inka.de>
Date: Fri, 10 Sep 2004 20:26:16 +0200
Message-ID: <20040910202616.18988a79.felix.klee_at_inka.de>


On Fri, 10 Sep 2004 16:18:21 GMT Dan wrote:
> > > To avoid orphaned rows in subtables, it seems necessary to set up
> > > circular foreign key constraints.
>
> A viable alternative to this if PostgreSQL supports "instead of"
> triggers on views, is a set of views that the external world sees as
> "tables", each of which is composed of a join of the base relation
> representing the generalization and a single base relation
> representing a single specialization type. A trigger declared on this
> view would ensure the consistent population, update, or removal of
> both the generalized entity and the specialized entity with the
> application never being the wiser.
>
> This does involve the writing of some code in the trigger, but it does
> ensure a consistent manipulation of the generalization/specialization
> scheme if applications use this mechanism as its interface.

Thanks for pointing me to this solution. It's a very interesting concept that I may try out once I find the time.

> > The disadvantage of the model is of course that it contains at least
> > one null value in each row in TS.
>
> Nulls provide problems for me that are fundamental, but given the tools (and
> standard) we are given, they are sometimes difficult to avoid.

As someone who lacks the theory behind relational database design, I may miss the point why NULLs are bad. The main ugliness that I see in my data structure is that there is redundant data: For example b_t_id or c_t_id has to have the same value as t_id (enforced by a CHECK). Remember that this is necessary to ascertain (using foreign key constraints) that the relevant entries in the subtables aren't missing.

> [...]
>
> It seems like you've made a lot of progress. Thanks for posting the results
> of your efforts.

Well, actually the above solution is still incomplete (I shouldn't write stuff in the middle of the night, and I should check it - but currently my time is quite limited). For example the following bad data could be entered:

TS: t_id specialization_type b_t_id c_t_id

    1    b                   1      NULL

BS: t_id

    1

CS: t_id

    1 <-- This bad entry is not detected

To avoid this problem, the Celko solution again has to be incorporated. IOW: Foreign keys in BS and CS should point to the pair (t_id, specialization_type). There is a bunch of CHECKs involved in this solution, which should be rather obvious. Now, again the above example:

TS: t_id specialization_type b_t_id c_t_id

    1    b                   1      NULL

BS: t_id specialization_type

    1 b

CS: t_id specialization_type

    1 b <-- This bad entry is not possible because a CHECK enforces

                 specialization_type = c.

Disclaimer: I haven't tested this solution. So don't be surprised if it's still not watertight.

Felix Received on Fri Sep 10 2004 - 20:26:16 CEST

Original text of this message