Re: Nulls, integrity, the closed world assumption and events

From: David <davidbl_at_iinet.net.au>
Date: 8 Jan 2007 15:15:00 -0800
Message-ID: <1168298100.060872.169260_at_v33g2000cwv.googlegroups.com>


JOG wrote:
> David wrote:
>
> > Consider the following relation
> >
> > person(P,M,F) :- person P has mother M, father F.
> >
> > Suppose M,F are non-nullable foreign keys with enforced referential
> > integrity back into the person relation. By induction a non-empty
> > database would have to be infinite.
> >
> > One possible solution is to allow M,F to be null. This proposal is at
> > odds with the purity of the predicate calculus.
>
> Would a more elegant and correct solution not be to have two relations:
>
> person(P, Sex)
> parentage(P, M, F)
>
> where P is the candidate key of parentage, but where P, M and F have
> enforced referential integrity (with a check constraint on sex) back to
> the person relation?

This is similar to the preferred solution in my original post. The birth event predicate allows only a subset of the persons to have their mother and father specified.
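
For concreteness, here is a minimal sketch of the two-relation design, assuming SQLite through Python's sqlite3 module. The column names m_sex and f_sex and the composite foreign keys are assumptions introduced for illustration: they are one possible way to get the "check constraint on sex", not necessarily what JOG had in mind.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
con.executescript("""
CREATE TABLE person (
    p   TEXT NOT NULL PRIMARY KEY,
    sex TEXT NOT NULL CHECK (sex IN ('F', 'M')),
    UNIQUE (p, sex)            -- lets parentage reference (person, sex) pairs
);
CREATE TABLE parentage (
    p     TEXT NOT NULL PRIMARY KEY REFERENCES person (p),
    m     TEXT NOT NULL,
    m_sex TEXT NOT NULL DEFAULT 'F' CHECK (m_sex = 'F'),  -- hypothetical helper column
    f     TEXT NOT NULL,
    f_sex TEXT NOT NULL DEFAULT 'M' CHECK (f_sex = 'M'),  -- hypothetical helper column
    FOREIGN KEY (m, m_sex) REFERENCES person (p, sex),
    FOREIGN KEY (f, f_sex) REFERENCES person (p, sex)
);
""")

# The four propositions from JOG's example.
con.executemany("INSERT INTO person VALUES (?, ?)",
                [("P1", "F"), ("P2", "M"), ("P3", "M"), ("P4", "F")])
con.executemany("INSERT INTO parentage (p, m, f) VALUES (?, ?, ?)",
                [("P3", "P1", "P2"), ("P4", "P1", "P2")])

def try_insert(p, m, f):
    """Return True if the parentage row is accepted, False if a constraint rejects it."""
    try:
        con.execute("INSERT INTO parentage (p, m, f) VALUES (?, ?, ?)", (p, m, f))
        return True
    except sqlite3.IntegrityError:
        return False

swapped_ok = try_insert("P1", "P2", "P1")  # a male listed as mother
unknown_ok = try_insert("P2", "P1", "P9")  # P9 is not in person

# Persons without recorded parents are simply absent from parentage: no nulls.
founders = [row[0] for row in con.execute(
    "SELECT p FROM person WHERE p NOT IN (SELECT p FROM parentage) ORDER BY p")]
```

Both rejected inserts fail on referential integrity, and P1 and P2 remain persons with no parentage row at all.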

> I'd recommend forgetting the database structuring when initially
> thinking about the problem anyway and focus on the propositions you are
> trying to model - at some point you will have _at least_ two statements
> about people with no parents, indicating you have a proposition that
> does not fit into the 'parentage structuring', and hence has no place
> being there. This in turn indicates one should have a separate person
> relation.

Agreed.

> Exists:P1 with sex:F
> Exists:P2 with sex:M
> Exists:P3 with sex:M with Mother:P1 and Father:P2
> Exists:P4 with sex:F with Mother:P1 and Father:P2
> etc...
>
> Clearly there are two types of propositions here.

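It is clear once you think about the closed world assumption. To the naive, the claim that every person has a mother and father seems correct, but under the closed world assumption it forces every parent to be recorded in the database as well.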

What do you think of the idea of favouring direct representation of events in an RDB? My impression is that this tends to lead to normalised designs that deal properly with the closed world assumption, avoid nulls, keep updates simple, and make it easy to think about strong integrity constraints.

A database that a company uses in its day-to-day operation had a beginning and is updated as events happen, so storing the events directly seems very natural. The closed world assumption then relates, at least in part, to a single interval on the time axis.
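
As a rough sketch of the birth event relation keyed on P, again assuming SQLite via Python's sqlite3 module (the place names and dates are made-up sample data):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # FK enforcement is off by default in SQLite
con.executescript("""
CREATE TABLE person (p TEXT NOT NULL PRIMARY KEY);
-- birth(L,T,P,M,F): person P was born to M,F at location L at time T
CREATE TABLE birth (
    l TEXT NOT NULL,
    t TEXT NOT NULL,
    p TEXT NOT NULL PRIMARY KEY REFERENCES person (p),  -- born at most once
    m TEXT NOT NULL REFERENCES person (p),
    f TEXT NOT NULL REFERENCES person (p)
);
""")
con.executemany("INSERT INTO person VALUES (?)", [("P1",), ("P2",), ("P3",)])

def record_birth(l, t, p, m, f):
    """Return True if the birth event is accepted, False if a constraint rejects it."""
    try:
        con.execute("INSERT INTO birth VALUES (?, ?, ?, ?, ?)", (l, t, p, m, f))
        return True
    except sqlite3.IntegrityError:
        return False

first_birth = record_birth("Perth", "1980-05-01", "P3", "P1", "P2")
second_birth = record_birth("Sydney", "1981-01-01", "P3", "P1", "P2")  # same P again

# Closed world reading: these persons have no birth event recorded in the DB.
# That is a statement about the DB, not a null standing in for a parent.
unrecorded = [row[0] for row in con.execute(
    "SELECT p FROM person WHERE p NOT IN (SELECT p FROM birth) ORDER BY p")]
```

The key on P gives at most one mother, father, birthplace and birth time per person, while P1 and P2 exist as persons without any birth row.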

> > Another solution is to drop the enforced referential integrity
> > constraint. However it seems rather suspicious to pretend that some
> > parent is not a person (in the DB) even though they are mentioned in
> > the DB.
> >
> > A third solution is to regard the above person relation as bad because
> > it is at odds with the closed world assumption. Instead, it is better
> > to limit a person relation to something like
> >
> > person(P) :- P is a person
> >
> > and use other relations to represent the family tree, such as
> >
> > mother(M,C) :- M is the mother of child C
> > father(F,C) :- F is the father of child C
> >
> > Note that as it stands we have quite weak integrity constraints because
> > a person may have any number of mothers and fathers.
> >
> > Alternatively we could represent birth events
> >
> > birth(L,T,P,M,F) :- Person P was born to M,F at location L at time T
> >
> > This could be keyed on attribute P, ensuring that each person can be
> > born at most once and therefore have at most one mother, one father,
> > one birthplace and one age.
> >
> > Interestingly (and IMO not surprisingly), representing the underlying
> > events that occur in space and time offers a good trade-off in terms of
> > integrity constraints, and fits in well with the closed world
> > assumption.
> >
> > Another perspective on this: relations represent facts not entities.
> > The whole idea of RM is to represent information *about* entities using
> > predicates. The idea that a record in a table represents an object
> > has more to do with the OO approach.
> >
> > Consider that we store marriage information in a person relation
> >
> > Person(P,S) :- Person P has spouse S.
> >
> > Clearly the spouse attribute would need to be nullable. It would be
> > better to store marriages in a separate relation, such as
> >
> > married(P1,P2) :- P1 and P2 are married
> >
> > But people can get married and divorced multiple times. To represent
> > this it may be better to store the information using events. Eg
> >
> > wedding(L,T,P1,P2) :- P1 married P2 at location L at time T.
> >
> > There are some nice features about using events for relational models.
> >
> > 1. Events are immutable.
> > 2. Events give us history.
> > 3. The relationships between entities can vary over time.
> > 4. Events occur in space and time and therefore align well with some
> > form of closed world assumption that is localized in space/time.
> >
> > However there is an increased computational burden if the current set
> > of relationships has to be calculated from the events. This is a
> > caching issue, such as when a bank caches an account balance.
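
To illustrate that last point, here is a minimal sketch of deriving the current marriages from events and caching the result, assuming SQLite via Python's sqlite3 module. The divorce relation, the married view and the married_cache table are hypothetical names introduced here, and the sketch assumes each couple is always stored in the same (p1, p2) order with ISO-8601 date strings.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- wedding(L,T,P1,P2): P1 married P2 at location L at time T
CREATE TABLE wedding (l TEXT NOT NULL, t TEXT NOT NULL,
                      p1 TEXT NOT NULL, p2 TEXT NOT NULL,
                      PRIMARY KEY (t, p1, p2));
-- Hypothetical counterpart event, not defined in the original post.
CREATE TABLE divorce (t TEXT NOT NULL, p1 TEXT NOT NULL, p2 TEXT NOT NULL,
                      PRIMARY KEY (t, p1, p2));
-- Current state derived from events: a wedding with no later divorce.
CREATE VIEW married AS
SELECT w.p1, w.p2, w.t AS since
FROM wedding AS w
WHERE NOT EXISTS (SELECT 1 FROM divorce AS d
                  WHERE d.p1 = w.p1 AND d.p2 = w.p2 AND d.t > w.t);
""")

# Made-up sample events.
con.executemany("INSERT INTO wedding VALUES (?, ?, ?, ?)", [
    ("Perth",  "1990-01-01", "A", "B"),
    ("Sydney", "2000-01-01", "A", "B"),   # A and B remarry
    ("Perth",  "1992-06-01", "C", "D"),
])
con.executemany("INSERT INTO divorce VALUES (?, ?, ?)", [
    ("1995-01-01", "A", "B"),
    ("1999-01-01", "C", "D"),
])

def refresh_cache():
    """Recompute the cached current marriages from the immutable events."""
    con.execute("DROP TABLE IF EXISTS married_cache")
    con.execute("CREATE TABLE married_cache AS SELECT * FROM married")

def cached_marriages():
    return con.execute(
        "SELECT p1, p2, since FROM married_cache ORDER BY p1, p2").fetchall()

refresh_cache()
current = cached_marriages()

# A new event only appends a row; the cache is then brought up to date.
con.execute("INSERT INTO divorce VALUES ('2005-01-01', 'A', 'B')")
refresh_cache()
after = cached_marriages()
```

The events are never updated in place. Only the cache is recomputed, much as a bank would recompute a cached balance from its transactions.
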
Received on Tue Jan 09 2007 - 00:15:00 CET
