Re: Temporal database - no end date

From: <kvnkrkptrck_at_gmail.com>
Date: 19 Jan 2007 16:18:07 -0800
Message-ID: <1169252287.769277.7770_at_l53g2000cwa.googlegroups.com>


Bob Badour wrote:
> Kev wrote:
> > Thanks for taking the time to reply Bob.
> >
> > For everyones benefit I assume you are talking about the book
> >
> > Temporal Data & the Relational Model,
> > First Edition, C.J. Date, Hugh Darwen, Nikos Lorentzos
> > (Morgan Kaufmann, 2002);
> > ISBN 1-55860-855-9
> >
> > I will take a look.
> >
> >
> > Does anyone know if I might find formal answer to my original question
> > in this book, or anywhere even?
> >
> > Thanks all.
> >
> > Bob Badour wrote:
> >
> >
> >>If you are satisfied with SQL kludges, then I suppose Snodgrass is okay.
> >>If you are interested in something more robust, I suggest you google
> >>Lorentzos and check out the Date, Darwen, Lorentzos material.
>
> A number of ways exist to model the problem of open intervals. In the
> Date/Darwen/Lorentzos book, they opt to make all intervals inclusive. To
> model an open interval in that system, one would use the largest
> representable value as the end of the interval, which for practical
> purposes is the same as the other methods.
>
> One might choose to separate the starts and the ends of intervals into
> separate relations. In that model, a start without a subsequent end or
> an end without a prior start would indicate open intervals. However, one
> immediately encounters a number of problems. For example, one will
> generally have to resort to a lot of special cases to make use of the
> open intervals, and one will find it difficult--if not impossible--to
> express many common constraints.

I'm a bit surprised that the standard "missing information" response isn't used here: namely, that the mere existence of "missing information" is indicative of a poor database design which allows one to record invalid propostions. Why not just avoid the problem altogether by realizing that the date range mess arises because two different propositions are being forced into a single relation:

Permanent_Membership (PM) is a proposition: "PERSON is a member at any time after BEGIN_DATE."
Expiring_Membership (EM) is a proposition: "PERSON is a member at any time between BEGIN_DATE and END_DATE".

In most modern DBMS's, this would certainly be tricky to implement (like many "missing information" solutions). PM might have a simple PK on PERSON; EM might enforce an interval-based PK on PERSON over [BEGIN_DATE, END_DATE] using functional indexes (or does ANSI SQL allow this directly?). To enforce the cross-table constraint that PM.BEFORE_DATE must be greater than EM.END_DATE for all PERSON in both PM and EM, triggers on both tables would be needed. But in theory, I can't see any reason a DBMS couldn't allow for declaring a distributed key over potentially open time intervals.

The benefit of such design is similar to that of most fixes of the missing information "problem": avoiding the recording of dubious propositions like: "There is a membership held by Bob from December 1, 2005 to December 31, 9999" or non-sensical propositions like "There is a membership held by Bob from December 1, 2005 to <NULL>." It would also avoid projections of dubious propositions: "The membership held by Bob will expire in 7994 years", which could lead to even more bizarre statments like "The average length of membership at our gym is 7331 years".

However, as D/D/L seemed to go in a different direction, I'm wondering if maybe I left something out of the equation... Received on Sat Jan 20 2007 - 01:18:07 CET

Original text of this message