Re: Temporal database - no end date

From: -CELKO- <jcelko212_at_earthlink.net>
Date: 16 Jan 2007 08:06:22 -0800
Message-ID: <1168963578.217570.311120_at_v45g2000cwv.googlegroups.com>


>> I have just discovered, thanks to this group, the book "Developing Time-Oriented Database Applications in SQL" by Richard T. Snodgrass which certainly seems to be an excellent reference on the subject of temporal databases. (A pdf can be downloaded for free from http://www.cs.arizona.edu/people/rts/publications.html). <,

Temporal databases are Rick's life work. He introduced time into the ANSI/ISO Standards while he was on ANSI X3H2. That is one is his "popular science and SQL code" code book; he also has decades of axademic papers to his credit and the TSQL project.

>> However, I am failing to grasp something. In the book all histories have two dates. A from date and an end date. <<

Of course. Time is a continuum and has to be modeld as half-open intervals. Remember Zeno from your freshman philosophy course? His paradoxes were the result of Greek mathematics not having a continuum and just barely having infinite-as-a-process instead of as infinite-as-a-completed-whole (Cantor and theory). Later Einstein made durstion over time the "fourth dimension" and shaped Western thought.

>> Lets say that an entity, a gym membership record always has a status, be it active, dormant cancelled, banned etc. I would like to keep a history of the status. .. Following the Snodgrass examples I would have a table like this (sorry, I just has to re-write it to comform to ISO conventions

 member_status from_date to_date


 'Active '            2007-01-01    2007-02-01
'Banned'            2007-02-01    2007-03-01
 'Active'              2007-03-01   2007-01-08


>> Given that I must always have a status, why store the "To Date". <<

 It is the proper model of time and puts a **complete fact* in one row

>> I have a simpler table, it is clear that one status stops when the other starts.<<

Then your fact is split across two rows. You have to do a self-join to discover that Mr. X was active in January 2007. You have split duration information. As another example of atomic data that is in two scalar columns, consider (longitude,latitude) pairs -- they have meaning only together; same thing with the (lft, rgt) pairs in the nested sets model, or (x,y,z) co-ords in 3D space.

>> I am really more interested in the best method in terms of SQL queries for reporting .. <<

You can use a simple calendar table and BETWEEN predicates with the proper model. There is also an ANSI Standard OVERLAPS predicate defined on the (start, finish) model. You can use a NULL for "eterntiy" when you do not know the end date and COALESCE(end_date, CURRENT_TIMESTMAMP) to construct a complete fact in your queries.

I will be covering some of this in my next book (shameless plug<g>). Received on Tue Jan 16 2007 - 17:06:22 CET

Original text of this message