Re: modelling history in a database

From: Paul Vernon <paul.vernon_at_ukk.ibmm.comm>
Date: Sat, 15 Feb 2003 14:01:23 -0000
Message-ID: <b2lhfd$1130$>

"Alan Gutierrez" <> wrote in message news:GVh3a.10623$
> Paul Vernon wrote:
> > "stu" <> wrote in message
> > news:b2ikjv$4rt$1$
> >>I actually did a simple version of this a while back. archive all
> >>fields to csv format and store in a memo field:
> >>name, age, height, weight,....
> > Come on. This is mean to be a theory newsgroup. You guys are just not
> > trying.
> I'm trying very hard in my application. I am tracking time series data
> through history tables and through normalization. Here I'll discuss the
> histoty tables in their latest incarnation:

OK, I probably should apologise for my rebuke. It was not targeted at all the contributors to this thread. ;-)

I will be interested to know if you decide to change your below design after reading Snodgrass. Here are my thoughts

transaction_id should logically be a timestamp because a DBMS should enforce transaction serialisation and therefore the rule that 'no two atomic actions can occur at the same timepoint'. Your DMBS might force you to use a globally assending transaction id however.

A deleted flag is not as good as recording the time_period that the row was true. Amongst other things, this allows you to model gaps in the history of some object, saves you doing a 'MAX() less than' and generally makes quering history accrros tables much easier. In your design, you could consider repacing your deletion flag, with the transaction id of the update or deletion that made that row become history, but it would be better to use timestamps.

As well as that paper of Hugh Darwen's I mentioned, you might want to glance at the 'An Overview and Analysis of TSQL2' paper on the same site after you have read some of Snodgrass, at the least it will correct the missleading impresson that TSQL2 is a way forward, that the book implies.

Also, I've toyed in the past with the correctness of having a 'transaction' entity similar to yours. In particular as the correct place to hold who deleted rows from the 'current' database. With the time-period method of storing history, a 'last inseted/updated by user' column does not cut the mustard. It's a bit disapointing that neither Snodgrass nor Date, Darwen, Lorentzos mention this idea.

> The DDL history tables assoicated triggers and past views are all
> genrated by a Perl script. Becasue it is so easy to create history
> tables, I maintain them for all the tables in my application.

Top banana. I do the same (albeit with SQL scripts). Personally it is a pity that none of the database design tools that I know (such as ERwin) directly support 'temporal modelling'. But heck, then neither do any of the RDMBS systems out there...

Paul Vernon
Business Intelligence, IBM Global Services Received on Sat Feb 15 2003 - 15:01:23 CET

Original text of this message