Re: temporal data constraint

From: Brian Selzer <brian_at_selzer-software.com>
Date: Wed, 25 Apr 2007 19:14:32 -0400
Message-ID: <shRXh.351$tp5.284_at_newssvr23.news.prodigy.net>


<davis.jeffrey_at_gmail.com> wrote in message news:1177523422.971585.162180_at_r35g2000prh.googlegroups.com...
> Let's say I have two relvars like:
>
> RELVAR R { A INTEGER, B INTEGER } KEY { A }
> RELVAR S { A INTEGER, C INTEGER } KEY { A }
> CONSTRAINT ( JOIN { R, S } { A } = S { A } )
>
> That is, every tuple in S has a corresponding tuple in R but the
> reverse is not necessarily true. In SQL, one might represent this by a
> single SQL table with (A, B, C) such that C is NULLable (I'm not
> suggesting that one should do this).
>
> Now, I'd like to make the relation temporal, in the way that CJ Date
> suggests in "Temporal Data and the Relational Model".
>
> It's already in 6NF. I can create "since" and "during" relations like:
>
> RELVAR R_SINCE { SINCE TIMESTAMP, A INTEGER, B INTEGER } KEY { A }
> RELVAR R_DURING { DURING INTERVAL, A INTEGER, B INTEGER } KEY { A,
> DURING }
> RELVAR S_SINCE { SINCE TIMESTAMP, A INTEGER, C INTEGER } KEY { A }
> RELVAR S_DURING { DURING INTERVAL, A INTEGER, C INTEGER } KEY { A,
> DURING }
>
> That seems like a great solution, but the problem is constraints.
> Every A must have a corresponding B at all times that A itself exists,
> but A may or may not have a C at any given time. I'd like some simple
> constraints to illustrate that in a temporal database the same way I
> represented it in the non-temporal database.
>
> More specifically:
> (1) The "during" intervals for a given A in R_DURING must form a
> continuous, non-overlapping interval ending at the "since" timestamp
> for that A in R_SINCE.
> (2) The "during" intervals for a given A in S_DURING must be non-
> overlapping, and the latest ending time of the interval must be less
> than or equal to the "since" timestamp for that A in S_SINCE, if it
> exists.
>
> These two constraints, if I'm correct, ensure that no "snapshot" in
> history (or the present) that recreates R and S ever violates the
> constraints of those relations.
>
> First of all, am I correct that those two constraints satisfy the
> requirements?

Sort of. The constraints are incorrect because it is possible for there to have been an interval during which there was no A for either R or S.

>
> Second, what's an effective way to represent these constraints in a
> readable way? Everything I come up with involves using some kind of
> MAX aggregate, but I don't know what the maximum interval ending time
> is in the case that R_DURING or S_DURING is empty.
>
> Also, how does one effectively query historical data when the key (in
> this case A) changes over time?
>

I'm probably going to be hammered by others on this group for this, but I don't think it's practical without introducing some form of invariant attribute or set of attributes into the schema. One option would be to use a surrogate to identify the entity. Another would be to extend the schema so that the original key value and the time that it came into existence could be recorded. These attributes would theoretically survive any transformation, and since no two entities could have the same key value at the same point in time, they could be used to identify the entity throughout its lifetime.

> This is an open-ended question, any guidance is appreciated.
>
Received on Thu Apr 26 2007 - 01:14:32 CEST

Original text of this message