Re: How to model disjoint date ranges?

From: --CELKO-- <71062.1056_at_compuserve.com>
Date: 23 Jul 2002 12:44:20 -0700
Message-ID: <c0d87ec0.0207231144.f1a195b_at_posting.google.com>


>> I've been discussing this with him over e-mail, and one of the points is that this depends on whether the intervals allow gaps or overlaps. If gaps or overlaps are allowed, then storing start_date & end_date is right. But if there
are no gaps or overlaps, then storing start_date AND end_date means the same fact is being stored multiple times in your data. "This" interval ends when the next interval begins, and the begin date of the next interval is already
stored. <<

The complete fact is the duration, not just the start of it. The begin date for the "next" (whatever that means in a set oriented model) interval is part of that fact. Consider this:

CREATE TABLE Events
(event CHAR(5) INTEGER NOT NULL PRIMARY KEY,
 start_date TIMESTAMP NOT NULL);

INSERT INTO Events
('a', '2002-01-01'),
('b', '2002-01-10'),
('c', '2002-01-20');

Which I would show as

CREATE TABLE Events
(event CHAR(5) INTEGER NOT NULL PRIMARY KEY,
 start_date TIMESTAMP NOT NULL,
 end_date TIMESTAMP,
 CHECK (<no gaps>),
 CHECK (<no overlaps>));

INSERT INTO Events
('a', '2002-01-01', '2002-01-09'),
('b', '2002-01-10', '2002-01-19'),
('c', '2002-01-20', NULL);

Since your data model is wrong, you have no way of adding the two constraints.

Now do

DELETE FROM Events
WHERE event = 'b';

Your model will, in effect, do this to the table:

('a', '2002-01-01', '2002-01-19'), <== changed!
('c', '2002-01-20', NULL);

Where did that rule come from? Likewise, if I wanted to insert a new event, 'b1', on '2002-01-17', I would, in effect, have:

('a', '2002-01-01', '2002-01-16'), <== changed!
('b', '2002-01-17', '2002-01-19'),
('c', '2002-01-20', NULL);

I think that demonstates anomalies.

>> We ended up with sentinel dates rather than nulls, so comparison operators will continue to work with them, which is not true of nulls. <<

Use COALESCE() with NULL and you are fine. The sentinel dates are a mess; which one do you use? The particular max date for your particular product? A lot of products blow up on '999-12-31', the highest date allowed in ISO-8601. Then you have to embed that sentinel date everywhere in your code -- a bad programming practice.

We put NULLs in SQL for just this reason; use them.

SQL-92 has an OVERLAPS predicate for testing durations, so that one is easy write.

CONSTRAINT no_gaps
CHECK (NOT EXISTS

       (SELECT *
          FROM Events AS E1
         GROUP BY E1.event
        HAVING (MIN(start_date) - MAX(end_date)) +1 
               > SUM(start-date - end_date +1)))
Received on Tue Jul 23 2002 - 21:44:20 CEST

Original text of this message