Re: How to model disjoint date ranges?

From: --CELKO-- <71062.1056_at_compuserve.com>
Date: 23 Jul 2002 12:12:01 -0700
Message-ID: <c0d87ec0.0207231112.1cff92bb_at_posting.google.com>


>> I've been discussing this with him over e-mail, and one of the points is that this depends on whether the intervals allow gaps or overlaps. If gaps or overlaps are allowed, then storing start_date & end_date is right. But if there
are no gaps or overlaps, then storing start_date AND end_date means the same fact is being stored multiple times in your data. "This" interval ends when the next interval begins, and the begin date of the next interval is already
stored. <<

I disagree. The duration is part of an atomic fact; you are trying to find your end point in another row, not in the row where it belongs.

CREATE TABLE Events
(event_name CHAR(5) NOT NULL PRIMARY KEY,
 start_date DATE NOT NULL);

INSERT INTO Events VALUES
('a', '2002-01-01'),
('b', '2002-01-10'),
('c', '2002-01-20');

Which I would show as:

CREATE TABLE Events
(event_name CHAR(5) NOT NULL PRIMARY KEY,
 start_date DATE NOT NULL,
 end_date DATE);

INSERT INTO Events VALUES
('a', '2002-01-01', '2002-01-09'),
('b', '2002-01-10', '2002-01-19'),
('c', '2002-01-20', NULL);

Now, let's do this:

DELETE FROM Events
WHERE event = 'b';

You have a deletion anomaly that creates the falsehood that we had a business rule to close up the gaps:

('a', '2002-01-01', '2002-01-19'),
('c', '2002-01-20', NULL);

>> > We ended up with sentinel dates rather than nulls, so comparison operators will continue to work with them, which is not true of nulls. <<

Sentinel dates do not work! Not every SQL engine takes '9999-12-31' so the code blows up when you port it. It is also a lie; that sentinel dates is a real date and you will be checking for all over your code,unable to tell true from sentinel dates. NULLs were designed for missing data, so use them. I wish we had an ETERNITY token in SQL, but it is too late for that.

Oh, minor point, ISO-8601 is the ONLT temporal format allowed in Standard SQL and string are enclosed in single quotes, not double quotes.

>> But with this schema, it's easy to generate data anomalies: <<

If you have a business rule that you do not allow gaps, then enforce it with a CHECK() constraint or a CREATE ASSERTION statement.

 CHECK(

       NOT EXISTS
       (SELECT *
          FROM Events AS E1
         GROUP BY event
        HAVING MAX(end_date) - MIN(start_date)
               > SUM(end_date - start_date +1)))

(Pardon that I did not do the full SQL-92 syntax for the date math)
Likewise you can do one for preventing overlaps

 CHECK(

       NOT EXISTS
       (SELECT *
          FROM Events AS E1, Events AS E2
         WHERE E1.event = E2.event
           AND E1.start_date BETWEEN E2.start_date AND E2.end_date))

I think I got those right, but I did not check my work. Received on Tue Jul 23 2002 - 21:12:01 CEST

Original text of this message