Re: SQL competition: the shortest interval coalesce/pack query

From: --CELKO-- <jcelko212_at_earthlink.net>
Date: 4 Dec 2004 16:24:01 -0800
Message-ID: <18c7b3c2.0412041624.39447636_at_posting.google.com>


>> Write the shortest sql query that would return a minimal cover of a
set of intervals. <<

CREATE TABLE Intervals
(x INTEGER NOT NULL,
 y INTEGER NOT NULL,
CHECK (x <= y),
PRIMARY KEY (x,y));

DELETE FROM Intervals;

INSERT INTO Intervals VALUES (1,3);
INSERT INTO Intervals VALUES (2,5);
INSERT INTO Intervals VALUES (4,11);
INSERT INTO Intervals VALUES (10,12);

INSERT INTO Intervals VALUES (20,21);

INSERT INTO Intervals VALUES (120,130);
INSERT INTO Intervals VALUES (120,128);
INSERT INTO Intervals VALUES (120,122);
INSERT INTO Intervals VALUES (121,132);
INSERT INTO Intervals VALUES (121,122);
INSERT INTO Intervals VALUES (121,124);
INSERT INTO Intervals VALUES (121,123);
INSERT INTO Intervals VALUES (126,127);

Try this approach. Assume we have the usual Sequence auxiliary table.  Now we find all the holes in the range of the intervals and put them in a VIEW or a WITH clause derived table.

CREATE VIEW Holes (hole)
AS
SELECT seq
  FROM Sequence
 WHERE seq <= (SELECT MAX(y) FROM Intervals)    AND NOT EXISTS

       (SELECT *
          FROM Intervals
         WHERE seq BETWEEN x AND y)

UNION VALUES(0)
UNION (SELECT MAX(y) + 1 FROM Intervals);

The query picks start and end pairs that are on the edge of a hole and counts the number of holes inside that range. Covering has no holes inside its range.

SELECT Starts.x, Ends.y
  FROM Intervals AS Starts,

       Intervals AS Ends,
       Sequence AS S  -- usual auxiliary table
 WHERE S.seq BETWEEN Starts.x AND Ends.y      -- restrict seq numbers
   AND S.seq < (SELECT MAX(hole) FROM Holes)
   AND S.seq NOT IN (SELECT hole FROM Holes)    -- not a hole
   AND Starts.x - 1 IN (SELECT hole FROM Holes) -- on a left cusp 
   AND Ends.y + 1 IN (SELECT hole FROM Holes)   -- on a right cusp 
 GROUP BY Starts.x, Ends.y
HAVING COUNT(DISTINCT seq) = Ends.y - Starts.x + 1 -- no holes in range
-- ORDER BY Starts.x, Ends.y

>> To be able to compare results, lets agree that each of the clauses
...<<

I get 18 lines if I fold the VIEW into a WITH clause, but otherwise at 9 lines, or do we we add count for the IN(<subquery>) predicates? They are un-correlated, after all.

>> If anybody knows alternative (maybe even standard) SQL formatting
rules, you are welcome to suggest those. <<

Funny you should mention this. I have a book with the working title SQL PROGRAMMING STYLE that has a bunch of rules taken from ISO-11179 and readability studies. It is in a tech read right now, but I hope it will be out before mid-2005. Received on Sun Dec 05 2004 - 01:24:01 CET

Original text of this message