Re: SQL competition: the shortest interval coalesce/pack query
Date: Fri, 03 Dec 2004 16:04:14 GMT
Message-ID: <280sd.51091$Vk6.17422_at_twister.nyc.rr.com>
"Mikito Harakiri" <mikharakiri_at_iahu.com> wrote in message news:2uPrd.61$G45.36_at_news.oracle.com...
> Write the shortest sql query that would return a minimal cover of a set of
> intervals. For example, given
>
> Intervals = {(x=1,y=3), (x=2,y=5), (x=4, y=11), (x=10,y=12), (x=20,y=21)}
>
> it should return
>
> cover(Intervals) = {(x=1,y=12), (x=20,y=21)}
>
> "Developing Time-Oriented..." book by R.Snordgrass demonstrates several
> rather lenthy solutions (pp.160+). The book is freely downloadable.
>
> To be able to compare results, lets agree that each of the clauses (select,
> from, where, group by, having) starts a new line. (Each subquery/inner view
> must obey this rule as well.) Also, no more than 2 column expressions are
> allowed to fit on a single line in the select clause. Likewise, no more than
> 2 view/tables are allowed on a single line within from clause, no more than
> 2 predicates in the each line of the where clause, etc. If anybody knows
> alternative (maybe even standard) sql formatting rules, you are welcome to
> suggest those.
CREATE TABLE Intervals
(
begin_interval INT NOT NULL,
end_interval INT NOT NULL,
CHECK (end_interval > begin_interval),
PRIMARY KEY (begin_interval, end_interval)
);
INSERT INTO Intervals (begin_interval, end_interval)
VALUES (1,3);
INSERT INTO Intervals (begin_interval, end_interval)
VALUES (2,5);
INSERT INTO Intervals (begin_interval, end_interval)
VALUES (4,11);
INSERT INTO Intervals (begin_interval, end_interval)
VALUES (10,12);
INSERT INTO Intervals (begin_interval, end_interval)
VALUES (20,21);
WITH Gaps (begin_gap, end_gap)
AS (SELECT MAX(I1.end_interval), I2.begin_interval
FROM Intervals AS I1
INNER JOIN
Intervals AS I2
ON I1.begin_interval < I2.begin_interval
GROUP BY I2.begin_interval
HAVING MAX(I1.end_interval) < I2.begin_interval)
SELECT MIN(I.begin_interval) AS begin_interval,
COALESCE(I.begin_gap, MAX(I.end_interval)) AS end_interval
FROM (SELECT I.begin_interval,
I.end_interval,
MIN(G.begin_gap) AS begin_gap
FROM Intervals AS I
LEFT OUTER JOIN
Gaps AS G
ON I.end_interval <= G.begin_gap
GROUP BY I.begin_interval, I.end_interval) AS I
GROUP BY I.begin_gap
ORDER BY begin_interval;
Executing on DB2:
begin_interval end_interval
1 12 20 21
The idea is to first find any gaps in the interval set. That is, if we have intervals (a1, a2) and (b1, b2) then (a2, b1) is a gap if there doesn't exist an interval (c1, c2) where c1 < b1 and c2 > a2. We than associate each interval with its closest succeeding gap. So each gap will be joined with all intervals that immediately precede it with no intervening gaps. Of course, an interval may have no succeeding gap so we associate those intervals with NULL (hence the LEFT OUTER JOIN). Then for each gap, including the special NULL gap, we find the minimum and maximum endpoints for the set of associated intervals.
For those who don't use a flavor of SQL that implements the WITH clause from Standard SQL, either define a view or simply expand the body of Gaps in the main query. It's only referenced once and was defined this way for readability.
-- JAGReceived on Fri Dec 03 2004 - 17:04:14 CET
