Re: Analytics question maybe - combine over lapping date rows

From: <jefferstone_at_gmail.com>
Date: Fri, 16 Nov 2012 07:28:13 -0800 (PST)
Message-ID: <2c6f080e-f553-471e-9490-6a934f0fcbba_at_googlegroups.com>



The solution by Maxim does not handle collapsing date ranges such as:
11-01-2012 11-09-2012
11-02-2012 11-03-2012
11-04-2012 11-05-2012
11-08-2012 11-15-2012

This should consolidate down to a single date range: 11-01-2012 11-15-2012, but the third and fourth ranges will be detected as an interval change and you end up with 3 ranges. After much googling and experimenting, I finally found a solution posted here http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/b5dd3d09-7299-4c5c-8e9d-5f06b93a9cbd, which I modified to handle null stop dates and tested on large sets of date ranges. Here is the code:

SELECT s1.primary_key,

       s1.start_datetime,
       MIN(t1.stop_datetime) AS stop_datetime
  FROM your_table s1
  JOIN your_table t1
    ON s1.primary_key = t1.primary_key
   AND (s1.start_datetime <= t1.stop_datetime -- s1 starts before t1 ends
        OR t1.stop_datetime IS NULL)
   AND NOT EXISTS         -- filter out all but the t1's with max stop time

(SELECT NULL
FROM your_table t2 WHERE t1.primary_key = t2.primary_key -- t1 stop time falls within t2 (in which case filter out t1) AND t1.stop_datetime >= t2.start_datetime AND (t1.stop_datetime < t2.stop_datetime OR t2.stop_datetime IS NULL)
    )
 WHERE NOT EXISTS
(SELECT NULL
       FROM your_table s2
      WHERE s1.primary_key = s2.primary_key
        -- s1 start time falls within s2 (in which case filter out s1)
        AND s1.start_datetime > s2.start_datetime
        AND (s1.start_datetime <= s2.stop_datetime
             OR s2.stop_datetime IS NULL)
    )
 GROUP BY s1.primary_key, s1.start_datetime  ORDER BY s1.primary_key, s1.start_datetime ; Received on Fri Nov 16 2012 - 16:28:13 CET

Original text of this message