Re: Determining missing dates from a table (A date logic prob)

From: L. Carl Pedersen <l.carl.pedersen_at_dartmouth.edu>
Date: Wed, 22 Jun 1994 17:50:04 -0500
Message-ID: <l.carl.pedersen-220694175004_at_kip-sn-177.dartmouth.edu>


In article <CrJx1A.K2D_at_gremlin.nrtc.northrop.com>, msallwas_at_world.nad.northrop.com (Michael Sallwasser) wrote:

> In article <2tkk1v$mv_at_Venus.mcs.com> guptar_at_MCS.COM (Rajeev Gupta) writes:
> >Ok the problem is as follows :
> >
> >I have a table in Oracle7 (DAILY_SALES) which contains a row for everyday
> >with a date column (say sales_date).
> >
> >As an exception, some rows are missing for certain days [let's assume
> >aproax. 10 rows are missing from a reference date (say Nov. 1st 93) to
> >till date.]
> >
> >It is required to find out those dates which are missing from table.
> >
> >Thanks in Advance.
>
> ********************
> Sorry, I do not accept thanks in advance. If this is helpful I would
> appreciate hearing from you via e-mail.
> *******************
>
> (I assume you want to test for holes. Obviously one cannot store all
> dates, unless you subscribe to the notion that time is finite.)
>
> I will provide the logic and leave the syntax to you.
> Write a P/L*SQL command that:
>
> Selects the minimum and maximum dates from your table and stores them in
> variables.
>
> Set up a loop that cycles from the earliest to the
> latest date searching for missing date in your table.
>
> Write the exception to an exception table (that you create previously).
>
> Write a SQL command that:
>
> Reports the exceptions.
>
> ******
> Alternately, you could insert the missing dates in your table, instead of
> writing them to an exception table, if this is your actual goal.
>
> Good luck.
>
> If you have question about how to write P/L*SQL you will have
> to re-post those questions.
> --
> ============================================================================
> Michael Sallwasser | Down one path is utter dispair and hopelessness. Down
> Northrop Grumman | the other is total destruction. Let us choose wisely.
> ============================================================================

didn't see the original posting, but i once solved a similar problem.

i am going to make the very simplifying assumption that the first and last dates that should be in the table are *not* missing. in that case, this works:

 I2> create view low as

  2  select sales_date+1 sales_date from daily_sales minus
  3  select sales_date              from daily_sales minus
  4  select max(sales_date)+1       from daily_sales;

View created.

 I2> create view high as

  2  select sales_date-1 sales_date from daily_sales minus
  3  select sales_date              from daily_sales minus
  4  select min(sales_date)-1       from daily_sales;

View created.

 I2> select low.sales_date low, high.sales_date high   2 from low, high
  3 where high.sales_date =

  4          ( select min(sales_date)-1 from daily_sales
  5             where sales_date > low.sales_date )
  6 order by 1,2;

LOW HIGH

--------- ---------
08-JAN-95 08-JAN-95
18-APR-95 18-APR-95
27-JUL-95 27-JUL-95
04-NOV-95 14-DEC-95
12-FEB-96 12-FEB-96
22-MAY-96 22-MAY-96
30-AUG-96 30-AUG-96
08-DEC-96 08-DEC-96
18-MAR-97 18-MAR-97
26-JUN-97 26-JUN-97
04-OCT-97 04-OCT-97
12-JAN-98 12-JAN-98
22-APR-98 22-APR-98
31-JUL-98 31-JUL-98
08-NOV-98 08-NOV-98
16-FEB-99 16-FEB-99
27-MAY-99 27-MAY-99
04-SEP-99 04-SEP-99

18 rows selected.

i built some test data for which the above is the correct set of missing dates.

note that the two views will each return results much faster than the join, so you are in luck if all you need to know is the first date in a series of missing dates. i wasn't sure exactly what was wanted, so i gave something i thought would be usable in any situation.

if the first or last date might be missing, you can fudge them in with some additional parts in the views. send me email with your exact needs if you need help with that.

there might well be a faster way to get both ends of the missing range together in the same table. it'd be nice if there was, because this method is fairly slow. using temporary tables with indexes is almost certainly a lot faster, but temporary tables can be a pain. in your case, it might make sense.

if the table is large, you might want to put the where clauses for the range inside the views. don't get your hopes up, this will never be real fast.

also note that this is *not* a "date logic problem". the problem and the solution are identical for any data-type for which "next-element" and "previous-element" functions exist.

if anyone thinks they have a much faster way to do this non-procedurally without using temporary tables, please post or send me email! Received on Thu Jun 23 1994 - 00:50:04 CEST

Original text of this message