Re: How to query for total days?

From: Andy Triggs <andrew.triggs_at_businessobjects.com>
Date: 17 Oct 2002 05:15:53 -0700
Message-ID: <2b6e86d0.0210170415.65bdd89e@posting.google.com>

The following seems to work and should run on 8i and upwards:

DROP TABLE z;

CREATE TABLE z (member_id NUMBER, eff_dt DATE, term_dt DATE);

INSERT INTO z VALUES (1, '01-SEP-2002', '10-SEP-2002');
INSERT INTO z VALUES (1, '15-SEP-2002', '20-SEP-2002');
INSERT INTO z VALUES (2, '01-SEP-2002', '20-SEP-2002');
INSERT INTO z VALUES (2, '15-SEP-2002', '25-SEP-2002');

SELECT member_id,
       SUM(cov) as coverage

FROM
(

SELECT member_id,

         t,
         dt,
         CASE
           WHEN t = 'E'
           THEN dt - FIRST_VALUE(dt) OVER (ORDER BY member_id, dt ROWS
1 PRECEDING) + 1
           ELSE 0
         END as cov

  FROM
  (
    SELECT member_id,

           t,
           dt,
           sum(CASE WHEN t = 'S' THEN 1 ELSE -1 END) OVER (PARTITION

BY member_id ORDER BY dt) as p

FROM
(

      SELECT member_id,
             'S' as t,
             GREATEST(eff_dt, to_date('01-SEP-2002', 'DD-MON-YYYY'))
as dt
             FROM   z
             WHERE  term_dt BETWEEN to_date('01-SEP-2002',
'DD-MON-YYYY') AND to_date('30-SEP-2002', 'DD-MON-YYYY')
             UNION ALL
             SELECT member_id,
             'E' as t,
             LEAST(term_dt, '30-SEP-2002') as dt
             FROM   z
             WHERE  eff_dt BETWEEN to_date('01-SEP-2002',

'DD-MON-YYYY') AND to_date('30-SEP-2002', 'DD-MON-YYYY')

    )
  )
  WHERE (t = 'S' and p = 1) or (t = 'E' and p = 0) )
GROUP BY member_id;

How does it work? Well, the inner query identifies any records which interact with the period of analysis (in this case 01-SEP-2002 to 30-SEO-2002) and clips the start/end dates into this range. It also splits out the start/end dates into separate records.

The query nesting it computes a cumulative value (p) for each row within an id, adding one for a start record and subtracting one for an end record. This allows us to identify contiguous periods of coverage
(start at p=1 end at p=0), thus removing overlaps.

The query nesting that then computes ofr each end date, the number of days since its corresponding start date.

Finally, the coverage days are summed up by member_id.

Hope this helps.

Andy

Not sure
Richard Kuhler <noone_at_nowhere.com> wrote in message news:<Mhjr9.167514$U7.45873199_at_twister.socal.rr.com>...

> I have the following problem to solve using Oracle 9.2:
> 
> Assume a table exists with the following columns (and about 30 million
> rows):
> 
> member_id number
> insurer_id number
> eff_dt date -- no time component just the day
> term_dt date -- no time component just the day
> 
> Q. How many days was each member covered for the month of September
> 2002?
> 
> Note that the same member could be covered by multiple insurers and the
> coverages could overlap each other.  However, each day should only be
> counted once.  For example, if member_id = 100 was covered by insurer_id
> = 200 from the 1st to the 10th and and insurer_id = 300 from the 6th to
> the 15th then that member was covered for 15 days (not 10 + 10 = 20
> days).  You can use any features you like but the final result must come
> back from a single query with each member_id and the number of days.
> 
> 
> 
> background information ...
> 
> My current solution uses a pl/sql function to produce a bitmap mask for
> each row in the table representing the days covered for the given
> month.  Then I use a user-defined aggregate function (data cartridge) to
> produce the bitwise or of the masks grouped by member_id.  Finally, a
> pl/sql function is used to count the number of bits that are set in the
> final aggregated mask.
> 
> This all works as expected but it is VERY slow.  The vast majority of
> the time is spent processing the custom aggregate function.  For
> example, a MAX on the table by member_id only takes 5 minutes but my
> custom aggregate takes 270 minutes.  This seems to be a fundamental
> problem with user-defined aggregates.  There's nothing intensive about
> the bitwise or [in ODCIAggregateIterate: mask = mask - bitand(mask,
> value) + value].  I've used native compilation on the type and function
> but that had no affect on performance.  I also tried parallel execution
> (yes I implemented ODCIAggregateMerge with 'parllel_enable') but that's
> actually slightly slower on our 8 processor machine (the timing for MAX
> was not parallel).
> 
> Any ideas?
> 
> 
> Thanks,
> Richard

Received on Thu Oct 17 2002 - 07:15:53 CDT