Home » SQL & PL/SQL » SQL & PL/SQL » Sql - Related data difficult question
icon5.gif  Sql - Related data difficult question [message #206598] Thu, 30 November 2006 13:14 Go to next message
Pastonp
Messages: 13
Registered: November 2006
Junior Member
Hi ,

I have had several attempts at solving this particular problem but to no avail. It would be easier as pl/sql but I really would like to try to find a sql answer if possible.

The Input data :
ID ATTRIB  S_DATE        E_DATE
1  A       01-JAN-2006   31-MAR-2006
1  A       01-APR-2006   31-MAY-2006
1  B       01-JUN-2006   31-AUG-2006
1  A       01-SEP-2006   31-OCT-2006
2  A       01-JAN-2006   31-MAY-2006
2  A       01-JUN-2006   31-JUL-2006
2  A       01-AUG-2006   31-OCT-2006

The output I require :
ID ATTRIB  S_DATE        E_DATE
1  A       01-JAN-2006   31-MAY-2006
1  B       01-JUN-2006   31-AUG-2006
1  A       01-SEP-2006   31-OCT-2006
2  A       01-JAN-2006   31-OCT-2006

So for the record with ID of 2 a simple group by with min and max would do the trick but for the record with ID of 1 it is much more difficult. I have tried a number of combinations using functions like rank..partition by etc but could not get anything to work. I am starting to think that this is just not possible unless I use a cursor and pl/sql.

If anyone has any ideas it would be greatly appreciated - I really have spent hours and hours trying to solve this Sad

Phil








Re: Sql - Related data difficult question [message #206682 is a reply to message #206598] Fri, 01 December 2006 01:31 Go to previous messageGo to next message
skooman
Messages: 912
Registered: March 2005
Location: Netherlands
Senior Member
[Edit: oops, overlooked something]

What is the logic for the selection on records with ID 1 and attrib A?

[Updated on: Fri, 01 December 2006 01:33]

Report message to a moderator

Re: Sql - Related data difficult question [message #206708 is a reply to message #206682] Fri, 01 December 2006 03:31 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
The OP is looking to get the start and end dates of sets of rows where the end date of row n is 1 day before the start date of row n+1.
Or at least, I hope that's what they're looking for, because that's what I've just derived.
It's time for this weeks episode of 'Fun with Analytics', featuring those cheeky chaps Lag and Lead.

First, we create some data.
create table temp_olap (id number, attrib  varchar2(1), from_date date, to_date date);

insert into temp_olap values(1,  'A',       '01-JAN-2006',   '31-MAR-2006');
insert into temp_olap values(1,  'A',       '01-APR-2006',   '31-MAY-2006');
insert into temp_olap values(1,  'B',       '01-JUN-2006',   '31-AUG-2006');
insert into temp_olap values(1,  'A',       '01-SEP-2006',   '31-OCT-2006');
insert into temp_olap values(2,  'A',       '01-JAN-2006',   '31-MAY-2006');
insert into temp_olap values(2,  'A',       '01-JUN-2006',   '31-JUL-2006');
insert into temp_olap values(2,  'A',       '01-AUG-2006',   '31-OCT-2006');

Then for each row in this data, we need to get the To_Date from the row before (ordered by from_Date) IF that To_date is one day before the From_date on the current row.
Similarly, we need to get the From_Date from the next row (ordered by from_date) IF that from_date is one day greater than the To_Date on the current row:
select id
      ,attrib
      ,from_date
      ,to_date
      ,case when lag(to_date) over (partition by id,attrib order by from_Date) != from_date -1 then null
            else lag(to_date) over (partition by id,attrib order by from_Date) end prev_to_Date
      ,case when lead(from_date)  over (partition by id,attrib order by to_date) != to_date +1 then null
            else lead(from_date)  over (partition by id,attrib order by to_date) end next_from_Date
from   temp_olap;

  ID A FROM_DATE TO_DATE   PREV_TO_D NEXT_FROM
---- - --------- --------- --------- ---------
   1 A 01-JAN-06 31-MAR-06           01-APR-06
   1 A 01-APR-06 31-MAY-06 31-MAR-06
   1 A 01-SEP-06 31-OCT-06
   1 B 01-JUN-06 31-AUG-06
   2 A 01-JAN-06 31-MAY-06           01-JUN-06
   2 A 01-JUN-06 31-JUL-06 31-MAY-06 01-AUG-06
   2 A 01-AUG-06 31-OCT-06 31-JUL-06

Everyone with me so far?

Now we have a dataset where the first row of a set of contiguous rows is marked by having a NULL value in Prev_To_Date, and the last row in a set is marked by a Null Next_From_Date.
So, first we need to ignore rows with no nulls in these columns.
Then, if the Prev_To_Date is null (marks start of set) then we need to get the actual FROM_DATE from that record, otherwise we need to get the FROM_DATE from the previous record (becuase every set of contiguous rows has been reduced to either 1 or 2 rows of data).
Similarly, if the Next_From_Date is null, we need to get the TO_DATE from that column, otherwise we return the TO_DATE from the next record.

After that, each set has been reduced to 1 record or 2 identical records, so we just do a distinct.
select distinct 
       id
      ,attrib
      ,case when prev_to_date is not null then lag(from_date) over (partition by id,attrib order by from_date)
            else from_date  end  from_date
      ,case when next_from_date is not null then lead(to_date)  over (partition by id,attrib order by to_date)
            else to_Date  end to_date
from (            
select id
      ,attrib
      ,from_date
      ,to_date
      ,case when lag(to_date) over (partition by id,attrib order by from_Date) != from_date -1 then null
            else lag(to_date) over (partition by id,attrib order by from_Date) end prev_to_Date
      ,case when lead(from_date)  over (partition by id,attrib order by to_date) != to_date +1 then null
            else lead(from_date)  over (partition by id,attrib order by to_date) end next_from_Date
from   temp_olap)
where prev_to_date is null
or    next_from_Date is null;

     ID A FROM_DATE TO_DATE
------- - --------- ---------
      1 A 01-JAN-06 31-MAY-06
      1 A 01-SEP-06 31-OCT-06
      1 B 01-JUN-06 31-AUG-06
      2 A 01-JAN-06 31-OCT-06

Easy really Cool
Re: Sql - Related data difficult question [message #206709 is a reply to message #206708] Fri, 01 December 2006 03:33 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
To be fair - this counts as an Expert question. Kudos for the modesty in putting it in Newbies.
Re: Sql - Related data difficult question [message #206826 is a reply to message #206598] Fri, 01 December 2006 12:24 Go to previous message
Pastonp
Messages: 13
Registered: November 2006
Junior Member
Well...you have made an old man happy Smile....I am a bit of a sad Geek !

Thank you for taking the time to look at this and for your excellent explanation.

It was one of those cases where you just know it must be possible but just can`t quite find the right combination of joins and functions...lag and lead...who`d have thought of that.

I also spend a lot of time getting everything in the file in the right order and deriving the end dates from the next records start date etc... I was joining the file back to itself on a sequential key eg
select......
from mydata t1 , mydata t2
where tl.key = t2.key +1
and blah...blah 


I then have to do loads of stuff to get the first or last record from each group.

your solution stops all this silliness !!

All I need to do now is to apply the logic it to my very large data table !

Once again thanks for your help
Phil
Previous Topic: Outer Joins
Next Topic: Help loading CLOB data using SQLLDR or Insert statement
Goto Forum:
  


Current Time: Fri Dec 02 12:31:18 CST 2016

Total time taken to generate the page: 0.08805 seconds