Home » SQL & PL/SQL » SQL & PL/SQL » Another example of the "start of group method" (9.2.0.8 Win2003 Server)
Another example of the "start of group method" [message #585379] Fri, 24 May 2013 09:10 Go to next message
Amine
Messages: 265
Registered: March 2010
Senior Member

Hi all,
This is a history table for an employee.
drop table test;
create table test
(
  id    number(2)        ,
  dat_start   date              ,
  dat_end     date              ,
  position    varchar2(20)    ,
  occupation        varchar2(20)
)
/


One row in this table represents a period of time of an employee in the company. An employee can either do something or not.
Here are some rows :
insert into test values (10, to_date('02.10.1999','dd/mm/yyyy'), to_date('31.07.2005','dd/mm/yyyy'), 'Z', 'PAINT');
insert into test values (10, to_date('01.08.2005','dd/mm/yyyy'), to_date('24.09.2008','dd/mm/yyyy'), 'A', 'NONE');
insert into test values (10, to_date('25.09.2008','dd/mm/yyyy'), to_date('02.12.2009','dd/mm/yyyy'), 'B', 'TRAINING');

insert into test values (10, to_date('03.12.2009','dd/mm/yyyy'), to_date('02.09.2010','dd/mm/yyyy'), 'A', 'NONE');
insert into test values (10, to_date('03.09.2010','dd/mm/yyyy'), to_date('02.09.2011','dd/mm/yyyy'), 'A', 'NONE');

insert into test values (10, to_date('03.09.2011','dd/mm/yyyy'), to_date('02.02.2012','dd/mm/yyyy'), 'C', 'WASHING');

insert into test values (10, to_date('03.02.2012','dd/mm/yyyy'), to_date('02.04.2013','dd/mm/yyyy'), 'A', 'NONE');
insert into test values (10, to_date('03.04.2013','dd/mm/yyyy'), null, 'A', 'NONE');


You can remark the 4th and the 5th row in the insert script that they describe the same thing (position=A and occupation=none).
More than that, 5th dat_start - 4th dat_end = 1 day.
So this is what I am looking for, rows like this.
I want to transform them into one. If we issue a select statement on the test table, we have :
SQL> select * from test;

        ID DAT_START  DAT_END    POSITION   OCCUPATION
---------- ---------- ---------- ---------- ----------
        10 02/10/1999 31/07/2005 Z          PAINT
        10 01/08/2005 24/09/2008 A          NONE
        10 25/09/2008 02/12/2009 B          TRAINING
    *   10 03/12/2009 02/09/2010 A          NONE
    *   10 03/09/2010 02/09/2011 A          NONE
        10 03/09/2011 02/02/2012 C          WASHING
    *   10 03/02/2012 02/04/2013 A          NONE
    *   10 03/04/2013            A          NONE


And I want to get this :

        ID DAT_START  DAT_END    POSITION   OCCUPATION
---------- ---------- ---------- ---------- ----------
        10 02/10/1999 31/07/2005 Z          PAINT
        10 01/08/2005 24/09/2008 A          NONE
        10 25/09/2008 02/12/2009 B          TRAINING
    *   10 03/12/2009 02/09/2011 A          NONE
        10 03/09/2011 02/02/2012 C          WASHING
    *   10 03/02/2012            A          NONE


It's been about a year, I've learned a new technique from Barbara Boehmer and Solomon Yakobson called start of group method.


And I used the same philosophy to get it, and I succeeded but I don't know if it is the best way.
Here is the query I used :

column position format A10
column occupation format A10
column grp format 9
column rn1 format 9
column rn format 9
with v as
(
select 
test.*
, lag(dat_end, 1) over (partition by id order by dat_start) dat_end_prev
, lag(position, 1) over (partition by id order by dat_start) position_prev
, lag(occupation, 1) over (partition by id order by dat_start) occupation_prev
-- ----
, lead(dat_start, 1) over (partition by id order by dat_start) dat_start_next
, lead(position, 1) over (partition by id order by dat_start) position_next
, lead(occupation, 1) over (partition by id order by dat_start) occupation_next
from test
order by dat_start
)
, v_grouped as
(
select id
, dat_start
, dat_end
, position
, occupation
, case 
    when  dat_start - dat_end_prev = 1
          and position = position_prev
          and occupation = occupation_prev
          or
          dat_start_next - dat_end = 1
          and position = position_next
          and occupation = occupation_next
          then 1
    else 0
  end grp
from v
)
, v_ranked as
(
select v_grouped.*
, decode(grp, 0, row_number() over (partition by id order by dat_start)
, row_number() over (partition by id order by dat_start)
- row_number() over (partition by id, grp order by dat_start) + grp) rn
from v_grouped
order by dat_start
)
select id
, min(dat_start) dat_start
, nullif(max(nvl(dat_end, sysdate + 1000)), sysdate + 1000) dat_end
, position
, occupation
from v_ranked
group by id, rn
, position
, occupation
order by dat_start
/


Can anyone could help to get something more optimal ?

Thanks in advance,

Amine
Re: Another example of the "start of group method" [message #585781 is a reply to message #585379] Wed, 29 May 2013 18:10 Go to previous messageGo to next message
Barbara Boehmer
Messages: 7995
Registered: November 2002
Location: California, USA
Senior Member
I would offer the same method as I offered in your previous thread. You might test and compare and let us know.

SCOTT@orcl_11gR2> select * from test
  2  /

        ID DAT_START   DAT_END     POSITION             OCCUPATION
---------- ----------- ----------- -------------------- --------------------
        10 02/Oct/1999 31/Jul/2005 Z                    PAINT
        10 01/Aug/2005 24/Sep/2008 A                    NONE
        10 25/Sep/2008 02/Dec/2009 B                    TRAINING
        10 03/Dec/2009 02/Sep/2010 A                    NONE
        10 03/Sep/2010 02/Sep/2011 A                    NONE
        10 03/Sep/2011 02/Feb/2012 C                    WASHING
        10 03/Feb/2012 02/Apr/2013 A                    NONE
        10 03/Apr/2013             A                    NONE

8 rows selected.

SCOTT@orcl_11gR2> select id, min (dat_start) dat_start, max (dat_end) dat_end, position, occupation
  2  from   (select test.*,
  3  		    row_number () over
  4  		      (order by id, dat_start, rowid) -
  5  		    row_number () over
  6  		      (partition by id, position, occupation
  7  		       order by dat_start, rowid) rn
  8  	     from   test)
  9  group  by id, position, occupation, rn
 10  order  by dat_start
 11  /

        ID DAT_START   DAT_END     POSITION             OCCUPATION
---------- ----------- ----------- -------------------- --------------------
        10 02/Oct/1999 31/Jul/2005 Z                    PAINT
        10 01/Aug/2005 24/Sep/2008 A                    NONE
        10 25/Sep/2008 02/Dec/2009 B                    TRAINING
        10 03/Dec/2009 02/Sep/2011 A                    NONE
        10 03/Sep/2011 02/Feb/2012 C                    WASHING
        10 03/Feb/2012 02/Apr/2013 A                    NONE

6 rows selected.

Re: Another example of the "start of group method" [message #585948 is a reply to message #585781] Fri, 31 May 2013 05:59 Go to previous messageGo to next message
Amine
Messages: 265
Registered: March 2010
Senior Member

Thank you for the feedback.

There's a little error in your final result. It should be :
        ID DAT_START   DAT_END     POSITION             OCCUPATION
---------- ----------- ----------- -------------------- --------------------
        10 02/Oct/1999 31/Jul/2005 Z                    PAINT
        10 01/Aug/2005 24/Sep/2008 A                    NONE
        10 25/Sep/2008 02/Dec/2009 B                    TRAINING
        10 03/Dec/2009 02/Sep/2011 A                    NONE
        10 03/Sep/2011 02/Feb/2012 C                    WASHING
        10 03/Feb/2012             A                    NONE


And I used to do this to get it :

column position format A10
column occupation format A10
column grp format 9
column rn1 format 9
column rn format 9
select 
id, min (dat_start) dat_start, nullif(max (dat_end), sysdate + 1000) dat_end, position, occupation
from
(
select 
id, position, occupation, dat_start, nvl(dat_end, sysdate + 1000) dat_end,
row_number () over (order by id, dat_start, rowid) -
row_number () over (partition by id, position, occupation order by dat_start, rowid) rn from test
)
group  by id, position, occupation, rn
order  by dat_start
/


When dat_end is null it means until now.
2nd remark :
From your query, could we have something like :
SQL> column position format A10
SQL> column occupation format A10
SQL> column grp format 9
SQL> column rn1 format 9
SQL> column rn format 9
SQL> select
  2  *
  3  from
  4  (
  5  select
  6  id, position, occupation, dat_start, nvl(dat_end, sysdate + 1000) dat_end,
  7  row_number () over (order by id, dat_start, rowid) -
  8  row_number () over (partition by id, position, occupation order by dat_start, rowid) rn from test
  9  )
 10  order  by dat_start
 11  /

        ID POSITION   OCCUPATION DAT_STAR DAT_END  RN
---------- ---------- ---------- -------- -------- --
        10 Z          PAINT      02/10/99 31/07/05  0
        10 A          NONE       01/08/05 24/09/08  1
        10 B          TRAINING   25/09/08 02/12/09  2
        10 A          NONE       03/12/09 02/09/10  3
        10 A          NONE       03/09/10 02/09/11  3
        10 C          WASHING    03/09/11 02/02/12  4
        10 A          NONE       03/02/12 02/04/13  5
        10 A          NONE       03/04/13 25/02/16  5

8 ligne(s) sÚlectionnÚe(s).

SQL>

3rd remark :
Suppose we have this initial data :
insert into test values (10, to_date('02.10.1999','dd/mm/yyyy'), to_date('31.07.2005','dd/mm/yyyy'), 'Z', 'PAINT');
insert into test values (10, to_date('01.08.2005','dd/mm/yyyy'), to_date('24.09.2008','dd/mm/yyyy'), 'A', 'NONE');
insert into test values (10, to_date('25.09.2008','dd/mm/yyyy'), to_date('02.12.2009','dd/mm/yyyy'), 'B', 'TRAINING');

insert into test values (10, to_date('03.12.2009','dd/mm/yyyy'), to_date('02.09.2010','dd/mm/yyyy'), 'A', 'NONE');
insert into test values (10, to_date('05.09.2010','dd/mm/yyyy'), to_date('02.09.2011','dd/mm/yyyy'), 'A', 'NONE');

insert into test values (10, to_date('03.09.2011','dd/mm/yyyy'), to_date('02.02.2012','dd/mm/yyyy'), 'C', 'WASHING');

insert into test values (10, to_date('03.02.2012','dd/mm/yyyy'), to_date('02.04.2013','dd/mm/yyyy'), 'A', 'NONE');
insert into test values (10, to_date('03.04.2013','dd/mm/yyyy'), null, 'A', 'NONE');


You can remark that the 4th and 5th rows dates(4th dat_end and 5th dat_start) don't have a difference of one day, and in this case the out put should be :

        ID DAT_STAR DAT_END  POSITION   OCCUPATION
---------- -------- -------- ---------- ----------
        10 02/10/99 31/07/05 Z          PAINT
        10 01/08/05 24/09/08 A          NONE
        10 25/09/08 02/12/09 B          TRAINING
        10 03/12/09 02/09/10 A          NONE
        10 05/09/10 02/09/11 A          NONE
        10 03/09/11 02/02/12 C          WASHING
        10 03/02/12          A          NONE


Thank you Barbara !

AMine
Re: Another example of the "start of group method" [message #585957 is a reply to message #585379] Fri, 31 May 2013 07:49 Go to previous messageGo to next message
DrabJay
Messages: 32
Registered: May 2013
Member
I have also looked at the method you mentioned, but have found it does not give me the answers I expect when the are overlapping or missing intervals. Therefore, I use a hierarchical query method which, although not as performant, does give me the answers I expect e.g.

SELECT id
     , dat_start
     , max(dat_end) KEEP (DENSE_RANK LAST ORDER BY dat_end NULLS LAST) dat_end
     , position
     , occupation
FROM (SELECT id
           , position
           , occupation
           , CONNECT_BY_ROOT dat_start dat_start
           , dat_end
      FROM (SELECT id
                 , position
                 , occupation
                 , dat_start
                 , dat_end
                 , min(dat_start) OVER (PARTITION BY id, position, occupation) min_dat_start
                 , lag(dat_end) OVER (PARTITION BY id, position, occupation ORDER BY dat_start) lag_dat_end
            FROM test)
      START WITH (  dat_start = min_dat_start
                 OR dat_start > lag_dat_end + 1)
      CONNECT BY id = PRIOR id
      AND position = PRIOR position
      AND occupation = PRIOR occupation
      AND dat_start > PRIOR dat_start
      AND (  dat_start <= PRIOR dat_end + 1
          OR PRIOR dat_end IS NULL))
GROUP BY id
       , position
       , occupation
       , dat_start
ORDER BY id
       , dat_start
Re: Another example of the "start of group method" [message #585990 is a reply to message #585957] Fri, 31 May 2013 13:46 Go to previous messageGo to next message
Amine
Messages: 265
Registered: March 2010
Senior Member

I am in 9.2.0.8 connect_by_root does not work.
Re: Another example of the "start of group method" [message #585994 is a reply to message #585990] Fri, 31 May 2013 13:53 Go to previous messageGo to next message
Michel Cadot
Messages: 59427
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
I didn't read the topic and solution but I showed you how to workaround this 9i connect_by_root issue there. Wink

Regards
Michel
Re: Another example of the "start of group method" [message #585996 is a reply to message #585994] Fri, 31 May 2013 16:12 Go to previous messageGo to next message
Barbara Boehmer
Messages: 7995
Registered: November 2002
Location: California, USA
Senior Member
SCOTT@orcl_11gR2> select * from test order by dat_start
  2  /

        ID DAT_START   DAT_END     POSITION             OCCUPATION
---------- ----------- ----------- -------------------- --------------------
        10 02/Oct/1999 31/Jul/2005 Z                    PAINT
        10 01/Aug/2005 24/Sep/2008 A                    NONE
        10 25/Sep/2008 02/Dec/2009 B                    TRAINING
        10 03/Dec/2009 02/Sep/2010 A                    NONE
        10 05/Sep/2010 02/Sep/2011 A                    NONE
        10 03/Sep/2011 02/Feb/2012 C                    WASHING
        10 03/Feb/2012 02/Apr/2013 A                    NONE
        10 03/Apr/2013             A                    NONE

8 rows selected.

SCOTT@orcl_11gR2> select t1.id, t1.dat_start,
  2  	    nullif
  3  	      ((select max (nvl (t2.dat_end, sysdate + 1000))
  4  		from   test t2
  5  		start with t2.id = t1.id
  6  		      and  t2.dat_start = t1.dat_start
  7  		      and  t2.position = t1.position
  8  		      and  t2.occupation = t1.occupation
  9  		connect by prior nvl (t2.dat_end, sysdate + 1000) + 1 = t2.dat_start
 10  		       and prior t2.id = t2.id
 11  		       and prior t2.position = t2.position
 12  		       and prior t2.occupation = t2.occupation),
 13  	       sysdate + 1000) dat_end,
 14  	    t1.position, t1.occupation
 15  from   test t1
 16  where  (t1.id, t1.dat_start, t1.position, t1.occupation) not in
 17  	    (select t3.id, nvl (t3.dat_end, sysdate + 1000) + 1, t3.position, t3.occupation
 18  	     from   test t3)
 19  order   by dat_start
 20  /

        ID DAT_START   DAT_END     POSITION             OCCUPATION
---------- ----------- ----------- -------------------- --------------------
        10 02/Oct/1999 31/Jul/2005 Z                    PAINT
        10 01/Aug/2005 24/Sep/2008 A                    NONE
        10 25/Sep/2008 02/Dec/2009 B                    TRAINING
        10 03/Dec/2009 02/Sep/2010 A                    NONE
        10 05/Sep/2010 02/Sep/2011 A                    NONE
        10 03/Sep/2011 02/Feb/2012 C                    WASHING
        10 03/Feb/2012             A                    NONE

7 rows selected.

Re: Another example of the "start of group method" [message #586016 is a reply to message #585996] Sat, 01 June 2013 07:53 Go to previous messageGo to next message
Solomon Yakobson
Messages: 2049
Registered: January 2010
Senior Member
Barbara,

Your original solution is much better. All it needs is out of all end dates within group we want not the MAX but one that specified with MAX start date. This can be done using KEEP:

with t as (
           select  test.*,
		   row_number() over(order by id, dat_start, rowid) -
  		   row_number() over(partition by id, position, occupation
  		                     order by dat_start, rowid
                                    ) rn
  	     from  test
          )
select  id,
        min(dat_start) dat_start,
        max(dat_end) keep(dense_rank last order by dat_start) dat_end,
        position,
        occupation
  from  t
  group  by id,
            position,
            occupation,
            rn
  order  by dat_start
/

 ID DAT_START DAT_END   POSITION OCCUPATION
--- --------- --------- -------- ----------
 10 02-OCT-99 31-JUL-05 Z        PAINT
 10 01-AUG-05 24-SEP-08 A        NONE
 10 25-SEP-08 02-DEC-09 B        TRAINING
 10 03-DEC-09 02-SEP-11 A        NONE
 10 03-SEP-11 02-FEB-12 C        WASHING
 10 03-FEB-12           A        NONE

6 rows selected.

SQL>


SY.
Re: Another example of the "start of group method" [message #586017 is a reply to message #585379] Sat, 01 June 2013 11:58 Go to previous messageGo to next message
DrabJay
Messages: 32
Registered: May 2013
Member
Unfortunately, this method does not work when there is a missing interval.

select id,
       dat_start,
       dat_end,
       position,
       occupation
from test
order by id,
         dat_start

        ID DAT_START   DAT_END     POSITION OCCUPATION
---------- ----------- ----------- -------- ----------
        10 02-OCT-1999 31-JUL-2005 Z        PAINT
        10 01-AUG-2005 24-SEP-2008 A        NONE
        10 25-SEP-2008 02-DEC-2009 B        TRAINING
        10 03-DEC-2009 02-SEP-2010 A        NONE
        10 05-SEP-2010 02-SEP-2011 A        NONE
        10 03-SEP-2011 02-FEB-2012 C        WASHING
        10 03-FEB-2012 02-APR-2013 A        NONE
        10 03-APR-2013             A        NONE


with t as (
           select test.*,
                  row_number() over(order by id, dat_start, rowid) -
                    row_number() over(partition by id, position, occupation order by dat_start, rowid) rn
           from test
          )
select  id,
        min(dat_start) dat_start,
        max(dat_end) keep(dense_rank last order by dat_start) dat_end,
        position,
        occupation
from t
group by id,
         position,
         occupation,
         rn
order by dat_start

        ID DAT_START   DAT_END     POSITION OCCUPATION
---------- ----------- ----------- -------- ----------
        10 02-OCT-1999 31-JUL-2005 Z        PAINT
        10 01-AUG-2005 24-SEP-2008 A        NONE
        10 25-SEP-2008 02-DEC-2009 B        TRAINING
        10 03-DEC-2009 02-SEP-2011 A        NONE
        10 03-SEP-2011 02-FEB-2012 C        WASHING
        10 03-FEB-2012             A        NONE


In this case a hierarchical method is required in order to give the expected answer.
Re: Another example of the "start of group method" [message #586018 is a reply to message #586016] Sat, 01 June 2013 12:51 Go to previous messageGo to next message
Barbara Boehmer
Messages: 7995
Registered: November 2002
Location: California, USA
Senior Member
Solomon,

You might not have noticed that Amine supplied a new data set and new requirement that to be combined the dat_start must be equal to the dat_end + 1 of the previous row in addition to the other requirements. That was why I resorted to the hierarchical query.

Barbara
Re: Another example of the "start of group method" [message #586019 is a reply to message #586018] Sat, 01 June 2013 14:24 Go to previous messageGo to next message
Solomon Yakobson
Messages: 2049
Registered: January 2010
Senior Member
Yes, I missed that. Still your original solution stands. All we need is to add date continuity into the mix:


 ID DAT_START DAT_END   POSITION OCCUPATION
--- --------- --------- -------- ----------
 10 02-OCT-99 31-JUL-05 Z        PAINT
 10 01-AUG-05 24-SEP-08 A        NONE
 10 25-SEP-08 02-DEC-09 B        TRAINING
 10 03-DEC-09 02-SEP-10 A        NONE
 10 05-SEP-10 02-SEP-11 A        NONE
 10 03-SEP-11 02-FEB-12 C        WASHING
 10 03-FEB-12 02-APR-13 A        NONE
 10 03-APR-13           A        NONE

8 rows selected.

with t as (
           select  test.*,
		   row_number() over(order by id, dat_start, rowid) -
  		   row_number() over(partition by id, position, occupation
  		                     order by dat_start, rowid
                                    ) rn,
                    dat_start - lag(dat_end,1,dat_start - 1) over(order by id, dat_start, rowid) continuity
  	     from  test
          )
select  id,
        min(dat_start) dat_start,
        max(dat_end) keep(dense_rank last order by dat_start) dat_end,
        position,
        occupation
  from  t
  group  by id,
            position,
            occupation,
            rn,
            continuity
  order  by dat_start
/

 ID DAT_START DAT_END   POSITION OCCUPATION
--- --------- --------- -------- ----------
 10 02-OCT-99 31-JUL-05 Z        PAINT
 10 01-AUG-05 24-SEP-08 A        NONE
 10 25-SEP-08 02-DEC-09 B        TRAINING
 10 03-DEC-09 02-SEP-10 A        NONE
 10 05-SEP-10 02-SEP-11 A        NONE
 10 03-SEP-11 02-FEB-12 C        WASHING
 10 03-FEB-12           A        NONE

7 rows selected.

SQL>


SY.
Re: Another example of the "start of group method" [message #586020 is a reply to message #586019] Sat, 01 June 2013 16:10 Go to previous messageGo to next message
Solomon Yakobson
Messages: 2049
Registered: January 2010
Senior Member
Oops,

It will not work if gap is followed by multiple consecutive rows with same position as before gap:

insert into test values (10, to_date('02.10.1999','dd/mm/yyyy'), to_date('31.07.2005','dd/mm/yyyy'), 'Z', 'PAINT');
insert into test values (10, to_date('01.08.2005','dd/mm/yyyy'), to_date('24.09.2008','dd/mm/yyyy'), 'A', 'NONE');
insert into test values (10, to_date('25.09.2008','dd/mm/yyyy'), to_date('02.12.2009','dd/mm/yyyy'), 'B', 'TRAINING');

insert into test values (10, to_date('03.12.2009','dd/mm/yyyy'), to_date('02.09.2010','dd/mm/yyyy'), 'A', 'NONE');
insert into test values (10, to_date('05.09.2010','dd/mm/yyyy'), to_date('02.09.2011','dd/mm/yyyy'), 'A', 'NONE');
insert into test values (10, to_date('03.09.2011','dd/mm/yyyy'), to_date('03.11.2011','dd/mm/yyyy'), 'A', 'NONE');

insert into test values (10, to_date('03.12.2011','dd/mm/yyyy'), to_date('02.02.2012','dd/mm/yyyy'), 'C', 'WASHING');

insert into test values (10, to_date('03.02.2012','dd/mm/yyyy'), to_date('02.04.2013','dd/mm/yyyy'), 'A', 'NONE');
insert into test values (10, to_date('03.04.2013','dd/mm/yyyy'), null, 'A', 'NONE');


Then my solution incorrectly returns:

with t as (
           select  test.*,
		   row_number() over(order by id, dat_start, rowid) -
  		   row_number() over(partition by id, position, occupation
  		                     order by dat_start, rowid
                                    ) rn,
                    dat_start - lag(dat_end,1,dat_start - 1) over(order by id, dat_start, rowid) continuity
  	     from  test
          )
select  id,
        min(dat_start) dat_start,
        max(dat_end) keep(dense_rank last order by dat_start) dat_end,
        position,
        occupation
  from  t
  group  by id,
            position,
            occupation,
            rn,
            continuity
  order  by dat_start
/

 ID DAT_START DAT_END   POSITION OCCUPATION
--- --------- --------- -------- ----------
 10 02-OCT-99 31-JUL-05 Z        PAINT
 10 01-AUG-05 24-SEP-08 A        NONE
 10 25-SEP-08 02-DEC-09 B        TRAINING
 10 03-DEC-09 03-NOV-11 A        NONE
 10 05-SEP-10 02-SEP-11 A        NONE
 10 03-DEC-11 02-FEB-12 C        WASHING
 10 03-FEB-12           A        NONE

7 rows selected.

SQL>


Anyway, back to traditional start-of-group method solution:

with t1 as (
            select  test.*,
                    case
                      when lag(position,1,position) over(order by id,dat_start,rowid) != position then 1
		      when lag(dat_end,1,dat_start - 1) over(order by id,dat_start,rowid) != dat_start - 1 then 1
                      else 0
                    end start_of_group
  	      from  test
           ),
     t2 as (
            select  t1.*,
		    sum(start_of_group) over(order by id,dat_start,rowid) grp
  	     from  t1
          )
select  id,
        min(dat_start) dat_start,
        max(dat_end) keep(dense_rank last order by dat_start) dat_end,
        position,
        occupation
  from  t2
  group  by id,
            position,
            occupation,
            grp
  order  by dat_start
/

 ID DAT_START DAT_END   POSITION OCCUPATION
--- --------- --------- -------- ----------
 10 02-OCT-99 31-JUL-05 Z        PAINT
 10 01-AUG-05 24-SEP-08 A        NONE
 10 25-SEP-08 02-DEC-09 B        TRAINING
 10 03-DEC-09 02-SEP-10 A        NONE
 10 05-SEP-10 03-NOV-11 A        NONE
 10 03-DEC-11 02-FEB-12 C        WASHING
 10 03-FEB-12           A        NONE

7 rows selected.

SQL>


And below are results for OP's data:

SQL> select  *
  2    from  test
  3  /

 ID DAT_START DAT_END   POSITION OCCUPATION
--- --------- --------- -------- ----------
 10 02-OCT-99 31-JUL-05 Z        PAINT
 10 01-AUG-05 24-SEP-08 A        NONE
 10 25-SEP-08 02-DEC-09 B        TRAINING
 10 03-DEC-09 02-SEP-10 A        NONE
 10 05-SEP-10 02-SEP-11 A        NONE
 10 03-SEP-11 02-FEB-12 C        WASHING
 10 03-FEB-12 02-APR-13 A        NONE
 10 03-APR-13           A        NONE

8 rows selected.

SQL> with t1 as (
  2              select  test.*,
  3                      case
  4                        when lag(position,1,position) over(order by id,dat_start,rowid) != position then 1
  5                   when lag(dat_end,1,dat_start - 1) over(order by id,dat_start,rowid) != dat_start - 1 then 1
  6                        else 0
  7                      end start_of_group
  8           from  test
  9             ),
 10       t2 as (
 11              select  t1.*,
 12                 sum(start_of_group) over(order by id,dat_start,rowid) grp
 13          from  t1
 14            )
 15  select  id,
 16          min(dat_start) dat_start,
 17          max(dat_end) keep(dense_rank last order by dat_start) dat_end,
 18          position,
 19          occupation
 20    from  t2
 21    group  by id,
 22              position,
 23              occupation,
 24              grp
 25    order  by dat_start
 26  /

 ID DAT_START DAT_END   POSITION OCCUPATION
--- --------- --------- -------- ----------
 10 02-OCT-99 31-JUL-05 Z        PAINT
 10 01-AUG-05 24-SEP-08 A        NONE
 10 25-SEP-08 02-DEC-09 B        TRAINING
 10 03-DEC-09 02-SEP-10 A        NONE
 10 05-SEP-10 02-SEP-11 A        NONE
 10 03-SEP-11 02-FEB-12 C        WASHING
 10 03-FEB-12           A        NONE

7 rows selected.

SQL>


SY.
Re: Another example of the "start of group method" [message #586031 is a reply to message #586020] Sun, 02 June 2013 14:02 Go to previous message
Amine
Messages: 265
Registered: March 2010
Senior Member

Another trick from Barbara and Solomon, thanks again
Previous Topic: cross database join on fields with mismatched data types
Next Topic: Getting Distinct values string using LISTAGG
Goto Forum:
  


Current Time: Fri Oct 24 18:31:55 CDT 2014

Total time taken to generate the page: 0.22173 seconds