Home » SQL & PL/SQL » SQL & PL/SQL » How to combine rows that meet criteria in SQL (Oracle)
How to combine rows that meet criteria in SQL [message #321997] Thu, 22 May 2008 02:07 Go to next message
aegcp
Messages: 8
Registered: May 2008
Junior Member
Hi guys, I need to consolidate (merge rows) that have:

- the same usr_id and dep_id,
- the start_date of the row is equal to the end_date of the other row.

I have created some test data as well as my basic attempt to achieve the result, but I'm not getting all the expected results

create table test2
(usr_id varchar2(40),
start_time date,
end_time date,
dep_id number);


insert into test2 (usr_id, start_time, end_time, dep_id)
 values (123, to_date('2003-07-15', 'yyyy-mm-dd'), to_date('2003-07-16', 'yyyy-mm-dd'), 1);
insert into test2 (usr_id, start_time, end_time, dep_id)
 values (123, to_date('2003-07-16', 'yyyy-mm-dd'), to_date('2003-07-17', 'yyyy-mm-dd'), 1);
insert into test2 (usr_id, start_time, end_time, dep_id)
 values (548, to_date('2003-07-15', 'yyyy-mm-dd'), to_date('2003-07-16', 'yyyy-mm-dd'), 2);
insert into test2 (usr_id, start_time, end_time, dep_id)
 values (549, to_date('2003-07-18', 'yyyy-mm-dd'), to_date('2003-07-19', 'yyyy-mm-dd'), 3);
insert into test2 (usr_id, start_time, end_time, dep_id)
 values (549, to_date('2003-07-17', 'yyyy-mm-dd'), to_date('2003-07-18', 'yyyy-mm-dd'), 3);
insert into test2 (usr_id, start_time, end_time, dep_id)
 values (550, to_date('2003-07-19', 'yyyy-mm-dd'), to_date('2003-07-20', 'yyyy-mm-dd'), 3);
insert into test2 (usr_id, start_time, end_time, dep_id)
 values (550, to_date('2003-07-22', 'yyyy-mm-dd'), to_date('2003-07-24', 'yyyy-mm-dd'), 3);
insert into test2 (usr_id, start_time, end_time, dep_id)
 values (600, to_date('2003-07-26', 'yyyy-mm-dd'), to_date('2003-07-27', 'yyyy-mm-dd'), 3);
insert into test2 (usr_id, start_time, end_time, dep_id)
 values (600, to_date('2003-07-25', 'yyyy-mm-dd'), to_date('2003-07-26', 'yyyy-mm-dd'), 3);
insert into test2 (usr_id, start_time, end_time, dep_id)
 values (600, to_date('2003-07-27', 'yyyy-mm-dd'), to_date('2003-07-28', 'yyyy-mm-dd'), 3);

SQL> select * from test2;

USR_ID                                   START_TIM END_TIME      DEP_ID
---------------------------------------- --------- --------- ----------
123                                      15-JUL-03 16-JUL-03          1
123                                      16-JUL-03 17-JUL-03          1
548                                      15-JUL-03 16-JUL-03          2
549                                      18-JUL-03 19-JUL-03          3
549                                      17-JUL-03 18-JUL-03          3
550                                      19-JUL-03 20-JUL-03          3
550                                      22-JUL-03 24-JUL-03          3

7 rows selected.



I'm expecting the following results:
USR_ID                                   START_TIM END_TIME      DEP_ID
---------------------------------------- --------- --------- ----------
123                                      15-JUL-03 17-JUL-03          1
548                                      15-JUL-03 16-JUL-03          2
549                                      17-JUL-03 19-JUL-03          3
550                                      19-JUL-03 20-JUL-03          3
550                                      22-JUL-03 24-JUL-03          3

If I run my sql attempt,
SQL> SELECT  usr_id, MIN(START_TIME) START_TIME,
  2  MAX(END_TIME) END_TIME,
  3  DEP_ID
  4  FROM TEST2
  5  GROUP BY USR_ID, DEP_ID
  6  order by usr_id;

USR_ID                                   START_TIM END_TIME      DEP_ID
---------------------------------------- --------- --------- ----------
123                                      15-JUL-03 17-JUL-03          1
548                                      15-JUL-03 16-JUL-03          2
549                                      17-JUL-03 19-JUL-03          3
550                                      19-JUL-03 24-JUL-03          3
600                                      25-JUL-03 28-JUL-03          3


It's Incorrect, because rows with usr_id = 550 were merged (the end_time for 1 row (19 July 03) was not equal to start_time of the 2 row (22 July 03).

Anyone can provide a suggestion, idea?

thanks!

[Updated on: Thu, 22 May 2008 06:58] by Moderator

Report message to a moderator

Re: How to combine rows that meet criteria in SQL [message #321998 is a reply to message #321997] Thu, 22 May 2008 02:19 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Have look at LEAD/LAG functions.

Here's the same kind of question:
http://www.dba-village.com/village/dvp_forum.OpenThread?ThreadIdA=8785#33246

Regards
Michel
Re: How to combine rows that meet criteria in SQL [message #322078 is a reply to message #321998] Thu, 22 May 2008 06:55 Go to previous messageGo to next message
aegcp
Messages: 8
Registered: May 2008
Junior Member
Thanks Michel, I've had a look at your explanation.

I'm confused. To start off, shouldn't I be ordering by usr_id and dep_id?

SQL> SELECT usr_id, start_time, end_time, dep_id
  2      from test2
  3      order by usr_id, dep_id;

USR_ID                                   START_TIM END_TIME      DEP_ID
---------------------------------------- --------- --------- ----------
123                                      15-JUL-03 16-JUL-03          1
123                                      16-JUL-03 17-JUL-03          1
548                                      15-JUL-03 16-JUL-03          2
549                                      18-JUL-03 19-JUL-03          3
549                                      17-JUL-03 18-JUL-03          3
550                                      19-JUL-03 20-JUL-03          3
550                                      22-JUL-03 24-JUL-03          3

7 rows selected.


I'm ordering by usr_id and dep_id because these columns are the ones that restrict how I "merge" the rows. If the values for this row are the same as the others I'm evaluating, then it is a candidate to consolidate the dates. Is this correct, or am I lost?
Re: How to combine rows that meet criteria in SQL [message #322080 is a reply to message #322078] Thu, 22 May 2008 07:00 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
You partition your set on (usr_id,dep_id) and order it by start_tim, if the previous end_tim is equal to the current start_tim then you have to merge the rows.

Regards
Michel
Re: How to combine rows that meet criteria in SQL [message #322228 is a reply to message #322080] Thu, 22 May 2008 18:46 Go to previous messageGo to next message
aegcp
Messages: 8
Registered: May 2008
Junior Member
Thank you.

I thought of getting the start time and end times merged when they were contiguous, and after that adding the conditions that the rows should only be merged when the usr_id = usr_id of the compared row and dep_id = dep_id of the compared row. Could you please have a look at this and tell me if I'm totally nuts by doing this or not?

SQL> SELECT
  2      usr_id,
  3      min_date as start_time,
  4      max_date as end_time,
  5      dep_id
  6    FROM
  7      (
  8        SELECT
  9          usr_id,
 10          start_time,
 11          end_time,
 12     dep_id,
 13          connect_by_isleaf isleaf,
 14          level levelof,
 15          TO_DATE( SUBSTR( SYS_CONNECT_BY_PATH( 
TO_CHAR( start_time, 'MMDDYYYY' ), '-' ), 2, 8 ), 'MMDDYYYY' ) min_date,
 16          ( CASE WHEN CONNECT_BY_ISLEAF = 1 THEN end_time END ) max_date
 17        FROM
 18          (
 19            SELECT
 20              next_id,
 21              usr_id,
 22              start_time,
 23              end_time,
 24              dep_id,
 25              ( CASE
 26                  WHEN start_time = LAG( end_time  ) 
OVER ( PARTITION BY usr_id ORDER BY end_time ) THEN
 27                    LAG( next_id ) 
OVER ( PARTITION BY usr_id ORDER BY end_time )
 28                END ) contiguous
 29            FROM
 30              (
 31                SELECT usr_id, rownum next_id, start_time, end_time, dep_id
 32                FROM test2
 33              )
 34          )
 35          START WITH contiguous IS NULL
 36          CONNECT BY PRIOR next_id = contiguous
 37      );

USR_ID                                   START_TIM END_TIME      DEP_ID
---------------------------------------- --------- --------- ----------
123                                      15-JUL-03                    1
123                                      15-JUL-03 17-JUL-03          1
548                                      15-JUL-03 16-JUL-03          2
549                                      17-JUL-03                    3
549                                      17-JUL-03 19-JUL-03          3
550                                      19-JUL-03 20-JUL-03          3
550                                      22-JUL-03 24-JUL-03          3

7 rows selected.


Of course I've got issues here, it is consolidating contiguous rows OK but it's leaving the rows with end date null. sigh. I used connect by for traversing and comparing the start_time=end time of the following row.

[Updated on: Thu, 22 May 2008 23:04] by Moderator

Report message to a moderator

Re: How to combine rows that meet criteria in SQL [message #322248 is a reply to message #322228] Thu, 22 May 2008 22:50 Go to previous messageGo to next message
aegcp
Messages: 8
Registered: May 2008
Junior Member
OK, I've refined my query:


SQL> select
  2     usr_id, min(start_time) start_time, max(end_time) end_time, dep_id
  3     from
  4     (
  5             select a.*, start_time - prev_end_time g
  6             from
  7             (
  8                     select start_time,lead(start_time) 
over (order by start_time) next_start_time,
  9                                lag(end_time,1,(start_time)) 
over (order by end_time) prev_end_time, end_time,usr_id, dep_id
 10                     from test2 order by end_time
 11             )
 12               a
 13     )
 14             group by usr_id, dep_id, g
 15  order by 1;

USR_ID                                   START_TIM END_TIME      DEP_ID
---------------------------------------- --------- --------- ----------
123                                      15-JUL-03 17-JUL-03          1
548                                      15-JUL-03 16-JUL-03          2
549                                      17-JUL-03 19-JUL-03          3
550                                      19-JUL-03 20-JUL-03          3
550                                      22-JUL-03 24-JUL-03          3



This SQL seems to be doing the job OK. BUT, If I add 3 rows more to the table

insert into test2 (usr_id, start_time, end_time, dep_id)
 values (600, to_date('2003-07-30', 'yyyy-mm-dd'), to_date('2003-07-31', 'yyyy-mm-dd'), 3);
insert into test2 (usr_id, start_time, end_time, dep_id)
 values (600, to_date('2003-07-27', 'yyyy-mm-dd'), to_date('2003-07-28', 'yyyy-mm-dd'), 3);
insert into test2 (usr_id, start_time, end_time, dep_id)
 values (600, to_date('2003-07-29', 'yyyy-mm-dd'), to_date('2003-07-30', 'yyyy-mm-dd'), 3);


Only 2 of these rows should consolidate, 2003-07-29 to 2003-07-31 (start date equals to end date and usr_id and dep_id are the same).

BUT, if I run the query I get:


After executing my query, I get:

SQL> select
  2     usr_id, min(start_time) start_time, max(end_time) end_time, dep_id
  3     from
  4     (
  5             select a.*, start_time - prev_end_time g
  6             from
  7             (
  8                     select start_time,lead(start_time) 
over (order by start_time) next_start_time,
  9                                lag(end_time,1,(start_time)) 
over (order by end_time) prev_end_time, end_time,usr_id, dep_id
 10                     from test2 order by end_time
 11             )
 12               a
 13     )
 14             group by usr_id, dep_id, g
 15  order by 1;

USR_ID                                   START_TIM END_TIME      DEP_ID
---------------------------------------- --------- --------- ----------
123                                      15-JUL-03 17-JUL-03          1
548                                      15-JUL-03 16-JUL-03          2
549                                      17-JUL-03 19-JUL-03          3
550                                      19-JUL-03 20-JUL-03          3
550                                      22-JUL-03 24-JUL-03          3
600                                      30-JUL-03 31-JUL-03          3
600                                      29-JUL-03 30-JUL-03          3
600                                      27-JUL-03 28-JUL-03          3

8 rows selected.


What am I doing wrong? It didn't merge the rows (27 Jul - 31 july)

[Updated on: Thu, 22 May 2008 23:03] by Moderator

Report message to a moderator

Re: How to combine rows that meet criteria in SQL [message #322316 is a reply to message #322248] Fri, 23 May 2008 01:42 Go to previous messageGo to next message
aegcp
Messages: 8
Registered: May 2008
Junior Member
sorry, typo, the sql should've merged the rows

USR_ID                                   START_TIM END_TIME      DEP_ID
---------------------------------------- --------- --------- ----------
600                                      30-JUL-03 31-JUL-03          
600                                      29-JUL-03 30-JUL-03         

Re: How to combine rows that meet criteria in SQL [message #322325 is a reply to message #322248] Fri, 23 May 2008 02:00 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
In the link I posted, there is also a link to the explaination how it works, did you follow it?
Too long to explain it again, here's a solution, execute each to see how it works:
SQL> select * from test2 order by usr_id, dep_id, start_time;
USR_ID                                   START_TIME END_TIME       DEP_ID
---------------------------------------- ---------- ---------- ----------
123                                      15/07/2003 16/07/2003          1
123                                      16/07/2003 17/07/2003          1
548                                      15/07/2003 16/07/2003          2
549                                      17/07/2003 18/07/2003          3
549                                      18/07/2003 19/07/2003          3
550                                      19/07/2003 20/07/2003          3
550                                      22/07/2003 24/07/2003          3
600                                      25/07/2003 26/07/2003          3
600                                      26/07/2003 27/07/2003          3
600                                      27/07/2003 28/07/2003          3
600                                      29/07/2003 30/07/2003          3
600                                      30/07/2003 31/07/2003          3

12 rows selected.

SQL> with
  2    data as (
  3      select usr_id, start_time, end_time, dep_id,
  4             case 
  5               when lag(end_time,1,sysdate)
  6                      over (partition by usr_id, dep_id order by start_time)
  7                    != start_time
  8                 then row_number () over (order by start_time)
  9             end flag
 10      from test2
 11    ),
 12    grouping as (
 13      select usr_id, start_time, end_time, dep_id,
 14             max(flag) over (partition by usr_id, dep_id order by start_time) grp
 15    from data
 16    )
 17  select usr_id, min(start_time) start_time, max(end_time) end_time, dep_id
 18  from grouping
 19  group by usr_id, dep_id, grp
 20  order by usr_id, dep_id, start_time
 21  /
USR_ID                                   START_TIME END_TIME       DEP_ID
---------------------------------------- ---------- ---------- ----------
123                                      15/07/2003 17/07/2003          1
548                                      15/07/2003 16/07/2003          2
549                                      17/07/2003 19/07/2003          3
550                                      19/07/2003 20/07/2003          3
550                                      22/07/2003 24/07/2003          3
600                                      25/07/2003 28/07/2003          3
600                                      29/07/2003 31/07/2003          3

7 rows selected.

Regards
Michel
Re: How to combine rows that meet criteria in SQL [message #322326 is a reply to message #322248] Fri, 23 May 2008 02:00 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
In the link I posted, there is also a link to the explaination how it works, did you follow it?
Too long to explain it again, here's a solution, execute each step to see how it works:
SQL> select * from test2 order by usr_id, dep_id, start_time;
USR_ID                                   START_TIME END_TIME       DEP_ID
---------------------------------------- ---------- ---------- ----------
123                                      15/07/2003 16/07/2003          1
123                                      16/07/2003 17/07/2003          1
548                                      15/07/2003 16/07/2003          2
549                                      17/07/2003 18/07/2003          3
549                                      18/07/2003 19/07/2003          3
550                                      19/07/2003 20/07/2003          3
550                                      22/07/2003 24/07/2003          3
600                                      25/07/2003 26/07/2003          3
600                                      26/07/2003 27/07/2003          3
600                                      27/07/2003 28/07/2003          3
600                                      29/07/2003 30/07/2003          3
600                                      30/07/2003 31/07/2003          3

12 rows selected.

SQL> with
  2    data as (
  3      select usr_id, start_time, end_time, dep_id,
  4             case 
  5               when lag(end_time,1,sysdate)
  6                      over (partition by usr_id, dep_id order by start_time)
  7                    != start_time
  8                 then row_number () over (order by start_time)
  9             end flag
 10      from test2
 11    ),
 12    grouping as (
 13      select usr_id, start_time, end_time, dep_id,
 14             max(flag) over (partition by usr_id, dep_id order by start_time) grp
 15    from data
 16    )
 17  select usr_id, min(start_time) start_time, max(end_time) end_time, dep_id
 18  from grouping
 19  group by usr_id, dep_id, grp
 20  order by usr_id, dep_id, start_time
 21  /
USR_ID                                   START_TIME END_TIME       DEP_ID
---------------------------------------- ---------- ---------- ----------
123                                      15/07/2003 17/07/2003          1
548                                      15/07/2003 16/07/2003          2
549                                      17/07/2003 19/07/2003          3
550                                      19/07/2003 20/07/2003          3
550                                      22/07/2003 24/07/2003          3
600                                      25/07/2003 28/07/2003          3
600                                      29/07/2003 31/07/2003          3

7 rows selected.

Regards
Michel

[Edit: Add a missing word]

[Updated on: Fri, 23 May 2008 10:40]

Report message to a moderator

Re: How to combine rows that meet criteria in SQL [message #322441 is a reply to message #322326] Fri, 23 May 2008 09:42 Go to previous messageGo to next message
harrysmall3
Messages: 109
Registered: April 2008
Location: Massachusetts
Senior Member
Good mornings,

Wow! I started working on this last nite and come back to
this far more sophisticated approach than my
construction below which I am not sure how solid it is beyond
the current scope.


SELECT t.usr_id,
       t.start_time,
       t.end_time,
       t.dep_id
FROM   test2 t
WHERE  NOT EXISTS (
          SELECT 1
          FROM   (SELECT a.usr_id,
                         a.start_time,
                         b.end_time,
                         a.dep_id
                  FROM   test2 a,
                         test2 b
                  WHERE  a.end_time = b.start_time
                  AND    a.usr_id = b.usr_id
                  AND    a.dep_id = b.dep_id) l
          WHERE  t.usr_id = l.usr_id
          AND    t.dep_id = l.dep_id
          AND    (   t.start_time = l.start_time
                  OR t.end_time = l.end_time))
UNION
SELECT *
FROM   (SELECT a.usr_id,
               a.start_time,
               b.end_time,
               a.dep_id
        FROM   test2 a,
               test2 b
        WHERE  a.end_time = b.start_time
        AND    a.usr_id = b.usr_id
        AND    a.dep_id = b.dep_id) linked

 

 
USR_ID	START_TIME	         END_TIME	    DEP_ID
123	07/15/2003 00:00:00	07/17/2003 00:00:00	1
548	07/15/2003 00:00:00	07/16/2003 00:00:00	2
549	07/17/2003 00:00:00	07/19/2003 00:00:00	3
550	07/19/2003 00:00:00	07/20/2003 00:00:00	3
550	07/22/2003 00:00:00	07/24/2003 00:00:00	3
600	07/25/2003 00:00:00	07/27/2003 00:00:00	3
600	07/26/2003 00:00:00	07/28/2003 00:00:00	3


Best Regards,
Harry

Re: How to combine rows that meet criteria in SQL [message #323078 is a reply to message #322441] Tue, 27 May 2008 04:33 Go to previous message
aegcp
Messages: 8
Registered: May 2008
Junior Member
Thank you guys.
Previous Topic: To roll back the updated records
Next Topic: Count characters
Goto Forum:
  


Current Time: Fri Dec 09 23:02:27 CST 2016

Total time taken to generate the page: 0.08857 seconds