Home » SQL & PL/SQL » SQL & PL/SQL » grouping rows based on values
grouping rows based on values [message #384202] Mon, 02 February 2009 14:01 Go to next message
sherifbasha
Messages: 11
Registered: February 2007
Location: Chennai,India
Junior Member

hi can anybody please help i have an unusual requirement.
i am trying to get a result from a table given below
seq               month            id        src
1              28-OCT-09           384         0
2              08-DEC-09           388         0
3              30-NOV-10           389         0
4              01-JAN-61           396         1
5              01-JAN-62           397         0
6              30-NOV-04           610         0
8              30-NOV-03           611         1

I am trying to get an output based on src and seq column.
The output should be 4 rows in this case i mean grouping first 3 rows(column src value 0), then 1 row(column src value 1) and so on
output
28-oct-09 to 30-NOV-10             /**1st and 3rd row in src*/
01-JAN-61                            /**4 th row  in 1 
01-JAN-62 to 30-NOV-04              /**5th to 6 th value
30-NOV-03                           /** 7 th row*?

the src column values could be in any order of 0 and 1 but identified by seq.
Is it possible to do in query.
Re: grouping rows based on values [message #384208 is a reply to message #384202] Mon, 02 February 2009 14:51 Go to previous messageGo to next message
Michel Cadot
Messages: 64152
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Post a Test case: create table and insert statements along with the result you want with these data.
Also always post your Oracle version (4 decimals).

In the end, post what you already tried.

Regards
Michel
Re: grouping rows based on values [message #384210 is a reply to message #384202] Mon, 02 February 2009 18:44 Go to previous messageGo to next message
st33chen
Messages: 11
Registered: January 2009
Junior Member
hi,

how about :

select decode(minmon, maxmon, to_char(minmon,'DD-MON-YY'), to_char(minmon,'DD-MON-YY') || ' to ' || to_char(maxmon,'DD-MON-YY')) outputfield
from (
select min(month) minmon, max(month) maxmon
from (select * from table1 order by seq)
group by src
)

(please try by yourself to condense this SQL statement to use fewer SELECT)
Re: grouping rows based on values [message #384218 is a reply to message #384210] Mon, 02 February 2009 21:50 Go to previous messageGo to next message
Kevin Meade
Messages: 2101
Registered: December 1999
Location: Connecticut USA
Senior Member
Seems to me this problem has at least one solution with use of Oracle Analytics. Do some googling if you do not know analytics. Without an understanding of them, what you are about to see you will not comprehend. That said, here is some sql that shows you what you need to do.

Here is some data:

SQL> select *
  2  from temp1
  3  /

         A          B
---------- ----------
         1          0
         2          0
         3          0
         4          1
         5          0
         6          0
         7          1

7 rows selected.


Using analytics we can determine when a change in your grouping column occurs:

SQL> select a,b
  2        ,decode(b,nvl(lag(b) over (order by a),b),null,1) change
  3  from temp1
  4  order by a
  5  /

         A          B C
---------- ---------- -
         1          0
         2          0
         3          0
         4          1 1
         5          0 1
         6          0
         7          1 1

7 rows selected.


Given that a change is indicated by a 1 and nochange is null, we can use the analytic sum to create a group id that puts each row into a group based on when the change occurs:

SQL> select a,b
  2        ,nvl(sum(change) over (order by a),0)+1 rowset
  3  from (
  4          select a,b
  5                ,decode(b,nvl(lag(b) over (order by a),b),null,1) change
  6          from temp1
  7          order by a
  8       )
  9  /

         A          B     ROWSET
---------- ---------- ----------
         1          0          1
         2          0          1
         3          0          1
         4          1          2
         5          0          3
         6          0          3
         7          1          4

7 rows selected.


Now use analytic min/max to get the first and last values of our other column (guess you could have used first/last too)

SQL> select a,b
  2        ,min(a) over(partition by rowset order by a) min_a
  3        ,max(a) over(partition by rowset order by a ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) max_a
  4  from (
  5         select a,b
  6               ,nvl(sum(change) over (order by a),0)+1 rowset
  7         from (
  8                 select a,b
  9                       ,decode(b,nvl(lag(b) over (order by a),b),null,1) change
 10                 from temp1
 11                 order by a
 12              )
 13       )
 14  /

         A          B      MIN_A      MAX_A
---------- ---------- ---------- ----------
         1          0          1          3
         2          0          1          3
         3          0          1          3
         4          1          4          4
         5          0          5          6
         6          0          5          6
         7          1          7          7

7 rows selected.


This may not be the most efficient method of doing this. Since we have to do three different analytics, we have to pass the data three times with three different sorts. It might pay to have a go with PIPELINED FUNCTIONS as an alternative. But this gives you a single select if that is what you want.

Good luck, Kevin
Re: grouping rows based on values [message #384240 is a reply to message #384202] Tue, 03 February 2009 00:22 Go to previous messageGo to next message
idris.ali
Messages: 34
Registered: June 2008
Location: Hyderabad
Member


SQL> CREATE TABLE tempABC  AS SELECT ROWNUM a,0 b FROM scott.emp WHERE ROWNUM < 8
  2  /

Table created.

SQL> UPDATE tempABC SET b=1 WHERE a IN (4,7)
  2  /

2 rows updated.

SQL> SELECT a,b, Sum(b) over (ORDER BY a) c FROM tempABC
  2  /

         A          B          C
---------- ---------- ----------
         1          0          0
         2          0          0
         3          0          0
         4          1          1
         5          0          1
         6          0          1
         7          1          2

7 rows selected.


From the select query , the combination of B and C
is always uniqe for a set of consecutive zeros or ones
.So we may utilize this to come up with the final query

SELECT   a,
         b,
         MIN(a)
           OVER(PARTITION BY c ORDER BY a)                                                  min_a,
         MAX(a)
           OVER(PARTITION BY c ORDER BY a ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) max_a
FROM     (SELECT   a,
                   b,
                   b
                   ||SUM(b)
                       OVER(ORDER BY a) c
          FROM     tempabc
          ORDER BY a) tmp
ORDER BY a 

         A          B      MIN_A      MAX_A
---------- ---------- ---------- ----------
         1          0          1          3
         2          0          1          3
         3          0          1          3
         4          1          4          4
         5          0          5          6
         6          0          5          6
         7          1          7          7

7 rows selected.

Re: grouping rows based on values [message #384259 is a reply to message #384202] Tue, 03 February 2009 00:52 Go to previous messageGo to next message
Michel Cadot
Messages: 64152
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
I think this kind of problem as follow:
1/ determine the order of the rows to make the groups
2/ determine the condition in the row that limits the groups
3/ number each group
4/ apply the required function(s) on each group

If we take idris.ali's example and find the first and last A for each group:
1/ order is by A
2/ condition to change group is when B changes
3/ when B change put the row number, this will be our group number
3b/ propagate this group number in all rows of the group
4/ functions on each group are min and max on A
4b/ format the result in pretty way

Steps 1 to 3, order, find group limit (first element) and number it:
SQL> SELECT a, b,
  2         CASE WHEN 
  3                LAG(b) OVER (ORDER BY a) IS NULL OR
  4                LAG(b) OVER (ORDER BY a) != b
  5              THEN ROW_NUMBER() OVER (ORDER BY a)
  6         END first_element_of_group
  7  FROM tempABC
  8  /
         A          B FIRST_ELEMENT_OF_GROUP
---------- ---------- ----------------------
         1          0                      1
         2          0
         3          0
         4          1                      4
         5          0                      5
         6          0
         7          1                      7

7 rows selected.

(I could use NVL instead of repeating LAG function but it would less clear.)

Step 3b, propagate group number in each row: use MAX analytical function with default window that is all rows before the current one and including it:
SQL> WITH
  2    step1to3 AS (
  3      SELECT a, b,
  4             CASE WHEN 
  5                    LAG(b) OVER (ORDER BY a) IS NULL OR
  6                    LAG(b) OVER (ORDER BY a) != b
  7                  THEN ROW_NUMBER() OVER (ORDER BY a)
  8             END first_element_of_group
  9      FROM tempABC
 10    )
 11  SELECT a, b, 
 12         MAX(first_element_of_group) OVER (ORDER BY a) group_number
 13  FROM step1to3
 14  /
         A          B GROUP_NUMBER
---------- ---------- ------------
         1          0            1
         2          0            1
         3          0            1
         4          1            4
         5          0            5
         6          0            5
         7          1            7

7 rows selected.


Step 4, apply the appropriate function for the requirements:
SQL> WITH
  2    step1to3 AS (
  3      SELECT a, b,
  4             CASE WHEN 
  5                    LAG(b) OVER (ORDER BY a) IS NULL OR
  6                    LAG(b) OVER (ORDER BY a) != b
  7                  THEN ROW_NUMBER() OVER (ORDER BY a)
  8             END first_element_of_group
  9      FROM tempABC
 10    ),
 11    step3b as (
 12      SELECT a, b, 
 13             MAX(first_element_of_group) OVER (ORDER BY a) group_number
 14      FROM step1to3
 15    )
 16  SELECT group_number, MIN(a) first_row, MAX(a) last_row
 17  FROM step3b
 18  GROUP BY group_number
 19  ORDER BY 1
 20  /
GROUP_NUMBER  FIRST_ROW   LAST_ROW
------------ ---------- ----------
           1          1          3
           4          4          4
           5          5          6
           7          7          7

4 rows selected.


Step 4b, format in pretty way:
SQL> WITH
  2    step1to3 AS (
  3      SELECT a, b,
  4             CASE WHEN 
  5                    LAG(b) OVER (ORDER BY a) IS NULL OR
  6                    LAG(b) OVER (ORDER BY a) != b
  7                  THEN ROW_NUMBER() OVER (ORDER BY a)
  8             END first_element_of_group
  9      FROM tempABC
 10    ),
 11    step3b as (
 12      SELECT a, b, 
 13             MAX(first_element_of_group) OVER (ORDER BY a) group_number
 14      FROM step1to3
 15    )
 16  SELECT MIN(a)||DECODE(MIN(a),MAX(a),'','->'||MAX(a)) range
 17  FROM step3b
 18  GROUP BY group_number
 19  ORDER BY MIN(a)
 20  /
RANGE
---------------------------------------------------------------------------
1->3
4
5->6
7

4 rows selected.

Regards
Michel

[Updated on: Tue, 03 February 2009 00:53]

Report message to a moderator

Re: grouping rows based on values [message #384390 is a reply to message #384259] Tue, 03 February 2009 10:21 Go to previous messageGo to next message
S.Rajaram
Messages: 1027
Registered: October 2006
Location: United Kingdom
Senior Member
Quote:

(I could use NVL instead of repeating LAG function but it would less clear.)


It could be re-written in this way as well but your format is more clear what you are trying to do. Assuming column "b" is number.
LAG(b,1,(b-1)) OVER (ORDER BY a) != b

Another minor addition. Instead of a Row_number you could use rownum. Possibly we could save another milli or micro second. Smile

Regards

Raj
Re: grouping rows based on values [message #384394 is a reply to message #384390] Tue, 03 February 2009 10:50 Go to previous messageGo to next message
Michel Cadot
Messages: 64152
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
I agree with your first remark although at first read we could ask "why (b-1)?".
But for the second one, rownum is not completly safe (I remember having this discussion with Tom Kyte a couple of years ago... I was in your side, he said to me something like "if you really want to use rownum") and it does not really add time as there is the same order for LAG and ROW_NUMBER, so the result set is already sorted, and so there is no real operation.

Regards
Michel
Re: grouping rows based on values [message #384398 is a reply to message #384394] Tue, 03 February 2009 11:04 Go to previous messageGo to next message
S.Rajaram
Messages: 1027
Registered: October 2006
Location: United Kingdom
Senior Member
Quote:

I was in your side, he said to me something like "if you really want to use rownum") and it does not really add time as there is the same order for LAG and ROW_NUMBER, so the result set is already sorted, and so there is no real operation.


Exactly this is the reason I mentioned we could save a milli second or a micro second because the ordering is the same for these analytics function (Lag and Row_number). Since rownum being a pseudo column I would expect it to be calculated internally irrespective of the fact we select it, but with the row_number being an analytic function oracle might need to make an addition function call (again I could be wrong) because I am not sure how it is been optimised internally.

But the question of using rownum is not completely safe I am not sure what you mean by that ?

Also, I vaguely remember the original post in his asktom site where his very first solution for sliding window technique uses rownum but the article posted in oracle website uses row_number. I have used his solution many times (using rownum) and never ran into any issues. But I have not tried the same using row_number to compare the performance benefits (I doubt there will be a significant difference).

Regards

Raj

[Edit: ] Spelling mistakes

[Updated on: Tue, 03 February 2009 11:06]

Report message to a moderator

icon14.gif  Re: grouping rows based on values [message #386473 is a reply to message #384202] Sun, 15 February 2009 08:44 Go to previous message
sherifbasha
Messages: 11
Registered: February 2007
Location: Chennai,India
Junior Member

Thank you very much all experts, your solutions are awesome .Very much delighted to be in this forum.This has solved many of my problems. Surprised
Previous Topic: how to find TOP 5 employee record
Next Topic: alternatives for utlmatch?
Goto Forum:
  


Current Time: Sat Dec 10 01:30:35 CST 2016

Total time taken to generate the page: 0.15917 seconds