Home » SQL & PL/SQL » SQL & PL/SQL » Analytical grouping of data (Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production)
Analytical grouping of data [message #651261] |
Fri, 13 May 2016 12:08 |
|
vduhan
Messages: 2 Registered: May 2016
|
Junior Member |
|
|
Hi Gurus,
I'm trying to group data based on when the value changes. There are 3 columns (mydate, key and group). Requirement is to find out how many consecutive days the value has been stale for a particular key before it changes.
Sample data:
WITH tmp AS (
SELECT to_date('02MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
SELECT to_date('03MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
SELECT to_date('04MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
SELECT to_date('05MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
SELECT to_date('06MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
SELECT to_date('09MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
SELECT to_date('10MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
SELECT to_date('11MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.02 AS value from dual UNION ALL
SELECT to_date('12MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.02 AS value from dual UNION ALL
SELECT to_date('13MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
SELECT to_date('02MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
SELECT to_date('03MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
SELECT to_date('04MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
SELECT to_date('05MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
SELECT to_date('06MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
SELECT to_date('09MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
SELECT to_date('10MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.85 AS value from dual UNION ALL
SELECT to_date('11MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
SELECT to_date('12MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.85 AS value from dual UNION ALL
SELECT to_date('13MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual
ORDER BY key,mydate, value)
SELECT * FROM tmp;
The data is already ordered based on key, mydate and value. The group changes (increments) whenever there is a change in the key/value pair.
In the above example, for "key" = 158353, the value (0.01) is stale from date 02-May to 10-May, so group for that should be, say G1.
On 11-May for "key" = 158353, the value changes from (0.01) to (0.02) and remains stale from date 11-May to 12-May, so group for that should be, say G2.
On 13-May for "key" = 158353, the value again changes from (0.02) back to (0.01) and remains stale only for a day, so group for that should be, say G3... and so on.
I was able to identify when exactly the key/value pair changes using LAG/LEAD functions but having trouble naming/assigning a group of all the items and assign proper group buckets to the same.
Expected output:
MYDATE | KEY | VALUE | GROUP
02/05/2016 | 158353 | 0.01 | G1
03/05/2016 | 158353 | 0.01 | G1
04/05/2016 | 158353 | 0.01 | G1
05/05/2016 | 158353 | 0.01 | G1
06/05/2016 | 158353 | 0.01 | G1
09/05/2016 | 158353 | 0.01 | G1
10/05/2016 | 158353 | 0.01 | G1
11/05/2016 | 158353 | 0.02 | G2
12/05/2016 | 158353 | 0.02 | G2
13/05/2016 | 158353 | 0.01 | G3
02/05/2016 | 333333 | 1.7 | G4
03/05/2016 | 333333 | 1.7 | G4
04/05/2016 | 333333 | 2 | G5
05/05/2016 | 333333 | 2 | G5
06/05/2016 | 333333 | 2 | G5
09/05/2016 | 333333 | 2 | G5
10/05/2016 | 333333 | 1.85 | G6
11/05/2016 | 333333 | 1.7 | G7
12/05/2016 | 333333 | 1.85 | G8
13/05/2016 | 333333 | 1.7 | G9
Really eager to see if there are any good ideas on how to implement this.
[Updated on: Fri, 13 May 2016 12:27] Report message to a moderator
|
|
|
Re: Analytical grouping of data [message #651262 is a reply to message #651261] |
Fri, 13 May 2016 12:30 |
|
Michel Cadot
Messages: 68644 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
Thanks for the test case.
Here are 2 solutions depending if you want to start group number at 1 at each key or have unique group numbers:
SQL> col grp format a3
SQL> break on key dup skip 1
SQL> WITH tmp AS (
2 SELECT to_date('02MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
3 SELECT to_date('03MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
4 SELECT to_date('04MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
5 SELECT to_date('05MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
6 SELECT to_date('06MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
7 SELECT to_date('09MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
8 SELECT to_date('10MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
9 SELECT to_date('11MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.02 AS value from dual UNION ALL
10 SELECT to_date('12MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.02 AS value from dual UNION ALL
11 SELECT to_date('13MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
12 SELECT to_date('02MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
13 SELECT to_date('03MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
14 SELECT to_date('04MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
15 SELECT to_date('05MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
16 SELECT to_date('06MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
17 SELECT to_date('09MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
18 SELECT to_date('10MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.85 AS value from dual UNION ALL
19 SELECT to_date('11MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
20 SELECT to_date('12MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.85 AS value from dual UNION ALL
21 SELECT to_date('13MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual
22 ),
23 data as (
24 select tmp.*,
25 case when lag(value) over (partition by key order by mydate) != value then 1 end nb
26 from tmp
27 )
28 select key, mydate, value,
29 'G'||to_char(count(nb) over (partition by key order by mydate)+1) grp
30 from data
31 ORDER BY key, mydate
32 /
KEY MYDATE VALUE GRP
---------- ----------- ---------- ---
158353 02-MAY-2016 .01 G1
158353 03-MAY-2016 .01 G1
158353 04-MAY-2016 .01 G1
158353 05-MAY-2016 .01 G1
158353 06-MAY-2016 .01 G1
158353 09-MAY-2016 .01 G1
158353 10-MAY-2016 .01 G1
158353 11-MAY-2016 .02 G2
158353 12-MAY-2016 .02 G2
158353 13-MAY-2016 .01 G3
333333 02-MAY-2016 1.7 G1
333333 03-MAY-2016 1.7 G1
333333 04-MAY-2016 2 G2
333333 05-MAY-2016 2 G2
333333 06-MAY-2016 2 G2
333333 09-MAY-2016 2 G2
333333 10-MAY-2016 1.85 G3
333333 11-MAY-2016 1.7 G4
333333 12-MAY-2016 1.85 G5
333333 13-MAY-2016 1.7 G6
20 rows selected.
SQL> WITH tmp AS (
2 SELECT to_date('02MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
3 SELECT to_date('03MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
4 SELECT to_date('04MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
5 SELECT to_date('05MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
6 SELECT to_date('06MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
7 SELECT to_date('09MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
8 SELECT to_date('10MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
9 SELECT to_date('11MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.02 AS value from dual UNION ALL
10 SELECT to_date('12MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.02 AS value from dual UNION ALL
11 SELECT to_date('13MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
12 SELECT to_date('02MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
13 SELECT to_date('03MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
14 SELECT to_date('04MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
15 SELECT to_date('05MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
16 SELECT to_date('06MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
17 SELECT to_date('09MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
18 SELECT to_date('10MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.85 AS value from dual UNION ALL
19 SELECT to_date('11MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
20 SELECT to_date('12MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.85 AS value from dual UNION ALL
21 SELECT to_date('13MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual
22 ),
23 data as (
24 select tmp.*,
25 case
26 when lag(key) over (order by key, mydate) != key
27 or lag(value) over (order by key, mydate) != value
28 then 1
29 end nb
30 from tmp
31 )
32 select key, mydate, value,
33 'G'||to_char(count(nb) over (order by key, mydate)+1) grp
34 from data
35 ORDER BY key, mydate
36 /
KEY MYDATE VALUE GRP
---------- ----------- ---------- ---
158353 02-MAY-2016 .01 G1
158353 03-MAY-2016 .01 G1
158353 04-MAY-2016 .01 G1
158353 05-MAY-2016 .01 G1
158353 06-MAY-2016 .01 G1
158353 09-MAY-2016 .01 G1
158353 10-MAY-2016 .01 G1
158353 11-MAY-2016 .02 G2
158353 12-MAY-2016 .02 G2
158353 13-MAY-2016 .01 G3
333333 02-MAY-2016 1.7 G4
333333 03-MAY-2016 1.7 G4
333333 04-MAY-2016 2 G5
333333 05-MAY-2016 2 G5
333333 06-MAY-2016 2 G5
333333 09-MAY-2016 2 G5
333333 10-MAY-2016 1.85 G6
333333 11-MAY-2016 1.7 G7
333333 12-MAY-2016 1.85 G8
333333 13-MAY-2016 1.7 G9
20 rows selected.
|
|
|
|
|
Re: Analytical grouping of data [message #651265 is a reply to message #651264] |
Fri, 13 May 2016 13:05 |
|
Michel Cadot
Messages: 68644 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
You can avoid the "+1" in the last formula using all the parameters of LAG function (assuming key and value are not NULL otherwise the use of NVL function should be added, this is also true for the previous queries):
SQL> WITH tmp AS (
2 SELECT to_date('02MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
3 SELECT to_date('03MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
4 SELECT to_date('04MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
5 SELECT to_date('05MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
6 SELECT to_date('06MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
7 SELECT to_date('09MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
8 SELECT to_date('10MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
9 SELECT to_date('11MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.02 AS value from dual UNION ALL
10 SELECT to_date('12MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.02 AS value from dual UNION ALL
11 SELECT to_date('13MAY2016','DDMONYYYY') as mydate, 158353 AS key, 0.01 AS value from dual UNION ALL
12 SELECT to_date('02MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
13 SELECT to_date('03MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
14 SELECT to_date('04MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
15 SELECT to_date('05MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
16 SELECT to_date('06MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
17 SELECT to_date('09MAY2016','DDMONYYYY') as mydate, 333333 AS key, 2.00 AS value from dual UNION ALL
18 SELECT to_date('10MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.85 AS value from dual UNION ALL
19 SELECT to_date('11MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual UNION ALL
20 SELECT to_date('12MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.85 AS value from dual UNION ALL
21 SELECT to_date('13MAY2016','DDMONYYYY') as mydate, 333333 AS key, 1.70 AS value from dual
22 ),
23 data as (
24 select tmp.*,
25 case
26 when lag(key,1,key-1) over (order by key, mydate) != key
27 or lag(value,1,value-1) over (order by key, mydate) != value
28 then 1
29 end flag
30 from tmp
31 )
32 select key, mydate, value,
33 'G'||to_char(count(flag) over (order by key, mydate)) grp
34 from data
35 ORDER BY key, mydate
36 /
KEY MYDATE VALUE GRP
---------- ----------- ---------- ---
158353 02-MAY-2016 .01 G1
158353 03-MAY-2016 .01 G1
158353 04-MAY-2016 .01 G1
158353 05-MAY-2016 .01 G1
158353 06-MAY-2016 .01 G1
158353 09-MAY-2016 .01 G1
158353 10-MAY-2016 .01 G1
158353 11-MAY-2016 .02 G2
158353 12-MAY-2016 .02 G2
158353 13-MAY-2016 .01 G3
333333 02-MAY-2016 1.7 G4
333333 03-MAY-2016 1.7 G4
333333 04-MAY-2016 2 G5
333333 05-MAY-2016 2 G5
333333 06-MAY-2016 2 G5
333333 09-MAY-2016 2 G5
333333 10-MAY-2016 1.85 G6
333333 11-MAY-2016 1.7 G7
333333 12-MAY-2016 1.85 G8
333333 13-MAY-2016 1.7 G9
|
|
|
Goto Forum:
Current Time: Wed Apr 24 05:21:23 CDT 2024
|