Home » SQL & PL/SQL » SQL & PL/SQL » Data Chucking based on Timewindow
Data Chucking based on Timewindow [message #647723] |
Sat, 06 February 2016 06:31 |
|
Hi Experts,
I have a requirement where I need to divide source data into time window and count the number of records falling into that window to categorise accordingly
Example:
My source data
ID REP_ID Datetime
1 101 01/01/2016 23.01.01
2 101 01/01/2016 23.01.02
3 102 01/01/2016 23.01.40
4 103 01/01/2016 23.02.04
5 104 01/01/2016 23.02.05
6 105 01/01/2016 23.02.06
7 106 01/01/2016 23.04.00
8 107 01/01/2016 23.04.10
9 108 01/01/2016 23.05.09
10 109 01/01/2016 23.05.20
CREATE table time_window (
ID number,
REP_ID number,
Datetime timestamp
);
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,101,TO_DATE('01/01/2016 23:01:01','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,101,TO_DATE('01/01/2016 23:01:02','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,102,TO_DATE('01/01/2016 23:01:40','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,103,TO_DATE('01/01/2016 23:02:04','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,104,TO_DATE('01/01/2016 23:02:05','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,105,TO_DATE('01/01/2016 23:02:06','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,106,TO_DATE('01/01/2016 23:04:00','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,107,TO_DATE('01/01/2016 23:04:10','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,108,TO_DATE('01/01/2016 23:05:09','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,109,TO_DATE('01/01/2016 23:05:20','DD/MM/YYYY HH24:MI:SS'));
Commit;
User will enter window time let's say they had given 2 minutes
If within 2 minutes window the count of unique REP_IDs is 3 then I have to flag it as 12.1 if its more than 3 then 12.2
Result out put expected is
ID REP_ID Datetime Flag
1 101 01/01/2016 23.01.01 12.2
2 101 01/01/2016 23.01.02 12.2
3 102 01/01/2016 23.01.40 12.2
4 103 01/01/2016 23.02.04 12.2
5 104 01/01/2016 23.02.05 12.2
6 105 01/01/2016 23.02.06 12.2
7 106 01/01/2016 23.04.00 12.1
8 107 01/01/2016 23.04.10 12.1
9 108 01/01/2016 23.05.09 12.1
I had tried few different ways but nothing really working out appreciates if anyone can help get this solution.
Thank you
|
|
|
|
|
Re: Data Chucking based on Timewindow [message #647738 is a reply to message #647725] |
Sat, 06 February 2016 12:43 |
|
i had used below function to add the time window - but that really did not work
TO_CHAR(POWER_OUTAGE_END_TIME + <TIME_WINDOW_SEC>/(24*60),'mm/dd/yyyy HH:SS')
- its just adding that time window to my data field where what i want is multiple windows
Example:
as soon as i read my first record the time window need to be appended and i need to collect all the records between first record time and first appended record time - this i can categorise as first window
second window should start from the point the append time is equal to source data datetime field - so the start time for second window would be first appended record time and end time would be first appended record time + time window
....
This need to be carried out through the entire data set.
kind of loop.
I am not sure if i had put this clearly or not. Please let me know if you need any further details
Thank you
|
|
|
|
|
Re: Data Chucking based on Timewindow [message #647743 is a reply to message #647723] |
Sat, 06 February 2016 16:33 |
|
Barbara Boehmer
Messages: 9077 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
The following uses DATE data type in the table and expresses the interval in seconds. I have broken the query down one step at a time for easier understanding, but all you need is the final query at the bottom.
-- test table:
SCOTT@orcl> DESC time_window
Name Null? Type
----------------------------------------- -------- ----------------------------
ID NUMBER
REP_ID NUMBER
DATETIME DATE
-- test data:
SCOTT@orcl> SELECT * FROM time_window ORDER BY datetime
2 /
ID REP_ID DATETIME
---------- ---------- ---------------
1 101 Fri 01-Jan-2016
2 101 Fri 01-Jan-2016
3 102 Fri 01-Jan-2016
4 103 Fri 01-Jan-2016
5 104 Fri 01-Jan-2016
6 105 Fri 01-Jan-2016
7 106 Fri 01-Jan-2016
8 107 Fri 01-Jan-2016
9 108 Fri 01-Jan-2016
10 109 Fri 01-Jan-2016
10 rows selected.
-- interval in seconds:
SCOTT@orcl> VARIABLE p_seconds NUMBER
SCOTT@orcl> EXEC :p_seconds := 120
PL/SQL procedure successfully completed.
-- time windows:
SCOTT@orcl> SELECT tw1.id, tw1.rep_id, tw1.datetime,
2 FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds) windows
3 FROM time_window tw1,
4 (SELECT MIN (datetime) mindatetime
5 FROM time_window) tw2
6 ORDER BY datetime
7 /
ID REP_ID DATETIME WINDOWS
---------- ---------- --------------- ----------
1 101 Fri 01-Jan-2016 0
2 101 Fri 01-Jan-2016 0
3 102 Fri 01-Jan-2016 0
4 103 Fri 01-Jan-2016 0
5 104 Fri 01-Jan-2016 0
6 105 Fri 01-Jan-2016 0
7 106 Fri 01-Jan-2016 1
8 107 Fri 01-Jan-2016 1
9 108 Fri 01-Jan-2016 2
10 109 Fri 01-Jan-2016 2
10 rows selected.
-- count per window:
SCOTT@orcl> SELECT tw1.id, tw1.rep_id, tw1.datetime,
2 COUNT (*) OVER
3 (PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds)) window_count
4 FROM time_window tw1,
5 (SELECT MIN (datetime) mindatetime
6 FROM time_window) tw2
7 ORDER BY datetime
8 /
ID REP_ID DATETIME WINDOW_COUNT
---------- ---------- --------------- ------------
1 101 Fri 01-Jan-2016 6
2 101 Fri 01-Jan-2016 6
3 102 Fri 01-Jan-2016 6
4 103 Fri 01-Jan-2016 6
5 104 Fri 01-Jan-2016 6
6 105 Fri 01-Jan-2016 6
7 106 Fri 01-Jan-2016 2
8 107 Fri 01-Jan-2016 2
9 108 Fri 01-Jan-2016 2
10 109 Fri 01-Jan-2016 2
10 rows selected.
-- flags:
SCOTT@orcl> SELECT id, rep_id, datetime,
2 CASE WHEN (COUNT (*) OVER
3 (PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds))) > 3
4 THEN 12.2
5 ELSE 12.1
6 END flag
7 FROM time_window tw1,
8 (SELECT MIN (datetime) mindatetime
9 FROM time_window) tw2
10 ORDER BY datetime
11 /
ID REP_ID DATETIME FLAG
---------- ---------- --------------- ----------
1 101 Fri 01-Jan-2016 12.2
2 101 Fri 01-Jan-2016 12.2
3 102 Fri 01-Jan-2016 12.2
4 103 Fri 01-Jan-2016 12.2
5 104 Fri 01-Jan-2016 12.2
6 105 Fri 01-Jan-2016 12.2
7 106 Fri 01-Jan-2016 12.1
8 107 Fri 01-Jan-2016 12.1
9 108 Fri 01-Jan-2016 12.1
10 109 Fri 01-Jan-2016 12.1
10 rows selected.
|
|
|
|
Re: Data Chucking based on Timewindow [message #647761 is a reply to message #647744] |
Sun, 07 February 2016 03:08 |
|
Hi Barbara Boehmer,
Below is the query you shared showing the time windows -
SELECT id, rep_id, datetime,
CASE WHEN (COUNT (*) OVER
(PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / 120))) > 3
THEN 12.2
ELSE 12.1
END flag,
COUNT (*) OVER
(PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / 120)) count_w,
FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / 120) time_w
FROM time_window tw1,
(SELECT MIN (datetime) mindatetime
FROM time_window) tw2
ORDER BY datetime
/
ID REP_ID DATETIME FLAG COUNT_W TIME_W
1 101 01-JAN-16 12.2 6 0
2 101 01-JAN-16 12.2 6 0
3 102 01-JAN-16 12.2 6 0
4 103 01-JAN-16 12.2 6 0
5 104 01-JAN-16 12.2 6 0
6 105 01-JAN-16 12.2 6 0
7 106 01-JAN-16 12.1 2 1
8 107 01-JAN-16 12.1 2 1
9 108 01-JAN-16 12.1 3 2
10 109 01-JAN-16 12.1 3 2
11 109 01-JAN-16 12.1 3 2
what I had noticed is first time window (Time_w = 0) is with in 2 minutes (120 seconds), but I believe the second and following are not getting categorised properly.
what I mean is the second window should add 120 second to the end datatime of first window (where first window end datetime is already added 120 seconds to minimum date of a day)
So the final result set should be
Time_w
0 for first window records means first part of 120 seconds
1 for second window records means second part of 120 seconds
2 for third window records means third part of 120 seconds
highly appreciate your inputs on this....
|
|
|
Re: Data Chucking based on Timewindow [message #647776 is a reply to message #647761] |
Sun, 07 February 2016 21:15 |
|
Barbara Boehmer
Messages: 9077 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
I just realized that when I posted before, I neglected to either use to_char or set the nls_date_format to display the time, so I have reposted everything below. Your original insert statements did not match your posted data and your new posted results do not match your posted data or new query, in terms of number of rows or columns. In the example below, the 2-minute (120-second) intervals are from 23.01.01 to 23.03, 23.03.01 to 23.05, and 23.05.01 to 23.07. I have added dashed lines to the final query results to show the separation of the intervals. If you are getting something different, then please post actual copy and paste of select of starting data and select of actual query and result. If these are not what you expect as two-minute intervals, then please explain.
SCOTT@orcl> -- test table:
SCOTT@orcl> DESC time_window
Name Null? Type
----------------------------------------- -------- ----------------------------
ID NUMBER
REP_ID NUMBER
DATETIME DATE
SCOTT@orcl> -- test data:
SCOTT@orcl> ALTER SESSION SET NLS_DATE_FORMAT = 'DD-MON-YYYY HH24.MI.SS'
2 /
Session altered.
SCOTT@orcl> SELECT * FROM time_window ORDER BY datetime
2 /
ID REP_ID DATETIME
---------- ---------- --------------------
1 101 01-JAN-2016 23.01.01
2 101 01-JAN-2016 23.01.02
3 102 01-JAN-2016 23.01.40
4 103 01-JAN-2016 23.02.04
5 104 01-JAN-2016 23.02.05
6 105 01-JAN-2016 23.02.06
7 106 01-JAN-2016 23.04.00
8 107 01-JAN-2016 23.04.10
9 108 01-JAN-2016 23.05.09
10 109 01-JAN-2016 23.05.20
10 rows selected.
SCOTT@orcl> -- interval in seconds:
SCOTT@orcl> VARIABLE p_seconds NUMBER
SCOTT@orcl> EXEC :p_seconds := 120
PL/SQL procedure successfully completed.
SCOTT@orcl> -- time windows:
SCOTT@orcl> SELECT tw1.id, tw1.rep_id, tw1.datetime,
2 FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds) windows
3 FROM time_window tw1,
4 (SELECT MIN (datetime) mindatetime
5 FROM time_window) tw2
6 ORDER BY datetime
7 /
ID REP_ID DATETIME WINDOWS
---------- ---------- -------------------- ----------
1 101 01-JAN-2016 23.01.01 0
2 101 01-JAN-2016 23.01.02 0
3 102 01-JAN-2016 23.01.40 0
4 103 01-JAN-2016 23.02.04 0
5 104 01-JAN-2016 23.02.05 0
6 105 01-JAN-2016 23.02.06 0
7 106 01-JAN-2016 23.04.00 1
8 107 01-JAN-2016 23.04.10 1
9 108 01-JAN-2016 23.05.09 2
10 109 01-JAN-2016 23.05.20 2
10 rows selected.
SCOTT@orcl> -- count per window:
SCOTT@orcl> SELECT tw1.id, tw1.rep_id, tw1.datetime,
2 COUNT (*) OVER
3 (PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds)) window_count
4 FROM time_window tw1,
5 (SELECT MIN (datetime) mindatetime
6 FROM time_window) tw2
7 ORDER BY datetime
8 /
ID REP_ID DATETIME WINDOW_COUNT
---------- ---------- -------------------- ------------
1 101 01-JAN-2016 23.01.01 6
2 101 01-JAN-2016 23.01.02 6
3 102 01-JAN-2016 23.01.40 6
4 103 01-JAN-2016 23.02.04 6
5 104 01-JAN-2016 23.02.05 6
6 105 01-JAN-2016 23.02.06 6
7 106 01-JAN-2016 23.04.00 2
8 107 01-JAN-2016 23.04.10 2
9 108 01-JAN-2016 23.05.09 2
10 109 01-JAN-2016 23.05.20 2
10 rows selected.
SCOTT@orcl> -- flags:
SCOTT@orcl> SELECT id, rep_id, datetime,
2 CASE WHEN (COUNT (*) OVER
3 (PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds))) > 3
4 THEN 12.2
5 ELSE 12.1
6 END flag
7 FROM time_window tw1,
8 (SELECT MIN (datetime) mindatetime
9 FROM time_window) tw2
10 ORDER BY datetime
11 /
ID REP_ID DATETIME FLAG
---------- ---------- -------------------- ----------
1 101 01-JAN-2016 23.01.01 12.2
2 101 01-JAN-2016 23.01.02 12.2
3 102 01-JAN-2016 23.01.40 12.2
4 103 01-JAN-2016 23.02.04 12.2
5 104 01-JAN-2016 23.02.05 12.2
6 105 01-JAN-2016 23.02.06 12.2
---------------------------------------------------------
7 106 01-JAN-2016 23.04.00 12.1
8 107 01-JAN-2016 23.04.10 12.1
---------------------------------------------------------
9 108 01-JAN-2016 23.05.09 12.1
10 109 01-JAN-2016 23.05.20 12.1
10 rows selected.
|
|
|
|
Goto Forum:
Current Time: Thu Mar 28 11:54:05 CDT 2024
|