Home » SQL & PL/SQL » SQL & PL/SQL » Data Chucking based on Timewindow
Data Chucking based on Timewindow [message #647723] Sat, 06 February 2016 06:31 Go to next message
razasheikh
Messages: 11
Registered: February 2016
Location: sdfdf
Junior Member

Hi Experts,
I have a requirement where I need to divide source data into time window and count the number of records falling into that window to categorise accordingly
Example:
My source data
ID REP_ID Datetime
1 101 01/01/2016 23.01.01
2 101 01/01/2016 23.01.02
3 102 01/01/2016 23.01.40
4 103 01/01/2016 23.02.04
5 104 01/01/2016 23.02.05
6 105 01/01/2016 23.02.06
7 106 01/01/2016 23.04.00
8 107 01/01/2016 23.04.10
9 108 01/01/2016 23.05.09
10 109 01/01/2016 23.05.20

CREATE table time_window (
ID number,
REP_ID number,
Datetime timestamp
);

INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,101,TO_DATE('01/01/2016 23:01:01','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,101,TO_DATE('01/01/2016 23:01:02','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,102,TO_DATE('01/01/2016 23:01:40','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,103,TO_DATE('01/01/2016 23:02:04','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,104,TO_DATE('01/01/2016 23:02:05','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,105,TO_DATE('01/01/2016 23:02:06','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,106,TO_DATE('01/01/2016 23:04:00','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,107,TO_DATE('01/01/2016 23:04:10','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,108,TO_DATE('01/01/2016 23:05:09','DD/MM/YYYY HH24:MI:SS'));
INSERT INTO TIME_WINDOW (ID,REP,DATETIME) VALUES (1,109,TO_DATE('01/01/2016 23:05:20','DD/MM/YYYY HH24:MI:SS'));

Commit;
User will enter window time let's say they had given 2 minutes
If within 2 minutes window the count of unique REP_IDs is 3 then I have to flag it as 12.1 if its more than 3 then 12.2
Result out put expected is
ID REP_ID Datetime Flag
1 101 01/01/2016 23.01.01 12.2
2 101 01/01/2016 23.01.02 12.2
3 102 01/01/2016 23.01.40 12.2
4 103 01/01/2016 23.02.04 12.2
5 104 01/01/2016 23.02.05 12.2
6 105 01/01/2016 23.02.06 12.2
7 106 01/01/2016 23.04.00 12.1
8 107 01/01/2016 23.04.10 12.1
9 108 01/01/2016 23.05.09 12.1

I had tried few different ways but nothing really working out appreciates if anyone can help get this solution.
Thank you
Re: Data Chucking based on Timewindow [message #647724 is a reply to message #647723] Sat, 06 February 2016 06:40 Go to previous messageGo to next message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
Please read our OraFAQ Forum Guide and How to use [code] tags and make your code easier to read You have been asked to do this before.


Quote:
I had tried few different ways but nothing really working
What code have you tried so far?

[Updated on: Sat, 06 February 2016 06:40]

Report message to a moderator

Re: Data Chucking based on Timewindow [message #647725 is a reply to message #647723] Sat, 06 February 2016 06:57 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Quote:
Result out put expected is


Explain it, line by line.
And FORMAT your post

Re: Data Chucking based on Timewindow [message #647738 is a reply to message #647725] Sat, 06 February 2016 12:43 Go to previous messageGo to next message
razasheikh
Messages: 11
Registered: February 2016
Location: sdfdf
Junior Member

i had used below function to add the time window - but that really did not work


TO_CHAR(POWER_OUTAGE_END_TIME + <TIME_WINDOW_SEC>/(24*60),'mm/dd/yyyy HH:SS')

- its just adding that time window to my data field where what i want is multiple windows

Example:

as soon as i read my first record the time window need to be appended and i need to collect all the records between first record time and first appended record time - this i can categorise as first window

second window should start from the point the append time is equal to source data datetime field - so the start time for second window would be first appended record time and end time would be first appended record time + time window

....

This need to be carried out through the entire data set.

kind of loop.

I am not sure if i had put this clearly or not. Please let me know if you need any further details

Thank you
Re: Data Chucking based on Timewindow [message #647739 is a reply to message #647738] Sat, 06 February 2016 12:51 Go to previous messageGo to next message
Michel Cadot
Messages: 68625
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Quote:
Please let me know if you need any further details


Michel Cadot wrote on Sat, 06 February 2016 13:57

Quote:
Result out put expected is


Explain it, line by line.


Re: Data Chucking based on Timewindow [message #647740 is a reply to message #647723] Sat, 06 February 2016 12:55 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
minor nit to pick.

Is TIMESTAMP datatype really required? Why is fractions of seconds needed? Why not use DATE datatype?
Why do you load TIMESTAMP with TO_DATE() which always produces ZERO fractions of seconds value?
Re: Data Chucking based on Timewindow [message #647743 is a reply to message #647723] Sat, 06 February 2016 16:33 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
The following uses DATE data type in the table and expresses the interval in seconds. I have broken the query down one step at a time for easier understanding, but all you need is the final query at the bottom.

-- test table:
SCOTT@orcl> DESC time_window
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                                 NUMBER
 REP_ID                                             NUMBER
 DATETIME                                           DATE


-- test data:
SCOTT@orcl> SELECT * FROM time_window ORDER BY datetime
  2  /

        ID     REP_ID DATETIME
---------- ---------- ---------------
         1        101 Fri 01-Jan-2016
         2        101 Fri 01-Jan-2016
         3        102 Fri 01-Jan-2016
         4        103 Fri 01-Jan-2016
         5        104 Fri 01-Jan-2016
         6        105 Fri 01-Jan-2016
         7        106 Fri 01-Jan-2016
         8        107 Fri 01-Jan-2016
         9        108 Fri 01-Jan-2016
        10        109 Fri 01-Jan-2016

10 rows selected.


-- interval in seconds:
SCOTT@orcl> VARIABLE p_seconds NUMBER
SCOTT@orcl> EXEC :p_seconds := 120

PL/SQL procedure successfully completed.


-- time windows:
SCOTT@orcl> SELECT tw1.id, tw1.rep_id, tw1.datetime,
  2  	    FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds) windows
  3  FROM   time_window tw1,
  4  	    (SELECT MIN (datetime) mindatetime
  5  	     FROM   time_window) tw2
  6  ORDER  BY datetime
  7  /

        ID     REP_ID DATETIME           WINDOWS
---------- ---------- --------------- ----------
         1        101 Fri 01-Jan-2016          0
         2        101 Fri 01-Jan-2016          0
         3        102 Fri 01-Jan-2016          0
         4        103 Fri 01-Jan-2016          0
         5        104 Fri 01-Jan-2016          0
         6        105 Fri 01-Jan-2016          0
         7        106 Fri 01-Jan-2016          1
         8        107 Fri 01-Jan-2016          1
         9        108 Fri 01-Jan-2016          2
        10        109 Fri 01-Jan-2016          2

10 rows selected.


-- count per window:
SCOTT@orcl> SELECT tw1.id, tw1.rep_id, tw1.datetime,
  2  	    COUNT (*) OVER
  3  	      (PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds)) window_count
  4  FROM   time_window tw1,
  5  	    (SELECT MIN (datetime) mindatetime
  6  	     FROM   time_window) tw2
  7  ORDER  BY datetime
  8  /

        ID     REP_ID DATETIME        WINDOW_COUNT
---------- ---------- --------------- ------------
         1        101 Fri 01-Jan-2016            6
         2        101 Fri 01-Jan-2016            6
         3        102 Fri 01-Jan-2016            6
         4        103 Fri 01-Jan-2016            6
         5        104 Fri 01-Jan-2016            6
         6        105 Fri 01-Jan-2016            6
         7        106 Fri 01-Jan-2016            2
         8        107 Fri 01-Jan-2016            2
         9        108 Fri 01-Jan-2016            2
        10        109 Fri 01-Jan-2016            2

10 rows selected.


-- flags:
SCOTT@orcl> SELECT id, rep_id, datetime,
  2  	    CASE WHEN (COUNT (*) OVER
  3  			 (PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds))) > 3
  4  		 THEN 12.2
  5  		 ELSE 12.1
  6  	    END flag
  7  FROM   time_window tw1,
  8  	    (SELECT MIN (datetime) mindatetime
  9  	     FROM   time_window) tw2
 10  ORDER  BY datetime
 11  /

        ID     REP_ID DATETIME              FLAG
---------- ---------- --------------- ----------
         1        101 Fri 01-Jan-2016       12.2
         2        101 Fri 01-Jan-2016       12.2
         3        102 Fri 01-Jan-2016       12.2
         4        103 Fri 01-Jan-2016       12.2
         5        104 Fri 01-Jan-2016       12.2
         6        105 Fri 01-Jan-2016       12.2
         7        106 Fri 01-Jan-2016       12.1
         8        107 Fri 01-Jan-2016       12.1
         9        108 Fri 01-Jan-2016       12.1
        10        109 Fri 01-Jan-2016       12.1

10 rows selected.

Re: Data Chucking based on Timewindow [message #647744 is a reply to message #647743] Sat, 06 February 2016 17:44 Go to previous messageGo to next message
razasheikh
Messages: 11
Registered: February 2016
Location: sdfdf
Junior Member

Thank you very much - much appreciated this is exactly what I wanted to achieve....
Re: Data Chucking based on Timewindow [message #647761 is a reply to message #647744] Sun, 07 February 2016 03:08 Go to previous messageGo to next message
razasheikh
Messages: 11
Registered: February 2016
Location: sdfdf
Junior Member

Hi Barbara Boehmer,

Below is the query you shared showing the time windows -

SELECT id, rep_id, datetime,
CASE WHEN (COUNT (*) OVER
   (PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / 120))) > 3
    THEN 12.2
     ELSE 12.1
       END flag,
        COUNT (*) OVER
   (PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / 120)) count_w,
      FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / 120) time_w
  FROM   time_window tw1,
       (SELECT MIN (datetime) mindatetime
        FROM   time_window) tw2
  ORDER  BY datetime
  /
 
ID	REP_ID	DATETIME	FLAG	COUNT_W	TIME_W
1	101	01-JAN-16	12.2	6	0
2	101	01-JAN-16	12.2	6	0
3	102	01-JAN-16	12.2	6	0
4	103	01-JAN-16	12.2	6	0
5	104	01-JAN-16	12.2	6	0
6	105	01-JAN-16	12.2	6	0
7	106	01-JAN-16	12.1	2	1
8	107	01-JAN-16	12.1	2	1
9	108	01-JAN-16	12.1	3	2
10	109	01-JAN-16	12.1	3	2
11	109	01-JAN-16	12.1	3	2
 




what I had noticed is first time window (Time_w = 0) is with in 2 minutes (120 seconds), but I believe the second and following are not getting categorised properly.

what I mean is the second window should add 120 second to the end datatime of first window (where first window end datetime is already added 120 seconds to minimum date of a day)

So the final result set should be

Time_w
0 for first window records means first part of 120 seconds
1 for second window records means second part of 120 seconds
2 for third window records means third part of 120 seconds

highly appreciate your inputs on this....
Re: Data Chucking based on Timewindow [message #647776 is a reply to message #647761] Sun, 07 February 2016 21:15 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
I just realized that when I posted before, I neglected to either use to_char or set the nls_date_format to display the time, so I have reposted everything below. Your original insert statements did not match your posted data and your new posted results do not match your posted data or new query, in terms of number of rows or columns. In the example below, the 2-minute (120-second) intervals are from 23.01.01 to 23.03, 23.03.01 to 23.05, and 23.05.01 to 23.07. I have added dashed lines to the final query results to show the separation of the intervals. If you are getting something different, then please post actual copy and paste of select of starting data and select of actual query and result. If these are not what you expect as two-minute intervals, then please explain.

SCOTT@orcl> -- test table:
SCOTT@orcl> DESC time_window
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                                 NUMBER
 REP_ID                                             NUMBER
 DATETIME                                           DATE

SCOTT@orcl> -- test data:
SCOTT@orcl> ALTER SESSION SET NLS_DATE_FORMAT = 'DD-MON-YYYY HH24.MI.SS'
  2  /

Session altered.

SCOTT@orcl> SELECT * FROM time_window ORDER BY datetime
  2  /

        ID     REP_ID DATETIME
---------- ---------- --------------------
         1        101 01-JAN-2016 23.01.01
         2        101 01-JAN-2016 23.01.02
         3        102 01-JAN-2016 23.01.40
         4        103 01-JAN-2016 23.02.04
         5        104 01-JAN-2016 23.02.05
         6        105 01-JAN-2016 23.02.06
         7        106 01-JAN-2016 23.04.00
         8        107 01-JAN-2016 23.04.10
         9        108 01-JAN-2016 23.05.09
        10        109 01-JAN-2016 23.05.20

10 rows selected.

SCOTT@orcl> -- interval in seconds:
SCOTT@orcl> VARIABLE p_seconds NUMBER
SCOTT@orcl> EXEC :p_seconds := 120

PL/SQL procedure successfully completed.

SCOTT@orcl> -- time windows:
SCOTT@orcl> SELECT tw1.id, tw1.rep_id, tw1.datetime,
  2  	    FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds) windows
  3  FROM   time_window tw1,
  4  	    (SELECT MIN (datetime) mindatetime
  5  	     FROM   time_window) tw2
  6  ORDER  BY datetime
  7  /

        ID     REP_ID DATETIME                WINDOWS
---------- ---------- -------------------- ----------
         1        101 01-JAN-2016 23.01.01          0
         2        101 01-JAN-2016 23.01.02          0
         3        102 01-JAN-2016 23.01.40          0
         4        103 01-JAN-2016 23.02.04          0
         5        104 01-JAN-2016 23.02.05          0
         6        105 01-JAN-2016 23.02.06          0
         7        106 01-JAN-2016 23.04.00          1
         8        107 01-JAN-2016 23.04.10          1
         9        108 01-JAN-2016 23.05.09          2
        10        109 01-JAN-2016 23.05.20          2

10 rows selected.

SCOTT@orcl> -- count per window:
SCOTT@orcl> SELECT tw1.id, tw1.rep_id, tw1.datetime,
  2  	    COUNT (*) OVER
  3  	      (PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds)) window_count
  4  FROM   time_window tw1,
  5  	    (SELECT MIN (datetime) mindatetime
  6  	     FROM   time_window) tw2
  7  ORDER  BY datetime
  8  /

        ID     REP_ID DATETIME             WINDOW_COUNT
---------- ---------- -------------------- ------------
         1        101 01-JAN-2016 23.01.01            6
         2        101 01-JAN-2016 23.01.02            6
         3        102 01-JAN-2016 23.01.40            6
         4        103 01-JAN-2016 23.02.04            6
         5        104 01-JAN-2016 23.02.05            6
         6        105 01-JAN-2016 23.02.06            6
         7        106 01-JAN-2016 23.04.00            2
         8        107 01-JAN-2016 23.04.10            2
         9        108 01-JAN-2016 23.05.09            2
        10        109 01-JAN-2016 23.05.20            2

10 rows selected.

SCOTT@orcl> -- flags:
SCOTT@orcl> SELECT id, rep_id, datetime,
  2  	    CASE WHEN (COUNT (*) OVER
  3  			 (PARTITION BY FLOOR (((tw1.datetime - tw2.mindatetime) * 24 * 60 * 60) / :p_seconds))) > 3
  4  		 THEN 12.2
  5  		 ELSE 12.1
  6  	    END flag
  7  FROM   time_window tw1,
  8  	    (SELECT MIN (datetime) mindatetime
  9  	     FROM   time_window) tw2
 10  ORDER  BY datetime
 11  /

        ID     REP_ID DATETIME                   FLAG
---------- ---------- -------------------- ----------
         1        101 01-JAN-2016 23.01.01       12.2
         2        101 01-JAN-2016 23.01.02       12.2
         3        102 01-JAN-2016 23.01.40       12.2
         4        103 01-JAN-2016 23.02.04       12.2
         5        104 01-JAN-2016 23.02.05       12.2
         6        105 01-JAN-2016 23.02.06       12.2
---------------------------------------------------------
         7        106 01-JAN-2016 23.04.00       12.1
         8        107 01-JAN-2016 23.04.10       12.1
---------------------------------------------------------
         9        108 01-JAN-2016 23.05.09       12.1
        10        109 01-JAN-2016 23.05.20       12.1

10 rows selected.

Re: Data Chucking based on Timewindow [message #647796 is a reply to message #647776] Mon, 08 February 2016 04:09 Go to previous message
razasheikh
Messages: 11
Registered: February 2016
Location: sdfdf
Junior Member

Thank you had tested that all the quires and I had got what I expected ---- can not expect more Smile

Thank you
Previous Topic: Triggers based Q....ion
Next Topic: wrong argument passed any suggestion
Goto Forum:
  


Current Time: Thu Mar 28 11:54:05 CDT 2024