Home » SQL & PL/SQL » SQL & PL/SQL » Query to find the user who did not login from past 3 years (Oracle 11.2.0.2.0)
Query to find the user who did not login from past 3 years [message #569855] Fri, 02 November 2012 09:16 Go to next message
yuko
Messages: 65
Registered: August 2011
Member
Hi,
I have two table APPLICATION_1 and APPLICATION_2, where user's login date is captured.
There will be multiple records for each user.
Same user may or may not be present in both tables.
The requirement is to find those users who have not logged-in in the last 3 years.
-- Test Case
DROP TABLE application_1;
DROP TABLE application_2;
CREATE TABLE application_1
(
   seq NUMBER , 
   user_id VARCHAR2(30), 
   login_date DATE
);

CREATE TABLE application_2
(
   seq NUMBER , 
   user_id VARCHAR2(30), 
   login_date DATE 
);
INSERT INTO application_1
VALUES(12, 'A1', SYSDATE - (365 * 4));
INSERT INTO application_1
VALUES(11, 'A1', SYSDATE - (365 * 5));
INSERT INTO application_1
VALUES(10, 'A1', SYSDATE - (365 * 6));
INSERT INTO application_1
VALUES(9, 'A1', SYSDATE - (365 * 7));
INSERT INTO application_1
VALUES(8, 'B1', SYSDATE - (365 * 1));
INSERT INTO application_1
VALUES(7, 'B1', SYSDATE - (365 * 4));
INSERT INTO application_1
VALUES(6,'B1', SYSDATE - (365 * 5));
INSERT INTO application_1
VALUES(5,'B1', SYSDATE - (365 * 6));
INSERT INTO application_1
VALUES(4, 'B1', SYSDATE - (365 * 7));
INSERT INTO application_1
VALUES(3, 'G1', SYSDATE - 50 );
INSERT INTO application_1
VALUES(2, 'D1', SYSDATE - (365 * 3) - 3);
INSERT INTO application_1
VALUES(1, 'L1', SYSDATE - (365 * 3) - 5);
-------------------------------------------
INSERT INTO application_2
VALUES(18, 'C1', SYSDATE - (365 * 1));
INSERT INTO application_2
VALUES(17, 'C1', SYSDATE - (365 * 4));
INSERT INTO application_2
VALUES(16, 'A1', SYSDATE - (365 * 1));
INSERT INTO application_2
VALUES(15, 'D1', SYSDATE - (365 * 4));
INSERT INTO application_2
VALUES(14, 'B1', SYSDATE - (365 * 5));
INSERT INTO application_2
VALUES(13, 'F1', SYSDATE );

Using the below query I'm able to get the desired output.
SELECT user_id, MAX(login_date) last_login_date FROM (
	SELECT user_id, MAX(login_date) login_date FROM application_1 GROUP BY user_id
	UNION ALL  
	SELECT user_id, MAX(login_date) login_date FROM application_2 GROUP BY user_id
) GROUP BY user_id
HAVING MAX(login_date) <= SYSDATE-(365*3);

USER_ID                        LAST_LOGI
------------------------------ ---------
D1                             31-OCT-09
L1                             29-OCT-09

Could you please suggest me a better way for this requirement.

Thank you,
Re: Query to find the user who did not login from past 3 years [message #569857 is a reply to message #569855] Fri, 02 November 2012 09:30 Go to previous messageGo to next message
BlackSwan
Messages: 22554
Registered: January 2009
Senior Member
>The requirement is to find those users who have not logged-in in the last 3 years.
explain in words how to programatically determine which records meet this criteria?
Re: Query to find the user who did not login from past 3 years [message #569862 is a reply to message #569855] Fri, 02 November 2012 10:10 Go to previous messageGo to next message
Michel Cadot
Messages: 58638
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
A user that has NEVER connected is a user did not connect in the last 3 years but with what you posted you can't know him. For instance, you can't know that MICHEL has not connected in the last 3 years.

Quote:
Could you please suggest me a better way for this requirement


You must FIRST have a table that list the users.

Regards
Michel
Re: Query to find the user who did not login from past 3 years [message #569868 is a reply to message #569862] Fri, 02 November 2012 11:12 Go to previous messageGo to next message
yuko
Messages: 65
Registered: August 2011
Member
We have a web application, whenever a user logs in, one record will be inserted in either Application_1 or Application_2 table. There is another master table which contains all application user`s details. But i,m not interested in the master table. Only by using these 2 tables those users should be retrieved whose lastest login date is 3 years back.

Thank you.
Re: Query to find the user who did not login from past 3 years [message #569869 is a reply to message #569868] Fri, 02 November 2012 11:39 Go to previous messageGo to next message
Michel Cadot
Messages: 58638
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Your query is correct and anyway the current optimizer very well know how to rewrite this kind of query if it needs to.
The only point is that -3*365 is not 3 years back; 3 years back is add_months(.,-3*12).

Regards
Michel
Re: Query to find the user who did not login from past 3 years [message #569873 is a reply to message #569869] Fri, 02 November 2012 11:55 Go to previous messageGo to next message
yuko
Messages: 65
Registered: August 2011
Member
Thank you Michel,
I'll modify the query to use add_months.
Re: Query to find the user who did not login from past 3 years [message #569877 is a reply to message #569868] Fri, 02 November 2012 12:40 Go to previous messageGo to next message
Solomon Yakobson
Messages: 1966
Registered: January 2010
Senior Member
yuko wrote on Fri, 02 November 2012 12:12
Only by using these 2 tables those users should be retrieved whose lastest login date is 3 years back.


Then, your query (as others already pointed out) will returt users that logged in at least once but more that 3 years ago, not users did not login for the last 3 years. Anyway, even though your query produces correct results, it has excessive grouping. No need to GROUP BY each part of the UNION ALL:

SELECT  user_id,
        MAX(login_date) last_login_date
  FROM  (
          SELECT  user_id,
                  login_date
            FROM  application_1
         UNION ALL
          SELECT  user_id,
                  login_date
            FROM  application_2
        )
  GROUP BY user_id
  HAVING MAX(login_date) <= ADD_MONTHS(SYSDATE,-36)
/


SY.
Re: Query to find the user who did not login from past 3 years [message #569878 is a reply to message #569877] Fri, 02 November 2012 12:50 Go to previous messageGo to next message
Michel Cadot
Messages: 58638
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
That is a matter of discussion, in one side you group 2 sets then a small set, in another one you group only once but a bigger set.
Depending on the case (number of rows, available size of PGA or sortarea...), one is better or the other one.
The same discussion can be done on the HAVING clause.
But as I said, the current optimizer very well know how to rewrite the query (assuming, of course, the statistics are accurate).

Regards
Michel
Re: Query to find the user who did not login from past 3 years [message #569881 is a reply to message #569869] Fri, 02 November 2012 12:55 Go to previous messageGo to next message
Solomon Yakobson
Messages: 1966
Registered: January 2010
Senior Member
Michel Cadot wrote on Fri, 02 November 2012 12:39
and anyway the current optimizer very well know how to rewrite this kind of query if it needs to


I wish we'd have that silver bullet:

SQL> exec dbms_stats.gather_table_stats('SCOTT','APPLICATION_1');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats('SCOTT','APPLICATION_2');

PL/SQL procedure successfully completed.

SQL> EXPLAIN PLAN FOR
  2  SELECT  user_id, MAX(login_date) last_login_date FROM (
  3   SELECT user_id, MAX(login_date) login_date FROM application_1 GROUP BY user_id
  4   UNION ALL  
  5   SELECT user_id, MAX(login_date) login_date FROM application_2 GROUP BY user_id
  6  ) GROUP BY user_id
  7  HAVING MAX(login_date) <= SYSDATE-(365*3);

Explained.

SQL> @?\rdbms\admin\utlxpls

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 712574724

----------------------------------------------------------------------------------------
| Id  | Operation              | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |               |     1 |    12 |     9  (34)| 00:00:01 |
|*  1 |  FILTER                |               |       |       |            |          |
|   2 |   HASH GROUP BY        |               |     1 |    12 |     9  (34)| 00:00:01 |
|   3 |    VIEW                |               |    10 |   120 |     8  (25)| 00:00:01 |
|   4 |     UNION-ALL          |               |       |       |            |          |
|   5 |      HASH GROUP BY     |               |     5 |    55 |     4  (25)| 00:00:01 |

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
|   6 |       TABLE ACCESS FULL| APPLICATION_1 |    12 |   132 |     3   (0)| 00:00:01 |
|   7 |      HASH GROUP BY     |               |     5 |    55 |     4  (25)| 00:00:01 |
|   8 |       TABLE ACCESS FULL| APPLICATION_2 |     6 |    66 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(MAX("LOGIN_DATE")<=SYSDATE@!-1095)

20 rows selected.

SQL> EXPLAIN PLAN FOR
  2  SELECT  user_id,
  3          MAX(login_date) last_login_date
  4    FROM  (
  5            SELECT  user_id,
  6                    login_date
  7              FROM  application_1
  8           UNION ALL
  9            SELECT  user_id,
 10                    login_date
 11              FROM  application_2
 12          )
 13    GROUP BY user_id
 14    HAVING MAX(login_date) <= ADD_MONTHS(SYSDATE,-36)
 15  /

Explained.

SQL> @?\rdbms\admin\utlxpls

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 3262806368

---------------------------------------------------------------------------------------
| Id  | Operation             | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |               |     1 |    12 |     7  (15)| 00:00:01 |
|*  1 |  FILTER               |               |       |       |            |          |
|   2 |   HASH GROUP BY       |               |     1 |    12 |     7  (15)| 00:00:01 |
|   3 |    VIEW               |               |    18 |   216 |     6   (0)| 00:00:01 |
|   4 |     UNION-ALL         |               |       |       |            |          |
|   5 |      TABLE ACCESS FULL| APPLICATION_1 |    12 |   132 |     3   (0)| 00:00:01 |

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
|   6 |      TABLE ACCESS FULL| APPLICATION_2 |     6 |    66 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(MAX("LOGIN_DATE")<=ADD_MONTHS(SYSDATE@!,(-36)))

18 rows selected.

SQL>  


SY.
Re: Query to find the user who did not login from past 3 years [message #569883 is a reply to message #569878] Fri, 02 November 2012 13:07 Go to previous messageGo to next message
Solomon Yakobson
Messages: 1966
Registered: January 2010
Senior Member
Michel Cadot wrote on Fri, 02 November 2012 13:50
That is a matter of discussion, in one side you group 2 sets then a small set, in another one you group only once but a bigger set.


And? The way I see it. It is other way around. You start grouping once. If combined volume produced by UNION ALL is too high for in memory sort, then we'd look to tune it by making multiple sorts. But tha's if SORT GROUP BY would be used. Here optimizer uses HASH GROUP BY.

SY.
Re: Query to find the user who did not login from past 3 years [message #569884 is a reply to message #569883] Fri, 02 November 2012 13:45 Go to previous message
Michel Cadot
Messages: 58638
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
There are too few rows to conclude anything, and yes the optimizer knows how to rewrite that; at least ways were described in some white papers Oracle published on their current development when 10.2 was released. As far as I remember there were a couple of examples where aggregate function was moved from outer to inner queries and/or the opposite, I think if they were not implemented in 10.2 they are now in 11.2.
Note that the paper mentioned that the optimizer evaluates the cost to analyze and stops query transformation if the query estimated execution cost is lower than the estimated query optimization cost.

Regards
Michel
Previous Topic: rownum > 5 and count(*)
Next Topic: sql developer seprate date and time
Goto Forum:
  


Current Time: Thu Jul 31 20:22:06 CDT 2014

Total time taken to generate the page: 0.09891 seconds