RE: SQL question - most recent value for each group of data

From: <krish.hariharan_at_quasardb.com>
Date: Thu, 28 Feb 2008 22:52:37 -0700
Message-ID: <001501c87a97$4c89d190$6501a8c0@BHAIRAVIPC01>


With the assertion that dts is not indexed I wonder if we can do it more efficiently procedurally rather than relationally (SQL) to limit it to one full scan of the table.

Regards,
-Krish

President/Executive Architect, Quasar Database Technologies, LLC http://www.linkedin.com/in/quasardb

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Reardon, Bruce (RTABBAY)
Sent: Thursday, February 28, 2008 9:28 PM To: oracle-l_at_freelists.org
Subject: SQL question - most recent value for each group of data

I have a case where I need to return the most recent record for each group of data and I'm interested in how to do most efficiently.

Versions - Oracle 8.1.7.4 on Windows 2003R2 SP2

Example data:

id	name	value	dts
1	Al	2.3	28-Feb-2008 8:33
1	Al	2.5	28-Feb-2008 7:21
1	Fe	3.2	27-Feb-2008 4:55
2	Al	3.1	27-Feb-2008 3:22

The required results are:
id	name	val_at_max_dts	res_count
1	Al	2.3			2
1	Fe	3.2			1
2	Al	3.1			1

That is, for id=1, the most recent date is the 8:33 so the value returned is the 2.3.
The res_count is how many result records there were for that id and name.

Below I've got 2 queries that give the correct result - I'm interested in other options that may perform better. To help, the vast majority will have res_count = 1 and the max value I've come across is 9.
I've found the analytical query below often performs better, but in testing with extreme numbers of duplicates, the query with the derived table performed better.

The SQL I've got to reproduce this is below:

drop table str;
create table str ( id integer , name varchar2(40) , val real , dts date ) ;
create index str_idx1 ON str ( id , name ) ;

  • add ~100,000 samples insert into str select rownum , 'Al' , o1.object_id , sysdate-2 from dba_objects o1 , dba_objects o2 where o1.object_id < 100 and o2.object_id < 1000;
  • add 2nd + 3rd result to each insert into str select rownum , 'Fe' , o1.object_id , sysdate from dba_objects o1 , dba_objects o2 where o1.object_id < 100 and o2.object_id < 1000;

insert into str select rownum , 'pH' , 7 , sysdate from dba_objects o1 , dba_objects o2
where o1.object_id < 100 and o2.object_id < 1000;

  • every 100th to have a second Al insert into str select rownum * 100 , 'Al' , o1.object_id , sysdate -3 from dba_objects o1 where o1.object_id < 1000 ;
  • every 1000th to have third Al insert into str select rownum * 1000 , 'Al' , o1.object_id , sysdate -2 from dba_objects o1 where o1.object_id < 100 ;

commit;

exec dbms_stats.gather_table_stats ( ownname => USER , tabname => 'STR' , cascade => TRUE) ;

set autotrace traceonly

SELECT str1.id , str1.name , str1.val AS val_at_max_dts , smmry.max_dts , smmry.res_count AS
FROM str str1 ,

     ( SELECT str2.id , str2.name , MAX(str2.dts) AS max_dts , COUNT(*) as res_count

       FROM str str2
       GROUP BY str2.id , str2.name
     ) smmry
WHERE str1.id = smmry.id

  AND str1.name = smmry.name
  AND str1.dts = smmry.max_dts
-- AND str1.id BETWEEN 1000 and 2100

ORDER BY
   str1.id , str1.name
;

SELECT *
FROM
(SELECT str1.id , str1.name , str1.val AS val_at_max_dts , str1.dts ,

        RANK() OVER (PARTITION BY str1.id , str1.name ORDER BY str1.dts DESC) AS rank_dts ,

        ROW_NUMBER() OVER (PARTITION BY str1.id , str1.name ORDER BY str1.dts ASC) AS res_count
 FROM str str1
-- WHERE str1.id BETWEEN 1000 and 2100

 ORDER BY
   str1.id , str1.name
)
WHERE rank_dts = 1
;

In practise, there are 2 sets of tables Recent tables - table "s" with field id, table "t" as a child and the final child "r" with the value and date field. Repeated for another 3 tables with older data. The dts field is not indexed in the "r" tables.

Thanks,
Bruce Reardon

NOTICE
This e-mail and any attachments are private and confidential and may contain privileged information. If you are not an authorised recipient, the copying or distribution of this e-mail and any attachments is prohibited and you must not read, print or act in reliance on this e-mail or attachments. This notice should not be removed.
--

http://www.freelists.org/webpage/oracle-l

--

http://www.freelists.org/webpage/oracle-l Received on Thu Feb 28 2008 - 23:52:37 CST

Original text of this message