Home » SQL & PL/SQL » SQL & PL/SQL » Counting the most recent record
Counting the most recent record [message #222675] Mon, 05 March 2007 21:43 Go to next message
Badger
Messages: 2
Registered: March 2007
Junior Member
My problem: I'm trying to produce some counts based on the most recent period date for a given record from a fact table consisting of 9 million records. I have devised a number of queries that do generate the counts we need but they are
pretty slow, anywhere from 2 to 3 minutes. I'm sure there is a more efficient way to write my SQL query but I'm not sure how to do it. The columns I am using are all indexed and are part of the primary key.
Here is an example of the data that would be in the table

FNAME LNAME PERSON_ID ORG_CD PDT

HOWARD STRAND 10 A 200309
HOWARD STRAND 10 A 200310
HOWARD STRAND 10 B 200311
HOWARD STRAND 10 C 200312 <--- Keep

NINA LUSK 12 A 200201 <--Keep

BOB BOBEBERSON 15 B 199806
BOB BOBEBERSON 15 B 200301 <--Keep

In each case I want to count only the current record based on PDT.
Here is query one:

SELECT org_cd, count(*)
FROM dw_xxx_fact1_table
WHERE ID||pdt IN (SELECT ID||max(pdt)
OVER (PARTITION BY ID)
FROM dw_xxx_fact1_table
WHERE pgm_typ_cd = '2')
AND pgm_typ_cd = '2'
GROUP BY org_cd

I kind of create a composite key by appending the ID and PDT together, then in the sub query I do the same thing but append max pdt. It just seems so ugly but it does produce the counts that are consistent with our reference numbers. The pgm_typ_cd better targets the population but in the next query I thought it best to target the actual partition which really didn't
improve the performance all that much. In both cases I think it is way too I/O intensive.

SELECT org_cd, count(*)
FROM dw_xxx_fact1_table partition (dw_xxx_fact1_data)
WHERE ID||pdt IN (SELECT ID||max(pdt)
OVER (PARTITION BY ID)
FROM dw_xxx_fact1_table partition
(dw_xxx_fact1_data))
GROUP BY org_cd;

Any SQL experts out there have any suggestions? I really appreciate the
help. Thanks in advance
Re: Counting the most recent record [message #222690 is a reply to message #222675] Tue, 06 March 2007 00:11 Go to previous messageGo to next message
flyboy
Messages: 1832
Registered: November 2006
Senior Member
I would get rid of the IN clause and rewrite it to
SELECT org_cd, count(*)
FROM (SELECT org_cd,
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY pdt DESC) rn
  FROM dw_xxx_fact1_table
  WHERE pgm_typ_cd = '2')
WHERE rn = 1
GROUP BY org_cd;

If you have duplicates on (id, pdt) and you want all rows with max pdt for given id, use RANK instead of ROW_NUMBER.
Re: Counting the most recent record [message #222837 is a reply to message #222675] Tue, 06 March 2007 10:57 Go to previous message
Badger
Messages: 2
Registered: March 2007
Junior Member
Thanks so much. It generated our predicted counts and knocked a minute off the time. I need to rethink how we do our queries. I'm too comfortable with sub queries. Thanks again!!!
Previous Topic: DATABASE REFRESH
Next Topic: specifying path for output from spooled file
Goto Forum:
  


Current Time: Fri Dec 09 19:07:17 CST 2016

Total time taken to generate the page: 0.10756 seconds