Home » SQL & PL/SQL » SQL & PL/SQL » help query in needed (Oracle 9i R2, Suse Linux)
help query in needed [message #351978] Sat, 04 October 2008 04:33 Go to next message
sajut
Messages: 69
Registered: January 2007
Member
Dear All
I have a table with columns
   col1    date
   col2    number

data
    1.1.2008      2000
    1.1.2008      1400
    1.2.2008      3500
    1.4.2008      2500
   10.4.2008      1000
   10.5.2008      3000


need a query to get an output like this
January-2008 3400
February-2008 6900
March-2008 6900
April-2008 10400
May-2008 13400

the query should sum up col2 upto each month end in a twelve month period. if there are no entries for a given month, then that months total will be previous months total.

I tried the following query in a pl/sql block and execute it twelve times
select sum(col2) from test where trunc(col1) <= to_date(x_date,'dd/mm/yyyy') ;


This takes about 6 - 7 mins on a table with 700,000 rows,
Any better alternative
Re: help query in needed [message #351985 is a reply to message #351978] Sat, 04 October 2008 10:03 Go to previous messageGo to next message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Use both SUM in its aggregate form (to group by month) and then its analytic form (to get the last 12 months).

Regards
Michel
Re: help query in needed [message #351987 is a reply to message #351978] Sat, 04 October 2008 10:21 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

group by in Straight SQL

Thumbs Up
Rajuvan

[Updated on: Sat, 04 October 2008 10:22]

Report message to a moderator

Re: help query in needed [message #351988 is a reply to message #351987] Sat, 04 October 2008 11:49 Go to previous messageGo to next message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Group by alone will not answer the second part:
Quote:
the query should sum up col2 upto each month end in a twelve month period.

(It could but with the overhead of a second scan.)

Regards
Michel
Re: help query in needed [message #351990 is a reply to message #351978] Sat, 04 October 2008 12:08 Go to previous messageGo to next message
Frank
Messages: 7880
Registered: March 2000
Senior Member
sajut wrote on Sat, 04 October 2008 11:33

select sum(col2) from test where trunc(col1) <= to_date(x_date,'dd/mm/yyyy') ;



If you have an index on col1 and you will still need the where-clause, even after applying the mechanisms Michel pointed you to, you should try to avoid using a function (trunc) on the indexed column.

You can change your example to
where col1 < to_date(x_date, 'dd/mm/yyyy') + 1
Re: help query in needed [message #352001 is a reply to message #351978] Sat, 04 October 2008 16:12 Go to previous messageGo to next message
dbawiseman
Messages: 3
Registered: October 2008
Location: England
Junior Member
To get the basic data you only need to scan the table once:

SELECT   TRUNC (col1,'MM') month
,        SUM (col2) total
FROM     tab1
WHERE    col1 >= TO_DATE ('01:01:2008','DD:MM:YYYY')
GROUP BY TRUNC (col1, 'MM')
ORDER BY 1;

MONTH           TOTAL
---------- ----------
01:01:2008       3400
01:02:2008       3500
01:04:2008       3500
01:05:2008       3000

The above part is the bit that will take the longest to execute. If there aren't any indices on col1 then the database will need to do a single full table scan of tab1. If you have an index on only col1 then the perfomance may be worse than a full table scan (typically you only want to use an index if you're querying less than around 10% of the rows in the table). If you have an index on (col1, col2) then Oracle's CBO should notice that it can get all the data it needs from just the index so it will be much quicker than a full table scan.

NOTE: Before you add the index on these columns check that other operations won't be significantly impacted. For example, if you only run this query once per month and the query goes from 5 minutes down to 1 second then you've saved nearly 5 minutes but any inserts, updates and deletes may well get degraded performance due to the extra index and this may add up to more than 5 minutes extra processing time per month.


You then need to add the missing months. The simplest way to do this is to "create" a 12-month window using a database object that you know has at least 12 rows in it, e.g. the dictionary view all_objects:

SELECT ADD_MONTHS (TO_DATE ('01:01:2008','DD:MM:YYYY'), ROWNUM - 1) month
,      0 total
FROM   all_objects
WHERE  ROWNUM < 13;

MONTH           TOTAL
---------- ----------
01:01:2008          0
01:02:2008          0
01:03:2008          0
01:04:2008          0
01:05:2008          0
01:06:2008          0
01:07:2008          0
01:08:2008          0
01:09:2008          0
01:10:2008          0
01:11:2008          0
01:12:2008          0

You can then merge the above two results into a single row per month by grouping by month and summing the total:

SELECT   RTRIM (TO_CHAR (month1, 'Month')) || TO_CHAR (month1,'-YYYY') month
,        SUM (total) OVER (ORDER BY month1) cum_total
FROM     (SELECT   month month1
          ,        month month2
          ,        SUM (total) total
          FROM     (SELECT   TRUNC (col1,'MM') month
                    ,        SUM (col2) total
                    FROM     tab1
                    WHERE    col1 >= TO_DATE ('01:01:2008','DD:MM:YYYY')
                    GROUP BY TRUNC (col1, 'MM')
                    UNION
                    SELECT ADD_MONTHS (TO_DATE ('01:01:2008','DD:MM:YYYY'), ROWNUM - 1) month
                    ,      0 total
                    FROM   all_objects
                    WHERE  ROWNUM < 13)
          GROUP BY month)
ORDER BY month2;

MONTH           CUM_TOTAL
-------------- ----------
January-2008         3400
February-2008        6900
March-2008           6900
April-2008          10400
May-2008            13400
June-2008           13400
July-2008           13400
August-2008         13400
September-2008      13400
October-2008        13400
November-2008       13400
December-2008       13400

The RTRIM and selecting the month twice in the outermost inline view was necessary because you probably want to keep the results in date order AND removing the right-padded months (if you use TO_CHAR(month ,'Month-YYYY') then you get data like
'May      -2008'

To test the performance I created a table with 700,000 rows in it with an index on both columns:

DECLARE
  x  PLS_INTEGER;
  y  PLS_INTEGER;
  rn NUMBER(20);
BEGIN
  SELECT hsecs
  INTO   rn
  FROM   gv$timer;
  dbms_random.initialize (rn);
  FOR i IN 1..700000
  LOOP
    x := (1+ABS(MOD(dbms_random.random,10000)));
    y := (1+ABS(MOD(dbms_random.random,100000)));
    INSERT INTO tab1 VALUES (SYSDATE - x, y);
    rn := x;
  END LOOP;
  dbms_random.terminate;
  COMMIT;
END;
/

CREATE INDEX tab1_ix1 ON tab1 (col1, col2);

The query on the 700,000 row table ran in well under a second on a desktop PC running 11g (you can see that the final 3 months all have the same cumulative total and the final 2 months didn't exist in tab1:

MONTH           CUM_TOTAL
-------------- ----------
January-2008    112996773
February-2008   208521325
March-2008      313186852
April-2008      416855686
May-2008        525972008
June-2008       631521453
July-2008       742076169
August-2008     854076699
September-2008  954372150
October-2008    964586272
November-2008   964586272
December-2008   964586272

12 rows selected.

Elapsed: 00:00:00.04

Depending on your version of Oracle you may need to add a /*+ NO_MERGE */ hint into one or more of the inline views as the CBO in older versions tending to be a little optimistic and end up using a completely inappropriate query plan.

To get an older 12-month window then you would need to change the date in the all_objects query and change the WHERE clause, e.g. WHERE col1 >= TO_DATE ('01:01:2007','DD:MM:YYYY') AND col1 < TO_DATE ('01:01:2008','DD:MM:YYYY')

Regards,

John
Re: help query in needed [message #352003 is a reply to message #352001] Sat, 04 October 2008 16:41 Go to previous messageGo to next message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
This will not work if there is more than 12 months of data.

Regards
Michel
Re: help query in needed [message #352006 is a reply to message #351978] Sat, 04 October 2008 18:49 Go to previous messageGo to next message
dbawiseman
Messages: 3
Registered: October 2008
Location: England
Junior Member
I'm not quite sure what you mean by your reply.

The query accurately calculates the cumulative totals for each month starting at 0, i.e. any rows outside of the 12-month window won't be considered in the cumulative total.

If you want to include all totals from earlier months then it is a simple change to the query I wrote:

SELECT   month1
,        cum_total
FROM     (SELECT   RTRIM (TO_CHAR (month1, 'Month')) || TO_CHAR (month1,'-YYYY') month1
          ,        month2
          ,        SUM (total) OVER (ORDER BY month1) cum_total
          FROM     (SELECT   month month1
                    ,        month month2
                    ,        SUM (total) total
                    FROM     (SELECT   TRUNC (col1,'MM') month
                              ,        SUM (col2) total
                              FROM     tab1
                              GROUP BY TRUNC (col1, 'MM')
                              UNION
                              SELECT ADD_MONTHS (TO_DATE ('01:01:2008','DD:MM:YYYY'), ROWNUM - 1) month
                              ,      0 total
                              FROM   all_objects
                              WHERE  ROWNUM < 13)
                    GROUP BY month))
WHERE    month2 >= TO_DATE ('01:01:2008','DD:MM:YYYY')
ORDER BY month2;

MONTH1           CUM_TOTAL
-------------- -----------
January-2008   34146766805
February-2008  34252164596
March-2008     34360931815
April-2008     34463758759
May-2008       34573006370
June-2008      34680397099
July-2008      34784792750
August-2008    34892274014
September-2008 34995352535
October-2008   35009196105
November-2008  35009196105
December-2008  35009196105

12 rows selected.

Elapsed: 00:00:01.06

The query takes longer to execute because in my example I have data going back over 27 years.

Regards,

John
Re: help query in needed [message #352011 is a reply to message #352006] Sun, 05 October 2008 01:40 Go to previous messageGo to next message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Quote:
The query accurately calculates the cumulative totals for each month starting at 0,

The question is:
Quote:
the query should sum up col2 upto each month end in a twelve month period.

I assume this means a sliding sum on 12 months not a sum up since point 0.

Regards
Michel
Re: help query in needed [message #352015 is a reply to message #351978] Sun, 05 October 2008 03:16 Go to previous messageGo to next message
dbawiseman
Messages: 3
Registered: October 2008
Location: England
Junior Member
In that case you only need change the cum_total line in the SQL so that it calculates the monthly totals only over the previous 12 months, i.e. "SUM (total) OVER (ORDER BY month1) cum_total" -> "SUM (total) OVER (ORDER BY month1 ROWS 11 PRECEDING) cum_total".

Only 11 preceding rows are chosen because the analytic function already includes the current row. Here's the latest evolution of the SQL:

SELECT   month1
,        cum_total
FROM     (SELECT   RTRIM (TO_CHAR (month1, 'Month')) || TO_CHAR (month1,'-YYYY') month1
          ,        month2
          ,        SUM (total) OVER (ORDER BY month1 ROWS 11 PRECEDING) cum_total
          FROM     (SELECT   month month1
                    ,        month month2
                    ,        SUM (total) total
                    FROM     (SELECT   TRUNC (col1,'MM') month
                              ,        SUM (col2) total
                              FROM     tab1
                              GROUP BY TRUNC (col1, 'MM')
                              UNION
                              SELECT ADD_MONTHS (TO_DATE ('01:01:2008','DD:MM:YYYY'), ROWNUM - 1) month
                              ,      0 total
                              FROM   all_objects
                              WHERE  ROWNUM < 13)
                    GROUP BY month))
WHERE    month2 >= TO_DATE ('01:01:2008','DD:MM:YYYY')
ORDER BY month2;

Depending on how you interpret the original requirement one of the three queries should give you the answer you want.

Regards,

John
Re: help query in needed [message #352017 is a reply to message #352015] Sun, 05 October 2008 03:43 Go to previous messageGo to next message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Good exercise, now it would be better if OP did it.
This site chooses to lead OPs to the solution and not to give them. You know the proverb "give a man a fish...".

By the way, there are now many neater way to generate calendar than using all_objects.
Have a look at Oracle Row Generator Techniques wiki page.

Regards
Michel
Re: help query in needed [message #352059 is a reply to message #351978] Sun, 05 October 2008 12:36 Go to previous messageGo to next message
Kevin Meade
Messages: 2101
Registered: December 1999
Location: Connecticut USA
Senior Member
Hmm... So there are several opportunities the OP should be reading about.

1) SEQUENCE NUMBER GENERATORS
2) ORACLE ANALYTICS
3) how to do a DATA FILL
4) WITH CLAUSE

In addition to the links already suggested, please have a look at these:

Kevin Meade's OraFaq blog

The SQL WITH Clause, (Order Your Thoughts, Reuse Your Code)

A Simple Example of Oracle Analytics: Running Totals

If you find these links wanting, then please also do a search of ORAFAQ as there are others who have provided excellent learning materials on these subjects.

Good luck, Kevin
Re: help query in needed [message #352080 is a reply to message #352059] Mon, 06 October 2008 00:27 Go to previous message
sajut
Messages: 69
Registered: January 2007
Member
Thanks a lot to all members for their expert guidance and efforts. The answers gave me lot of insight into analytics, date generators etc. I could solve the problem using the last query submitted by dbawiseman

Thank once again

Saju
Previous Topic: Can you tell me the error reason in the script
Next Topic: date recorded as 'dd-mon-0008' instead of 'dd-mon-2008'
Goto Forum:
  


Current Time: Thu Dec 08 22:21:36 CST 2016

Total time taken to generate the page: 0.08543 seconds