Home » SQL & PL/SQL » SQL & PL/SQL » A question about SUM analytic function used with GROUP BY clause
icon5.gif  A question about SUM analytic function used with GROUP BY clause [message #397201] Thu, 09 April 2009 06:54 Go to next message
dariyoosh
Messages: 536
Registered: March 2009
Location: Iran / France
Senior Member
Dear all,

I have created the following table named myenterprise

CITY       STOREID    MONTH_NAME TOTAL_SALES            
---------- ---------- ---------- ---------------------- 
paris      id1        January    1000                   
paris      id1        March      7000                   
paris      id1        April      2000                   
paris      id2        November   2000                   
paris      id3        January    5000                   
london     id4        Janaury    3000                   
london     id4        August     6000                   
london     id5        September  500                    
london     id5        November   1000


If I want to find What is the total sales per city? I will run the following query
SELECT city, SUM(total_sales) AS TOTAL_SALES_PER_CITY
FROM myenterprise
GROUP BY city
ORDER BY city, TOTAL_SALES_PER_CITY;

which works pretty well and produces the desired result, that is,
CITY       TOTAL_SALES_PER_CITY   
---------- ---------------------- 
london     10500                  
paris      17000            


Now in one of my SQL books (Mastering Oracle SQL) I have found another method by using SUM but this time as an analytic function. Here is what the book method suggests as an alternative solution for the problem:
SELECT city, 
       SUM(SUM(total_sales)) OVER (PARTITION BY city) AS TOTAL_SALES_PER_CITY
FROM myenterprise
GROUP BY city
ORDER BY city, TOTAL_SALES_PER_CITY;


I know that analytic functions are run after the GROUP BY clause has been processed completely and unlike ordinary aggregate functions, they return their result for each row belonging to the partitions specified in the partition clause (if there is a partition clause defined).

Now my problem is that I don't understand what we must use two SUM functions? if we use only one, that is,
SELECT city, 
       SUM(total_sales) OVER (PARTITION BY city) AS TOTAL_SALES_PER_CITY
FROM myenterprise
GROUP BY city
ORDER BY city, TOTAL_SALES_PER_CITY;

This will produce the following error:
Error starting at line 2 in command:
SELECT city, 
       SUM(total_sales) OVER (PARTITION BY city) AS TOTAL_SALES_PER_CITY
FROM myenterprise
GROUP BY city
ORDER BY city, TOTAL_SALES_PER_CITY
Error at Command Line:2 Column:11
Error report:
SQL Error: ORA-00979: not a GROUP BY expression
00979. 00000 -  "not a GROUP BY expression"
*Cause:    
*Action:

The error is generated for the line 2 column 11 that is, for the expression SUM(total_sales), well it's true that total_sales doesn't appear in the GROUP BY clause, but this should not cause a problem, as it has been used in an analytic function, so it is evaluated after the GROUP BY clause.

So here is my question:

Why using SUM(SUM(total_sales)) instead of SUM(total_sales)?


Thanks in advance!
Smile





In the case where you're interested, this is my table definition:
DROP TABLE myenterprise;
CREATE TABLE myenterprise(
city VARCHAR2(10), 
storeid VARCHAR2(10),
month_name VARCHAR2(10),
total_sales NUMBER);

INSERT INTO myenterprise(city, storeid, month_name, total_sales)
  VALUES ('paris', 'id1', 'January', 1000);
INSERT INTO myenterprise(city, storeid, month_name, total_sales)
  VALUES ('paris', 'id1', 'March', 7000);
INSERT INTO myenterprise(city, storeid, month_name, total_sales)
  VALUES ('paris', 'id1', 'April', 2000);
INSERT INTO myenterprise(city, storeid, month_name, total_sales)
  VALUES ('paris', 'id2', 'November', 2000);
INSERT INTO myenterprise(city, storeid, month_name, total_sales)
  VALUES ('paris', 'id3', 'January', 5000);
INSERT INTO myenterprise(city, storeid, month_name, total_sales)
  VALUES ('london', 'id4', 'Janaury', 3000);
INSERT INTO myenterprise(city, storeid, month_name, total_sales)
  VALUES ('london', 'id4', 'August', 6000);
INSERT INTO myenterprise(city, storeid, month_name, total_sales)
  VALUES ('london', 'id5', 'September', 500);
INSERT INTO myenterprise(city, storeid, month_name, total_sales)
  VALUES ('london', 'id5', 'November', 1000);
Re: A question about SUM analytic function used with GROUP BY clause [message #397205 is a reply to message #397201] Thu, 09 April 2009 07:05 Go to previous messageGo to next message
Frank
Messages: 7880
Registered: March 2000
Senior Member
What happened when you executed the query with the double sum?
Your own query with the analytic function errorred out because analytical functions don't use group by.
Remove the group by to see what the analytical function does by itself.
Re: A question about SUM analytic function used with GROUP BY clause [message #397210 is a reply to message #397205] Thu, 09 April 2009 07:48 Go to previous messageGo to next message
dariyoosh
Messages: 536
Registered: March 2009
Location: Iran / France
Senior Member
Frank wrote on Thu, 09 April 2009 14:05
What happened when you executed the query with the double sum?
Your own query with the analytic function errorred out because analytical functions don't use group by.
Remove the group by to see what the analytical function does by itself.



Hello there,

First of all thank you very much for this nice remark: "analytic functions don't use group by" I didn't know that (I'm completely newbie in analytic functions)

According to the table values that I provided in my first post, the following works pretty well for me:

SELECT city, 
       SUM(SUM(total_sales)) OVER (PARTITION BY city) AS TOTAL_SALES_PER_CITY
FROM myenterprise
GROUP BY city
ORDER BY city, TOTAL_SALES_PER_CITY;


and produces the desired result, that is,
CITY       TOTAL_SALES_PER_CITY   
---------- ---------------------- 
london     10500                  
paris      17000            



What generates the error is this:
SELECT city, 
          SUM(total_sales)) OVER (PARTITION BY city) AS TOTAL_SALES_PER_CITY
FROM myenterprise
GROUP BY city
ORDER BY city, TOTAL_SALES_PER_CITY;


And according to what you mentioned, I think we can say the reason is that SUM(total_sales)) OVER (PARTITION BY city)(using only one SUM instead of two) is an analytic function and therefore cannot be used in a query with GROUP BY clause.
Am I right?

Regards,
Re: A question about SUM analytic function used with GROUP BY clause [message #397214 is a reply to message #397210] Thu, 09 April 2009 08:02 Go to previous messageGo to next message
cookiemonster
Messages: 12409
Registered: September 2008
Location: Rainy Manchester
Senior Member
dariyoosh wrote on Thu, 09 April 2009 13:48

According to the table values that I provided in my first post, the following works pretty well for me:

SELECT city, 
       SUM(SUM(total_sales)) OVER (PARTITION BY city) AS TOTAL_SALES_PER_CITY
FROM myenterprise
GROUP BY city
ORDER BY city, TOTAL_SALES_PER_CITY;


and produces the desired result, that is,
CITY       TOTAL_SALES_PER_CITY   
---------- ---------------------- 
london     10500                  
paris      17000            




Really?

Doesn't for me:
Connected to:
Oracle Database 10g Release 10.2.0.2.0 - 64bit Production

SQL> SELECT city,
  2  sum(SUM(total_sales) OVER (PARTITION BY city)) AS TOTAL_SALES_PER_CITY
  3  FROM myenterprise
  4  GROUP BY city
  5  ORDER BY city, TOTAL_SALES_PER_CITY;
sum(SUM(total_sales) OVER (PARTITION BY city)) AS TOTAL_SALES_PER_CITY
    *
ERROR at line 2:
ORA-30483: window  functions are not allowed here

SQL> 


Aggregates group data from multiple rows together - the group by tells them how to group the records.
Analytics don't group records together.
They are not interchangable.

The analytic version goes like this:
SQL> SELECT city,
  2         SUM(total_sales) OVER (PARTITION BY city) AS TOTAL_SALES_PER_CITY
  3  FROM myenterprise
  4  ORDER BY city, TOTAL_SALES_PER_CITY;

CITY       TOTAL_SALES_PER_CITY
---------- --------------------
london                    10500
london                    10500
london                    10500
london                    10500
paris                     17000
paris                     17000
paris                     17000
paris                     17000
paris                     17000

9 rows selected.

SQL> 


1 row for each row in the table.
Re: A question about SUM analytic function used with GROUP BY clause [message #397218 is a reply to message #397214] Thu, 09 April 2009 08:22 Go to previous messageGo to next message
Frank
Messages: 7880
Registered: March 2000
Senior Member
cookiemonster wrote on Thu, 09 April 2009 15:02


Really?

Doesn't for me:


I thought it would be a weird result too. Expected either an error or a sum of the sums, but never a sum of the values.
Re: A question about SUM analytic function used with GROUP BY clause [message #397221 is a reply to message #397214] Thu, 09 April 2009 08:45 Go to previous messageGo to next message
dariyoosh
Messages: 536
Registered: March 2009
Location: Iran / France
Senior Member
cookiemonster wrote on Thu, 09 April 2009 15:02

Really?

Doesn't for me:
Connected to:
Oracle Database 10g Release 10.2.0.2.0 - 64bit Production

SQL> SELECT city,
  2  sum(SUM(total_sales) OVER (PARTITION BY city)) AS TOTAL_SALES_PER_CITY
  3  FROM myenterprise
  4  GROUP BY city
  5  ORDER BY city, TOTAL_SALES_PER_CITY;
sum(SUM(total_sales) OVER (PARTITION BY city)) AS TOTAL_SALES_PER_CITY
    *
ERROR at line 2:
ORA-30483: window  functions are not allowed here

SQL> 

...



It doesn't work for you because you wrote:
sum(SUM(total_sales) OVER (PARTITION BY city)) AS TOTAL_SALES_PER_CITY

whereas you have to write:
sum(SUM(total_sales)) OVER (PARTITION BY city) AS TOTAL_SALES_PER_CITY
/*
you made a mistake and you put the parenthesis around the whole analytic function.
*/


About your remark that analytic functions cannot be in queries including GROUP BY clause, here is an extract of a page on the Oracle documentation website presenting the analytic functions:

http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14200/functions001.htm
Quote:

Analytic functions are the last set of operations performed in a query except for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions are processed. Therefore, analytic functions can appear only in the select list or ORDER BY clause.



The fact that the document states that all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions doesn't implicitly mean that the analytic functions can actually be in the queries having GROUP BY cluase? otherwise what's the point of mentioning that they're processed after GROUP BY, WHERE and HAVING clauses? If there can never be in a same query why we need to know this order of process?

[Updated on: Thu, 09 April 2009 08:46]

Report message to a moderator

Re: A question about SUM analytic function used with GROUP BY clause [message #397224 is a reply to message #397221] Thu, 09 April 2009 09:10 Go to previous messageGo to next message
Frank
Messages: 7880
Registered: March 2000
Senior Member
If you indeed do a sum(sum(x)) over partition by x) group by x
then you just added a layer of nothing: you calculate totals of single records.
Re: A question about SUM analytic function used with GROUP BY clause [message #397226 is a reply to message #397201] Thu, 09 April 2009 09:16 Go to previous messageGo to next message
cookiemonster
Messages: 12409
Registered: September 2008
Location: Rainy Manchester
Senior Member
And I thought I'd copied and pasted...

However this:

SQL> SELECT city, 
  2         SUM(SUM(total_sales)) OVER (PARTITION BY city) AS TOTAL_SALES_PER_CITY
  3  FROM myenterprise
  4  GROUP BY city
  5  ORDER BY city, TOTAL_SALES_PER_CITY;

CITY       TOTAL_SALES_PER_CITY
---------- --------------------
london                    10500
paris                     17000

Gives exactly the same result as this:
SQL> SELECT city, 
  2         SUM(total_sales)AS TOTAL_SALES_PER_CITY
  3  FROM myenterprise
  4  GROUP BY city
  5  ORDER BY city, TOTAL_SALES_PER_CITY;

CITY       TOTAL_SALES_PER_CITY
---------- --------------------
london                    10500
paris                     17000


The only difference is that your version (the first one) does an extra unnecessary step to get the result.

And you're right you can do analytics on grouped results it's just that in your example it's meaningless to do so.

Your query isn't really much use for demonstrating analytics.

Here's a better example:

SQL> SELECT city, storeid, month_name, total_sales,
  2  SUM(total_sales) OVER (PARTITION BY city, storeid) AS TOTAL_SALES_PER_STORE,
  3  SUM(total_sales) OVER (PARTITION BY city) AS TOTAL_SALES_PER_CITY
  4  FROM myenterprise;
  
CITY       STOREID    MONTH_NAME TOTAL_SALES TOTAL_SALES_PER_STORE TOTAL_SALES_PER_CITY
---------- ---------- ---------- ----------- --------------------- --------------------
london     id4        Janaury           3000                  9000                10500
london     id4        August            6000                  9000                10500
london     id5        November          1000                  1500                10500
london     id5        September          500                  1500                10500
paris      id1        January           1000                 10000                17000
paris      id1        March             7000                 10000                17000
paris      id1        April             2000                 10000                17000
paris      id2        November          2000                  2000                17000
paris      id3        January           5000                  5000                17000



Using analytics we can see data summed over multiple rows against each individual row that contributes to it.

But if all you want is the sum per city you'd use the aggregate.

And this:

Quote:

Aggregates group data from multiple rows together - the group by tells them how to group the records.
Analytics don't group records together.
They are not interchangable.



is still true.
Re: A question about SUM analytic function used with GROUP BY clause [message #397236 is a reply to message #397226] Thu, 09 April 2009 09:41 Go to previous messageGo to next message
dariyoosh
Messages: 536
Registered: March 2009
Location: Iran / France
Senior Member
cookiemonster wrote on Thu, 09 April 2009 16:16
And
...
And you're right you can do analytics on grouped results it's just that in your example it's meaningless to do so.
...




Hello there,

Thank you very much for your description, and yes you're right I didn't chose a pertinent example

cookiemonster wrote on Thu, 09 April 2009 16:16
And
...
And you're right you can do analytics on grouped results it's just that in your example it's meaningless to do so.
...



However, according to the fact that
SUM(SUM(total_sales)) OVER (PARTITION BY city)

provides the same result as the other query, is it correct to conclude that whenever an analytic function is used in a query including a GROUP BY clause, then it's aggregate function must be applied twice? For example in this example
...
SUM(total_sales OVER (PARTITION BY city)
...
GROUP BY city

This didn't work

but when we wrote:
...
SUM(SUM(total_sales)) OVER (PARTITION BY city)
...
GROUP BY city

The query was accepted, so does this mean that the aggregate part in the analytic function must be applied twice whenever there is a GROUP BY clause in the query?
Re: A question about SUM analytic function used with GROUP BY clause [message #397239 is a reply to message #397236] Thu, 09 April 2009 09:55 Go to previous messageGo to next message
cookiemonster
Messages: 12409
Registered: September 2008
Location: Rainy Manchester
Senior Member
dariyoosh wrote on Thu, 09 April 2009 15:41
but when we wrote:
...
SUM(SUM(total_sales)) OVER (PARTITION BY city)
...
GROUP BY city

The query was accepted, so does this mean that the aggregate part in the analytic function must be applied twice whenever there is a GROUP BY clause in the query?


No it's done once at the end.
Might help if you think of it like this:
SELECT aggregated_city,
       SUM(AGGREGATED_SALES_PER_CITY) OVER (PARTITION BY aggregated_city) AS POINTLESS_ANALYTIC
FROM (SELECT city AS aggregated_city, 
             SUM(total_sales) AS AGGREGATED_SALES_PER_CITY
      FROM myenterprise
      GROUP BY city)
ORDER BY aggregated_city, POINTLESS_ANALYTIC;


There are probably cases where you'd want to use aggregates and analytics in the same query but they'd be far more complicated examples than what we're looking at and I can't think of any.

To go right back to your original question:
Quote:

Why using SUM(SUM(total_sales)) instead of SUM(total_sales)?



You just wouldn't.

icon10.gif  Re: A question about SUM analytic function used with GROUP BY clause [message #397245 is a reply to message #397239] Thu, 09 April 2009 10:12 Go to previous messageGo to next message
dariyoosh
Messages: 536
Registered: March 2009
Location: Iran / France
Senior Member
cookiemonster wrote on Thu, 09 April 2009 16:55
dariyoosh wrote on Thu, 09 April 2009 15:41
but when we wrote:
...
SUM(SUM(total_sales)) OVER (PARTITION BY city)
...
GROUP BY city

The query was accepted, so does this mean that the aggregate part in the analytic function must be applied twice whenever there is a GROUP BY clause in the query?


No it's done once at the end.
Might help if you think of it like this:
SELECT aggregated_city,
       SUM(AGGREGATED_SALES_PER_CITY) OVER (PARTITION BY aggregated_city) AS POINTLESS_ANALYTIC
FROM (SELECT city AS aggregated_city, 
             SUM(total_sales) AS AGGREGATED_SALES_PER_CITY
      FROM myenterprise
      GROUP BY city)
ORDER BY aggregated_city, POINTLESS_ANALYTIC;


There are probably cases where you'd want to use aggregates and analytics in the same query but they'd be far more complicated examples than what we're looking at and I can't think of any.

To go right back to your original question:
Quote:

Why using SUM(SUM(total_sales)) instead of SUM(total_sales)?



You just wouldn't.





Thank you very much for your answer and the time that you spent, it's now clear particularly after POINTLESS_ANALYTIC query that you wrote. Yes in the first subquery the GROUP BY does the job, therefore the PARTITION BY in the second SUM does something really unnecessary.

Thank you very much!

Regards,
icon10.gif  Re: A question about SUM analytic function used with GROUP BY clause [message #397246 is a reply to message #397224] Thu, 09 April 2009 10:12 Go to previous message
dariyoosh
Messages: 536
Registered: March 2009
Location: Iran / France
Senior Member
Frank wrote on Thu, 09 April 2009 16:10
If you indeed do a sum(sum(x)) over partition by x) group by x
then you just added a layer of nothing: you calculate totals of single records.



Thanks a lot for your help

Regards,
Previous Topic: account lock privileges
Next Topic: Using WHEN COUNT function in conjunction with Partition By
Goto Forum:
  


Current Time: Tue Dec 06 14:30:05 CST 2016

Total time taken to generate the page: 0.29706 seconds