Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Usenet -> c.d.o.server -> Re: Is ROLLUP really better than doing a nested query when rank is involved?
<csimam_at_gmail.com> wrote in message
news:1188875136.958875.206220_at_y42g2000hsy.googlegroups.com...
> Here is a thread I posted elsewhere that I wonder if someone here
> could shed some light on...
>
> The original question I had:
>
> --- BEGIN ---
> Which one is better?
> 1. To perform aggregate queries on data, and use that data to home in
> and perform more detailed analysis? This gives result sets that are
> potentially filled with columns that contain the aggregate data
> repeated across many rows.
> I meant here by homing in: within the same query - using
> nested SELECTS. For example, the inner query performs the aggregate
> analysis, and returns whatever data we are interested in, and then
> the
> outer query performs another SELECT on the same data, resulting in
> the
> aggregate data being distributed across 'detailed results' from the
> outer query.
>
> 2. To perform the aggregate query once, and then hit the database
> again for a detailed analysis, and return the detailed analysis
> results in an order that is easily cross-correlated with the original
> aggregate data? This means running the aggregate query essentially
> twice, just the second time it is hidden as a subquery.
> --- END ---
>
> I realize the question was vague, but a super DBA understood what I
> was asking and suggested I study up on ROLLUP and CUBE (I am not a DBA
> by the way). So I did... and here is what I found:
>
> --- BEGIN ---
> The data I have, I want to generate totals and subtotals, say of
> sales: total sales per store, and subtotal of sales per store per day
> for example.
>
> The problem is that I ultimately only want the top 25 stores in total
> sales BUT with their subtotals as well... So I used ROLLUP to
> generate
> the totals and subtotals... and in the same query I generated a
> ranking of the data... and there I found my problem: the ranking does
> not distinguish between totals and subtotals. So the results look
> like:
>
> store1, realTotal, 1
> store1, subTotal, 2
> store2, realTotal, 3
> store7, realTotal, 4
> store10, realTotal, 5
> store10, subTotal, 6
>
> Clearly, the rows with subtotals should NOT be ranked... but I am
> struggling with SQL to figure out how to get exactly what I want.
>
> Well... I have a solution. But I'm not sure about it. It involves
> something I learned about only recently: a user-defined aggregate
> function that concatenates values from across rows. Using this
> aggregate function, stragg, I can do another pass through the results
> mentioned above while grouping by store, and produce:
>
> store1, 'realTotal subTotal', 1
> store2, 'realTotal subTotal', 3
> store7, 'realTotal subTotal', 4
> store10, 'realTotal subTotal', 5
>
> And another trick: use min( rank ) in the same aggregation, and order
> by it as well.
>
> Together that gives one a very compact result set that has everything
> ranked and grouped perfectly.
>
> ...but... a worry: is this REALLY better than doing a nested query? I
> wonder if the use of the user-defined aggregate function is very
> expensive...
>
> For the benefit of anyone wondering, here's the structure that
> works... in one pass in theory... but again, I am now wondering if it
> really is faster than the nested query equivalent that hits the
> database twice on the table 'stores' (seems to me that the middle
> select below is effectively hitting the table again, just in
> temporary
> form):
>
> select rownum as rank, id, days, totals
> from
> (
> select id, stragg(day) as days, stragg(total) as totals,
> min( rank ) as innerrank
> from
> (
> select id, day, sum(sales) as total, rank() over (order by
> sum(sales) desc) rank
> from stores
> group by id, rollup(day)
> )
> group by id
> order by min(rank)
> )
> where rownum <= 25
>
> --- END ---
>
> Any comments? Is this query really better than the following:
>
> -- get totals by day
> select rank, id, day, total, sum(sales) as dailytotal
> from
> (
> select * from
> (
> -- get grand totals regardless of day
> select id as storeid, sum( sales ) as total,
> rank() over (order by sum(sales) desc) rank
> from stores
> group by id
> )
> where rank <= 25
> ), stores
> -- NOTE join with stores again
> where id = storeid
> -- but does the where condition above clip the query and make it
> efficient?
> group by rank, id, day, total
> order by rank, day
>
This *does not* directly answer your question, but with the following data:
SQL> select * from store;
STORE_ID TRANS_DATE TRANS_AMT
---------- ----------- ------------
1 2007/09/01 10.00 1 2007/09/02 20.00 1 2007/09/03 30.00 1 2007/09/04 40.00 2 2007/09/01 50.00 2 2007/09/02 60.00 2 2007/09/03 70.00 2 2007/09/04 80.00 3 2007/09/01 90.00 3 2007/09/02 100.00 3 2007/09/03 110.00 3 2007/09/04 120.00
12 rows selected
the grouping() operator allows various levels of totals to be distinguished from any other row in the result set.
For example:
SQL> select grouping(store_id), grouping(s.trans_date), store_id,
s.trans_date, sum(trans_amt)
2 from store s
3 group by rollup (store_id, s.trans_date);
GROUPING(STORE_ID) GROUPING(S.TRANS_DATE) STORE_ID TRANS_DATE SUM(TRANS_AMT)
------------------ ---------------------- ---------- ----------- -------------- 0 0 1 2007/09/01 10 0 0 1 2007/09/02 20 0 0 1 2007/09/03 30 0 0 1 2007/09/04 40 0 1 1 100 0 0 2 2007/09/01 50 0 0 2 2007/09/02 60 0 0 2 2007/09/03 70 0 0 2 2007/09/04 80 0 1 2 260 0 0 3 2007/09/01 90 0 0 3 2007/09/02 100 0 0 3 2007/09/03 110 0 0 3 2007/09/04 120 0 1 3 420 1 1780
16 rows selected
Will distinguish transation_date subtotals from store_id subtotals from grand totals.
This may provide the flexibility being sought. By using the above query in a subquery factoring clause, it is possible to query either the subtotal information or the detail information, or both in a single query.
hth Received on Tue Sep 04 2007 - 10:33:21 CDT
![]() |
![]() |