Re: Expense of 'over ... partition by'

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Fri, 05 Dec 2003 01:14:41 -0800
Message-ID: <F001.005D8D2A.20031205011441@fatcity.com>

Jared,

I think what you've discovered is just a repeat of the fact that different functionality is appropriate in different circumstances.

Imagine replacing your v$sql_workarea_histogram with a chunky SQL statement that crunched through a massive table producing a small result set.

In those circumstances, your analytic approach would sort a small set twice having done one big crunch. With the group by approach, you would have to crunch the big data set twice. I know which option would be cheaper. (You then have to wonder whether you could produce the small result set using subquery factoring 'with subquery' as another possible optimisation strategy).

BTW - did you notice how Oracle didn't do a sort for the order by in the GROUP BY example, because the optimizer could infer that the data had already been ordered by the GROUP BY ? That's the reason why your GROUP BY example did less sorting. (I'm not sure you need the GROUP BY, though I may be missing something).

BTW-2: in the analytic clause, the (partition by 1) is not necessary, you can write:
> , sum(optimal_executions) over ( )

Regards

Jonathan Lewis
http://www.jlcomp.demon.co.uk

  The educated person is not the person
  who can answer the questions, but the
  person who can question the answers -- T. Schick Jr

One-day tutorials:
http://www.jlcomp.demon.co.uk/tutorial.html

Three-day seminar:
see http://www.jlcomp.demon.co.uk/seminar.html ____UK___November

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

Original Message ----- To: "Multiple recipients of list ORACLE-L" <ORACLE-L_at_fatcity.com> Sent: Friday, December 05, 2003 12:24 AM

> While working on some scripts to monitor PGA usage on 9i, I came across
> something interesting while experimenting with different forms of SQL.
>
> I have recently been forcing myself to make use of 'OVER..PARTITION BY' in
> SQL so as to be more comfortable in using it. Can't add new tools to the
> box until I
> know how to use them. :) Yes, I know I should have been using them long
> ago.
>
> Anyway, I thought it might be interesting to compare the forms of SQL with
> and
> without the use of OVER...PARTITION BY.
>
> This SQL can be run on any instance that has PGA_AGGREGATE_TARGET set.
>
> Here is the SQL using OVER:
>
> select
> low_optimal_size_kb
> , high_optimal_size_kb
> , optimal_executions
> , onepass_executions
> , multipasses_executions
> , total_executions
> , optimal_executions / sum_optimal_executions * 100
> pct_optimal_executions
> from (
> select
> low_optimal_size/1024 low_optimal_size_kb
> , (high_optimal_size+1)/1024 high_optimal_size_kb
> , optimal_executions
> , onepass_executions
> , multipasses_executions
> , total_executions
> , sum(optimal_executions) over ( partition by 1 )
> sum_optimal_executions
> from v$sql_workarea_histogram
> where total_executions != 0
> ) a
> order by low_optimal_size_kb
> /
>
> and here is the SQL using good old GROUP BY
>
> select
> low_optimal_size_kb
> , high_optimal_size_kb
> , optimal_executions
> , onepass_executions
> , multipasses_executions
> , total_executions
> , optimal_executions / sum_optimal_executions * 100
> pct_optimal_executions
> from (
> select
> h.low_optimal_size/1024 low_optimal_size_kb
> , (h.high_optimal_size+1)/1024 high_optimal_size_kb
> , h.optimal_executions
> , h.onepass_executions
> , h.multipasses_executions
> , h.total_executions
> , hs.sum_optimal_executions
> from v$sql_workarea_histogram h,
> (
> select sum(optimal_executions)
> sum_optimal_executions
> from v$sql_workarea_histogram
> ) hs
> where h.total_executions != 0
> group by h.low_optimal_size/1024
> ,(h.high_optimal_size+1)/1024
> , h.optimal_executions
> , h.onepass_executions
> , h.multipasses_executions
> , h.total_executions
> , hs.sum_optimal_executions
> ) a
> order by low_optimal_size_kb
> /
>
>
> The new version is significantly simpler.
>
> It then seemed that it might be interesting to compare the performance and
> scalability of the two methods.
>
> This is where it gets interesting.
>
>
> 16:10:47 rsysdevdb.radisys.com - jkstill_at_dv03 SQL> @run_stats
>
> NAME RUN1 RUN2 DIFF
> ---------------------------------------- ---------- ---------- ----------
> LATCH.lgwr LWN SCN 1 0 -1
> LATCH.mostly latch-free SCN 1 0 -1
> LATCH.undo global data 1 0 -1
> STAT...active txn count during cleanout 1 0 -1
> STAT...consistent gets 5 4 -1
> STAT...db block gets 28 29 1
> STAT...enqueue requests 1 0 -1
> STAT...redo entries 17 18 1
> STAT...deferred (CURRENT) block cleanout 4 3 -1
> applications
>
> STAT...consistent gets - examination 1 0 -1
> STAT...cleanout - number of ktugct calls 1 0 -1
> STAT...calls to kcmgcs 7 6 -1
> STAT...calls to get snapshot scn: kcmgss 1006 1005 -1
> LATCH.Consistent RBA 2 0 -2
> STAT...recursive cpu usage 29 31 2
> LATCH.redo allocation 20 18 -2
> LATCH.cache buffers chains 102 105 3
> LATCH.redo writing 4 0 -4
> LATCH.library cache 2014 2008 -6
> LATCH.library cache pin 2012 2006 -6
> LATCH.messages 8 0 -8
> STAT...redo size 27096 27508 412
> STAT...sorts (memory) 1004 2004 1000
> LATCH.SQL memory manager workarea list l 0 2000 2000
> atch
>
> STAT...workarea executions - optimal 2008 4008 2000
> STAT...sorts (rows) 6112 10112 4000
>
> 26 rows selected.
>
> RUN1 is the the GROUP BY SQL
> RUN2 is the OVER...PARTITION BY SQL
>
> The OVER version of the SQL is significantly more expensive in terms of
> sorting and latching.
>
> Has anyone else noticed this?
>
> Or perhaps my use of OVER..PARTITION BY needs some optimization, which is
> clearly
> in the realm of possibility. :)
>
> Jared
>
>
>

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Jonathan Lewis
  INET: jonathan_at_jlcomp.demon.co.uk

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

Received on Fri Dec 05 2003 - 03:14:41 CST