Re: Tom Kyte's book obsolescence
Date: Sat, 2 Feb 2008 15:10:14 -0000
"Mladen Gogala" <mgogala_at_yahoo.com> wrote in message
> This is a longer post and I beg your forgiveness for that.
> Recently, I've had a complaint that analytic function is not executing
> very fast. The Oracle versions involved were 10.2.0.3 and 188.8.131.52. I
> wrote the same query as a self-join and, to my infinite surprise, the
> query was running faster then with the analytic functions. I remembered a
> book that says it shouldn't be that way.
It's your memory that is mostly at fault. The book says: "DEPENDING ON various factors, such as the size of the table and the number of rows you expect to get back, this MAY BE more efficient than making two passes to achieve the same goal".
> The book
> was "Effective Oracle
> by Design", by Tom Kyte. I created the structures, ran the queries and
> book was wrong on versions 10g and 11g. It was correct on the version 9i.
I just ran the test on 10g - to the extent of checking the execution plan
and noting that 10g does a HASH GROUP BY where 9i does a SORT GROUP BY to handle the unnested subquery. That's probably the reason - with this particular data set - why the subquery approach now loses the race. And bear in mind that the UNNEST is a cost-based decision which may not be applicable in arbitrary cases, and a hash group by
may not be possible when the UNNEST occurs.
> In other words, in the Tom Kyte's book "Effective Oracle By Design", the
> 2nd query on the page 517 is executing faster then the first one, the one
> with the analytic functions. Plans look very similar to what is in the
> book but the query that Tom claims should execute faster now executes
> This discovery is loaded with implications:
> 1) Oracle does I/O in versions 10g and 11g in the different way then it
> used to in versions 9i and below. I am not sure what exactly is the
> difference, but there is a difference.
If new versions did everything the same way as the old versions, they wouldn't be new versions, would they ;) But it's probably not the I/O (although there are some critical changes to tablescans) it's probably the execution plan. For idle interest, you might repeat the test with the parameter _gby_hash_aggregation_enabled set to false to see if the difference arises from the hash group by rather than anything else. I note that Tom's example showed only a couple of seconds elapsed time difference over 27 seconds - so the new 10g sort algorithm could be a relevant factor in the comparison. Comparsons of CPU and elapsed could also be appropriate.
> 2) Analytic functions are now just a neat trick, no longer a performance
> improvement. The cost of the window buffer sort is relatively high
> compared to the cost of simple join.
I think there's something in that chapter about the fact that the thing to watch out for with analytics is the amount of sorting. In passing, 10.2 can convert a query with nested aggregate subquery into a join with an analytic function, a feature enabled (if I recall correctly) by the default setting of parameter _remove_aggr_subquery. So analytic functions are, sometimes, a performance enhancer as far as the optimizer group are concerned.
> 3) Some information in "Effective Oracle by Design" is obsolete. How
(a) Absolutely true, in detail (b) hard to say. On the other hand, when you can run the test cases and see what's changed (e.g. the new execution plan looks different from the one produced in Tom's chapter) you waste less time trying to solve problems, and discover that the book has usually been telling you the general principle that is still correct, with a specific demonstration that has become obsolete.
-- Regards Jonathan Lewis http://jonathanlewis.wordpress.com Author: Cost Based Oracle: Fundamentals http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html The Co-operative Oracle Users' FAQ http://www.jlcomp.demon.co.uk/faq/ind_faq.htmlReceived on Sat Feb 02 2008 - 09:10:14 CST