Re: Estimating the effectiveness of performance improvement measures

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Sat, 29 Nov 2008 11:16:31 -0000
Message-ID: <9N2dneDM6KkQuazUnZ2dneKdnZydnZ2d@bt.com>


"Robert Klemme" <shortcutter_at_googlemail.com> wrote in message news:795549d2-4377-4597-9d84-fa24a05a0317_at_j11g2000yqg.googlegroups.com...
> Today I have another question regarding estimating the effectiveness
> of changes.
>
> Situation: we have a query that accesses four tables, resulting in
> 18 buffer gets per execution and 3 physical reads. The plan looks
> pretty good given the schema and query. This query is responsible for
> more than 30% of IO of this instance (10.2.0.2.0). (I'll leave out
> details for now because my main point is independent of the specific
> SQL statement and schema and the posting is going to be lengthy
> anyway.)
>
> Obviously I would like to try to optimize this if possible. I took a
> copy of this production database and tested different approaches
> (adding an index which will make one table superfluous, partitioning,
> creating a smaller copy of the table as IOT which contains only data
> relevant for this query). Plans that came out of this looked pretty
> good as well - and quite similar to the original. The most significant
> improvement I could get was to reduce the number of consistent gets to
> 15 (down from 18).
>
> I figure this will _only_ improve the situation in the production
> database if the set of blocks that this query potentially accesses is
> reduced significantly (leading to less physical IO on average because
> of better cache utilization). Now the tricky part: what would you do
> to estimate the effects of the different approaches listed above on
> the production database?
>
> My idea so far is this: on the test database rebuild all indexes and
> maybe reorganize all tables to get everything into a defined shape.
> Then, for each solution implemented, look at the total number of
> blocks used by all the objects involved, and pick the one with the
> smallest total.
>
> The drawback of this is of course that production activity will change
> the block count of different objects in different ways (e.g. block
> usage patterns of tables differ significantly from those of BTree
> indexes) and so the block counts after reorganization will only give a
> weak indication of the real numbers after a while of activity.
>
> This is even more the case because some of these tables contain
> historic data, i.e. multiple values with different validity time
> ranges, which makes a certain amount of the data dead - from _this_
> query's point of view. So the way this "dead wood" is distributed
> across objects may have a significant impact (for example, using
> timestamps as leading columns of an index vs. trailing columns, or
> having them in the table only).
>
> None of this is made any easier by the effects of _other_ DML running
> in parallel and competing for buffer cache space. For a realistic
> test, 11g's Real Application Testing would come in handy: you can
> capture a particular load and replay it against the test database.
> Unfortunately I cannot simulate the load right now: 11g is not
> installed and is not an option; the load is highly dependent on the
> client application and on traffic, both of which are outside my
> control; and building a load-capture and replay framework would take
> a significant amount of time.
>
> Now, what do you think? What other options do I have? Did I overlook
> something? Thanks in advance!
>
> Kind regards
>
> robert

Robert,

I think you need to read Chapter 1 of volume 2 of "Cost Based Oracle" - but I'm still writing it, so you will have to wait a bit.

You're obviously thinking along the right lines - the question of how you decide the potential effectiveness, though, has to involve a component of understanding of the requirement.

Taking your requirement:

    4 tables, 18 block visits, 3 physical reads.

Assume unique indexed access to a table takes three index block visits and one table visit - that's 16 blocks in your case. For reasonably large tables it's not unreasonable to find that each table block visit has a very high chance of being a physical read.
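
As a quick aside, the per-execution figures can be cross-checked from v$sql. A minimal sketch - the sql_id is a placeholder that you would have to identify first (from v$sql itself, or a statspack report):

    select sql_id, executions,
           buffer_gets / nullif(executions, 0) as gets_per_exec,
           disk_reads  / nullif(executions, 0) as reads_per_exec
    from   v$sql
    where  sql_id = '&sql_id';   -- placeholder: substitute the real sql_id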

So - why have you got 2 "extra" block visits, and where do you "lose" one physical read?

The extra block visits might be undo block visits for read consistency during periods of concurrent update - and that's an issue that's going to make it hard to test the effects of any changes properly.

The extra block visits may be due to index range scans that leave you acquiring and discarding excess rows from tables - and each extra row might mean an extra block visit.

The extra blocks may be due to large index range scans that end up returning just one appropriate rowid, so the extra work is done only in the index (and if that's the case, then this query could also be putting an undue workload on your CPU).

The "lost" physical read might be because you have a table which is so small and popular that it's always 100% buffered. You might even have three such tables - and be able to see that all the I/O comes from (say) one very big index that doesn't stay buffered, plus two physical reads on the last table.

... and so on.
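
As an aside - rather than guessing which of these it is, you can usually pin the extra buffer gets to a specific plan operation by running the statement once with rowsource statistics and reporting the actual figures. A minimal sketch using standard 10g features (the query text itself is a placeholder):

    select /*+ gather_plan_statistics */ ...   -- the query under investigation
    ;

    select *
    from   table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

The Buffers and A-Rows columns against each plan operation make it fairly obvious where the extra block visits (and any acquired-then-discarded rows) are coming from.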

You may be able to get some idea of the "object-level cache hit ratios" from v$segstat (there's a query sketch after the list below), which can give you an idea of where you are most likely, or least likely, to gain some I/O benefit by restructuring - but mostly you just have to say:

    this is the amount of data I HAVE to visit
    this is where I can find it in the least number of blocks
    this is the number of blocks I will almost certainly have to get off disc

    this is how I can probably keep the rest of the blocks buffered
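
To put some numbers against the v$segstat comment above, here is a minimal sketch against v$segment_statistics (the friendlier view over v$segstat). The schema name is a placeholder, and the counters are cumulative since instance startup, so take two snapshots if you want a rate rather than a running total:

    select object_name,
           max(decode(statistic_name, 'logical reads',  value)) as logical_reads,
           max(decode(statistic_name, 'physical reads', value)) as physical_reads
    from   v$segment_statistics
    where  owner = 'APP_OWNER'      -- placeholder: the schema you care about
    and    statistic_name in ('logical reads', 'physical reads')
    group  by object_name
    order  by 3 desc nulls last;

A segment with a high ratio of physical to logical reads is the sort of object where eliminating a block visit is most likely to save a disc read.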

In your specific case, simply reducing 18 block visits to 15 may make virtually
no difference to performance, because the three blocks you have eliminated may
have been blocks that would have been buffered anyway. But if you can say "I
have avoided one trip to a big table, and my knowledge of the application (or
at least v$segstat) tells me that this will usually mean one less physical
read", then you're heading in the right direction.

Ultimately, to do less work, you have to "co-locate" the data you want (and
this usually means more work on the inserts, and can mean more undo block
visits for read-consistency), or you have to avoid visiting data that you
don't really need (and this usually means adding columns to indexes - which
can make the indexes larger, needing more space in the buffer cache to stay
cached, and making them subject to more undo block visits for read-consistency
reasons).
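
To make that last point concrete, a purely hypothetical example (none of these names come from the original schema): if the query does an index range scan on (customer_id) and then visits the table only to pick up a status column, adding the column to the index means that part of the query can be answered from the index alone, at the cost of a larger index:

    -- before: range scan on ix_orders_cust, then table access by rowid
    -- after:  no table visit needed for this part of the query
    create index ix_orders_cust_status
        on orders (customer_id, status);

    -- the old single-column index could then (carefully) be dropped,
    -- once you have checked that nothing else depends on it.

The trade-off is exactly the one described above: a bigger index to keep cached, and more places where read-consistency work may have to be done.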

-- 
Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com

Author: Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html
Received on Sat Nov 29 2008 - 05:16:31 CST
