Re: 8.1.7 cache buffer chains contention

From: Scott Gamble <zifnab_at_NOSPAM.reddragon.org>
Date: Fri, 24 Aug 2001 18:49:26 GMT
Message-ID: <Wkxh7.15567$e8.3871036@e3500-chi1.usenetserver.com>

"Jonathan Lewis" <jonathan_at_jlcomp.demon.co.uk> wrote in <998676977.23507.0.nnrp-14.9e984b29_at_news.demon.co.uk>:

>The reason why you get the problem when
>moving from 8.0.6 to 8.1.7 is that the newer
>version of Oracle allows for different execution
>paths - sometimes even doing things differently
>whilst reporting the same execution path.
>
>I think someone has already pointed out how to
>identify the one hot block - and you said you
>were confident it was the root of an index -
>would this be one on TRIL_JOIN_WRK ?
>or on an index in the first large query in the
>v$sql list. Why, we also wonder, are there so many
>versions of those two queries loaded ?

Yes its the first query that starts with select tril_adjustments. The version counts have always been high on all of these, its an issue we have tried to address with the vendor.

>>My first guess is that the issue is related to
>8.1.7 taking an index fast full scan path
>when it used to take an index scan path -
>possibly as the inner table in a nested loop:
>the problem is that a tablescan or index FFS
>does 4 current gets on the segment header block
>(Are you sure it was the index ROOT and not
>the index segment header block ?)
>

The plan itself didnt change going into 8.0.6 but based on what you said above that may not matter? It may have changed the way its doing things anyway?

The block on that latch is 153603, according to dba_segments the header block is at 153602 for that index, I did make an assumption that the second block was the header.

>
>Another area of suspicion is in the read-consistency
>cycle - the average transactions applies 895 rollback
>changes - that's quite a lot, whilst making 274 forward
>block changes to 151 blocks. Your transactions seem
>to be colliding rolling back each others changes to
>get a consistent view of the data.
>

I mentioned that we have about 500 sessions, there are a handful of appservers(7-10) that are used to provide the connection. Each of those appservers connects to a specific user. I am not sure if this is a self imposed limitation (one database user per app server or not).

The tril_joins_wrk you mentioned below is basically a temporary work table each of the database users above owns their own copy of that table that is later joined to another table to return the rows needed, then those rows are deleted from the tril_joins_wrk.

So there are say 50 users sharing that table at one time all inserting /deleting from that table. (this is one point where we thought temp tables/tablespaces could really help).

>Your big problem is 'cache buffers chains' but has
>Oracle said anything about 'cache buffers handles'
>which you also miss on - more so that 'chains'
>but tend to get by spinning ?
>

Yes we mentioned that to them the analyst was researching that and couldn't tell us why that was a problem, he did mention they upped the default parameter for that from 3 to 5 (whatever that means).

>Most of the cache buffers chains misses are related
>to "kcbgtcr: kslbegin" which I guess is 'get consistent read,'
>rather than "kcbgcur: kslbegin" which is probably 'get
>current read' - so that kicks my guess about segment
>header blocks and fast full scans into touch.
>
>
>Final thought - your idea of a hash table is possibly
>going to work. Since you know which index is the
>critical one, create an 8-partition hash table from the
>offending table, partitions by the first column of that
>index. If the queries causing the problem are able
>to do partition elimination, then on average each
>root block will be requested one-eigth of the time,.
>and the drop may be sufficient to make a MUCH MORE
>significant improvement on the queue sizes than the
>factor of 8 would suggest.
>
>However, if partition elimination cannot take place,
>then every partition is queried, and you get 8 times
>the latch delay.
>
>
>I chose 8 as a number small enough to avoid extreme
>delays due to the possibility of other work having to
>check 8 partitions all the time, traded off against the
>benefit of reducing the latching.
>
>
>
>The only other thing I'd say is that the first two queries
>on the SQL list are doing a lot of work - forget the latch
>issue for a bit, and try to find out if those queries can be
>made to do a lot less work: fix those, and perhaps your
>latch problem will cease to exist.
>

Unfortunately as I said in another post its a 3rd party application that we may not be able to convince them to change. There is an option that avoids using that 'temporary' table for the joins though, and that may be an option to try.

Thanks to everyone for their feedback. I have received more details from this group than oracle gave back to us and it sure helps the learning process.

>
>--
>Jonathan Lewis
>
>Host to The Co-Operative Oracle Users' FAQ
>http://www.jlcomp.demon.co.uk/faq/ind_faq.html
>
>Author of:
>Practical Oracle 8i: Building Efficient Databases
>See http://www.jlcomp.demon.co.uk/book_rev.html
>
>For latest news of public appearances
>See http://www.jlcomp.demon.co.uk
>
>Screen saver or Life saver: http://www.ud.com
>Use spare CPU to assist in cancer research.
>
>
>
>
>
>
Received on Fri Aug 24 2001 - 13:49:26 CDT