Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: Help: large "cache buffers lru chain", large performace degradation
there are two processors in the machine, it is a digital
unix box with 2 GB RAM.
let me give a small history of what i have tried so far..
In the begining, the server was very slow and the requests were
getting
queued up. While trying to figure out which index or table was the
problem,
i got an error that the rollback segment cannot be extended. so i
added
a new datafile of 300m to the rbs tablespace. The server seemed to be
ok.
To analyze the indexes, i had run a small script and it showed which
indexes have to be rebuilt. While trying to rebuild the indexes, i got
errors that the indices tablespace did not have enough space. So i
created
a new tablespace and rebuilt the indexes in this tablespace. This did
not
improve things. Then i started monitoring v$system_event,
v$session_event
regularly.
one value that seemed to be directly proportional to the load on the
server
was
event total_wait total_timeout latch free 106372 99464
exploring further in v$latch_children gives
cache buffers lru chain
and
cache buffers chains
Then in a previous database report, it was suggested that the number
of
rollback segments can be increased to 4. Using the scripts in this
page
http://sure.org.ru/docs/database/ocp/new_page_19.htm, i found out that
there was contention for the rollback segment. This made me think that
the previous rollback segment fix was not a good one and i created
a new tablespace, added 4 rollback segments. The number of wraps keeps
increasing around 207
here is the rollback info
ORACLE instance started.
Total System Global Area 603534648 bytes Fixed Size 52424 bytes Variable Size 533813360 bytes Database Buffers 67108864 bytes Redo Buffers 2560000 bytesDatabase mounted.
2> ; USN EXTENTS RSSIZE WRITES XACTS GETS WAITS OPTSIZE HWMSIZE SHRINKS WRAPS EXTENDS AVESHRINK AVEACTIVE STATUS CUREXT CURBLK ---------- ---------- ---------- ---------- ---------- -------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- --------------- ---------- ----------
0 4 202752 4233 0 75 0 202752 0 1 0 0 0 ONLINE 3 0 2 121 16105472 22333150 0 114542 321 16105472 0 203 0 0 293129 ONLINE 57 51 3 121 16105472 22909513 0 111040 274 16105472 0 207 0 0 423778 ONLINE 69 40 4 121 16103424 22894978 0 113834 321 16103424 0 208 0 0 512043 ONLINE 106 57 5 121 16103424 34193430 1 117392 261 16103424 0 298 0 0 454377 ONLINE 84 18 1 3 397312 22015764 0 113503 214 397312 0 203 1 0 453831 ONLINE 1 20 6 3 397312 24444576 0 116405 246 397312 0 223 1 0 345838 ONLINE 2 41 7 3 397312 22971289 0 126445 281 397312 0 211 1 0 400960 ONLINE 2 64 8 2 264192 22582178 0 111884 251 264192 0 208 0 0 379549 ONLINE
I was thinking something like this
user1 reads block from disk -> updates something in it
user2 wants to use same block, oracle makes a copy, takes the last copy from the rollback segment.
user3 -> checks the user1 and user2 blocks, then creates his own but cannot take from the rollback segment?
Then i thought i will trace the buffer wait problem to the tables/
indexes and fix them. I ran a simple test with 30 users doing
a search repeatedly. The program first writes the search result
for each user into a "search result" table and then the rows are
taken from this table. So obviously the blocks in this table and
the primary key constraint index appeared in the traced list.
Exploring
further, i found that the free lists and pct free were the defaults.
i dropped the tables and recreated them with inittrans 5 and freelist groups 10. and presto, the buffer waits vanished but the latch free returned for "39" which is "library cache"! and lots of enqueue waits.
here are two samples.
latch free for "library cache" here
7 336 latch free address 30191864 0000000001CCB0F8number 39 0000000000000027tries 34 00000000000000220 1 WAITING
enqueue stuff like this
116 480 enqueue name|mode 1415053318 0000000054580006id1 458754 0000000000070002id2 3632 0000000000000E300 13 WAITING
and i have executed the stat report with timings.. this was before the block buffer size was increased.
Thanks
Nagarajan
SVRMGR> Rem ******************************************************************* SVRMGR> Rem Output statistics SVRMGR> Rem *******************************************************************SVRMGR>
Charwidth 12 SVRMGR> set numwidth 10 Numwidth 10
SVRMGR> select namespace library, 2> gets, 3> round(decode(gethits,0,1,gethits)/decode(gets,0,1,gets),3) 4> gethitratio, 5> pins, 6> round(decode(pinhits,0,1,pinhits)/decode(pins,0,1,pins),3) 7> pinhitratio, 8> reloads, invalidations 9> from stats$lib; LIBRARY GETS GETHITRATI PINS PINHITRATI RELOADSINVALIDATI
BODY 38548 .998 38549 .998 0 0 CLUSTER 0 1 0 1 0 0 INDEX 0 1 0 1 0 0 OBJECT 0 1 0 1 0 0 PIPE 0 1 0 1 0 0 SQL AREA 18024 .919 208913 .989 6 18 TABLE/PROCED 12397 .97 88362 .996 0 0 TRIGGER 645 .949 645 .933 0 0
Charwidth 27 SVRMGR> set numwidth 12; Numwidth 12
SVRMGR> Rem always be based on at least one logon. SVRMGR> select n1.name "Statistic", 2> n1.change "Total", 3> round(n1.change/trans.change,2) "Per Transaction", 4> round(n1.change/logs.change,2) "Per Logon", 5> round(n1.change/(to_number(to_char(end_time, 'J'))*60*60*24 - 6> to_number(to_char(start_time, 'J'))*60*60*24 + 7> to_number(to_char(end_time, 'SSSSS')) - 8> to_number(to_char(start_time, 'SSSSS'))) 9> , 2) "Per Second"
11> where trans.name='user commits' 12> and logs.name='logons cumulative' 13> and n1.change != 0 14> order by n1.name; Statistic Total Per Transact Per Logon PerSecond
--------------------------- ------------ ------------ ------------ ------------ CPU used by this session 70415 1.32 25.63 81.5 CPU used when call started 70415 1.32 25.63 81.5 CR blocks created 3020 .06 1.1 3.5 DBWR buffers scanned 80956 1.51 29.47 93.7 DBWR free buffers found 78866 1.48 28.71 91.28 DBWR lru scans 118 0 .04
.14
DBWR make free requests 118 0 .04
.14
DBWR summed scan depth 81456 1.52 29.65 94.28 DBWR timeouts 209 0 .08
.24
OS Block input operations 10539 .2 3.84 12.2 OS Block output operations 33458 .63 12.18 38.72 OS Integral shared text siz 472 .01 .17
.55
OS Integral unshared data s 1602772 29.98 583.46 1855.06 OS Integral unshared stack 10994 .21 4 12.72 OS Involuntary context swit 153551 2.87 55.9 177.72 OS Maximum resident set siz 4694360 87.8 1708.9 5433.29 OS Page reclaims 379858 7.1 138.28 439.65 OS Socket messages sent 19 0 .01
.02
OS System time used 9889 .18 3.6 11.45 OS User time used 61086 1.14 22.24 70.7 OS Voluntary context switch 153685 2.87 55.95 177.88 SQL*Net roundtrips to/from 44896 .84 16.34 51.96 background timeouts 802 .02 .29
.93
bytes received via SQL*Net 11938536 223.3 4346.03 13817.75 bytes sent via SQL*Net to c 50210545 939.13 18278.32 58114.06 calls to get snapshot scn: 181883 3.4 66.21 210.51 calls to kcmgas 55659 1.04 20.26 64.42 calls to kcmgcs 8614 .16 3.14 9.97 calls to kcmgrs 349281 6.53 127.15 404.26 change write time 7950 .15 2.89 9.2 cleanouts and rollbacks - c 1441 .03 .52 1.67 cleanouts only - consistent 1062 .02 .39 1.23 cluster key scan block gets 3695 .07 1.35 4.28 cluster key scans 1247 .02 .45 1.44 commit cleanout failures: b 10 0 0
.01
commit cleanout failures: c 154 0 .06
.18
commit cleanout number succ 114219 2.14 41.58 132.2 consistent changes 6130 .11 2.23 7.09 consistent gets 11555410 216.13 4206.56 13374.32 cursor authentications 11380 .21 4.14 13.17 data blocks consistent read 3934 .07 1.43 4.55 db block changes 600490 11.23 218.6 695.01 db block gets 709720 13.27 258.36 821.44 deferred (CURRENT) block cl 17285 .32 6.29 20.01 dirty buffers inspected 228 0 .08
.26
enqueue releases 114965 2.15 41.85 133.06 enqueue requests 114959 2.15 41.85 133.05 enqueue waits 506 .01 .18
.59
exchange deadlocks 8 0 0
.01
execute count 130775 2.45 47.61 151.36 free buffer inspected 20846 .39 7.59 24.13 free buffer requested 4858158 90.87 1768.53 5622.87 immediate (CR) block cleano 2503 .05 .91 2.9 immediate (CURRENT) block c 36198 .68 13.18 41.9 logons cumulative 2747 .05 1 3.18 logons current 2 0 0 0 messages received 5543 .1 2.02 6.42 messages sent 5543 .1 2.02 6.42 no work - consistent read g 9641086 180.33 3509.68 11158.66 opened cursors cumulative 38125 .71 13.88 44.13 opened cursors current 11 0 0
.01
parse count 39110 .73 14.24 45.27 parse time cpu 715 .01 .26
.83
parse time elapsed 5124 .1 1.87 5.93 physical reads 4846427 90.65 1764.26 5609.29 physical writes 17307 .32 6.3
20.03
process last non-idle time 1.326672E+12 24813844.03 482953102.02
1535500198.2
recursive calls 806630 15.09 293.64 933.6 recursive cpu usage 58029 1.09 21.12 67.16 redo blocks written 63276 1.18 23.03 73.24 redo entries 314687 5.89 114.56 364.22 redo size 59552578 1113.86 21679.13 68926.59 redo small copies 314687 5.89 114.56 364.22 redo synch time 2395 .04 .87 2.77 redo synch writes 1431 .03 .52 1.66 redo wastage 3764478 70.41 1370.4 4357.03 redo write time 7086 .13 2.58 8.2 redo writes 8930 .17 3.25 10.34 rollback changes - undo rec 126 0 .05
.15
rollbacks only - consistent 1532 .03 .56 1.77 session connect time 1.326672E+12 24813844.03 482953102.02 1535500198.2 session logical reads 12250719 229.14 4459.67 14179.07 session pga memory 2390209536 44706.06 870116.32 2766446.22 session pga memory max 2764165704 51700.47 1006248.89 3199265.86 session uga memory 14162600 264.89 5155.66 16391.9 session uga memory max 1754338024 32812.83 638637.8 2030483.82 sorts (memory) 5042 .09 1.84 5.84 sorts (rows) 6485615 121.31 2360.98 7506.5 summed dirty queue length 232 0 .08
.27
table fetch by rowid 819680 15.33 298.39 948.7 table fetch continued row 526501 9.85 191.66 609.38 table scan blocks gotten 6965747 130.29 2535.77 8062.21 table scan rows gotten 61451309 1149.37 22370.33 71124.2 table scans (long tables) 2225 .04 .81 2.58 table scans (short tables) 5796 .11 2.11 6.71 total number commit cleanou 118428 2.22 43.11 137.07 transaction rollbacks 62 0 .02
.07
transaction tables consiste 25 0 .01
.03
transaction tables consiste 2169 .04 .79 2.51 user calls 39491 .74 14.38 45.71 user commits 53465 1 19.46 61.88 write requests 243 0 .09
.28
SVRMGR> SVRMGR> SVRMGR> set numwidth 27 Numwidth 27
SVRMGR> select queue.change/writes.change "Average Write Queue Length" 2> from stats$stats queue, stats$stats writes 3> where queue.name = 'summed dirty queue length' 4> and writes.name = 'write requests';Average Write Queue Length
SVRMGR> SVRMGR> SVRMGR> set charwidth 32; Charwidth 32 SVRMGR> set numwidth 13; Numwidth 13
SVRMGR> Rem that event. SVRMGR> select n1.event "Event Name", 2> n1.event_count "Count", 3> n1.time_waited "Total Time", 4> round(n1.time_waited/n1.event_count, 2) "Avg Time" 5> from stats$event n1 6> where n1.event_count > 0 7> order by n1.time_waited desc; Event Name Count Total Time Avg Time -------------------------------- ------------- ------------- ------------- latch free 97674 492881 5.05 SQL*Net message from client 53100 219464 4.13 buffer busy waits 12324 201599 16.36 db file scattered read 567054 26453 .05 enqueue 507 14205 28.02 db file sequential read 664767 11308 .02 write complete waits 132 7829 59.31 log file sync 872 1994 2.29 SQL*Net more data to client 18860 237 .01 library cache pin 3 127 42.33 SQL*Net message to client 53136 28 0 buffer deadlock 8 18 2.25 SQL*Net break/reset to client 34 14 .41 control file sequential read 22 00
SVRMGR> SVRMGR> SVRMGR> Rem System wide wait events for background processes (PMON, SMON, etc) SVRMGR> select n1.event "Event Name", 2> n1.event_count "Count", 3> n1.time_waited "Total Time", 4> round(n1.time_waited/n1.event_count, 2) "Avg Time" 5> from stats$bck_event n1 6> where n1.event_count > 0 7> order by n1.time_waited desc; Event Name Count Total Time Avg Time -------------------------------- ------------- ------------- ------------- rdbms ipc message 4171 230769 55.33 pmon timer 416 86557 208.07 smon timer 2 60000 30000 db file parallel write 242 7936 32.79 log file parallel write 8930 7081 .79 latch free 1605 3786 2.36 log file sync 7 10 1.43 buffer busy waits 1 1 1 db file scattered read 8 0 0 db file sequential read 15 00
SVRMGR> SVRMGR> SVRMGR> set charwidth 18; Charwidth 18 SVRMGR> set numwidth 11; Numwidth 11
SVRMGR> Rem the 'latch free' event in the wait events above. SVRMGR> Rem Sleeps should be low. The hit_ratio should be high. SVRMGR> select name latch_name, gets, misses, 2> round(decode(gets-misses,0,1,gets-misses)/decode(gets,0,1,gets),3) 3> hit_ratio, 4> sleeps, 5> round(sleeps/decode(misses,0,1,misses),3) "SLEEPS/MISS" 6> from stats$latches 7> where gets != 0 8> order by name; LATCH_NAME GETS MISSES HIT_RATIO SLEEPSSLEEPS/MISS
------------------ ----------- ----------- ----------- ----------- ----------- cache buffer handl 408 0 1 0 0 cache buffers chai 29069122 14827 .999 40673 2.743 cache buffers lru 4876538 57462 .988 47918 .834 dml lock allocatio 116183 27 1 83 3.074 enqueue hash chain 230415 422 .998 690 1.635 enqueues 162670 64 1 210 3.281 ktm global data 2 0 1 0 0 latch wait list 5993 3 .999 0 0 library cache 1385850 3046 .998 5188 1.703 library cache load 938 0 1 0 0 list of block allo 111859 21 1 44 2.095 messages 69934 11 1 6 .545 modify parameter v 2747 1 1 1 1 multiblock read ob 1769220 2188 .999 1803 .824 process allocation 2746 0 1 0 0 redo allocation 335358 1235 .996 1753 1.419 row cache objects 216498 23 1 21 .913 sequence cache 8573 1 1 1 1 session allocation 27394 4 1 3 .75 session idle bit 86007 3 1 1 .333 session switching 79 0 1 0 0 shared pool 103739 57 .999 64 1.123 sort extent pool 2 0 1 0 0 system commit numb 424757 162 1 480 2.963 transaction alloca 174056 109 .999 214 1.963 undo global data 192102 100 .999 123 1.23 user lock 10980 3 1 3 1
Numwidth 16
SVRMGR> select name latch_name, 2> immed_gets nowait_gets, 3> immed_miss nowait_misses, 4> round(decode(immed_gets-immed_miss,0,1,immed_gets-immed_miss)/ 5> decode(immed_gets,0,1,immed_gets), 6> 3) 7> nowait_hit_ratio 8> from stats$latches 9> where immed_gets != 0 10> order by name; LATCH_NAME NOWAIT_GETS NOWAIT_MISSES NOWAIT_HIT_RATIO ------------------ ---------------- ---------------- ---------------- cache buffers chai 9421126 20572 .998 cache buffers lru 5170882 100371 .981 process allocation 2746 0 13 rows selected.
SVRMGR> Rem contention. SVRMGR> select * from stats$waitstat 2> where count != 0 3> order by count desc; CLASS COUNT TIME ------------------ ---------------- ---------------- data block 11683 193373 undo header 590 7653 segment header 36 190 undo block 16 4254 rows selected.
SVRMGR> SVRMGR> SVRMGR> set numwidth 19; Numwidth 19
UNDO_SEGMENT TRANS_TBL_GETS TRANS_TBL_WAITS UNDO_BYTES_WRITTEN SEGMENT_SIZE_BYTES XACTS SHRINKS WRAPS ------------------- ------------------- ------------------- ------------------- ------------------- ------------------- ------------------- ------------------- 0 5 0 0 202752 0 0 0 1 15918 14 3594697 264192 0 0 32 2 16734 36 3802967 16105472 -1 0 34 3 13664 27 2705746 16105472 0 0 24 4 13835 22 3041592 16103424 0 0 27 5 15564 21 2876084 16103424 1 0 26 6 14764 14 2951984 264192 0 0 27 7 16493 22 3637935 264192 0 0 32 8 14622 12 2769732 264192 0 0 25
Charwidth 39 SVRMGR> Rem The init.ora parameters currently in effect: SVRMGR> select name, value from v$parameter where isdefault = 'FALSE' 2> order by name; NAME VALUE --------------------------------------- --------------------------------------- always_anti_join NESTED_LOOPS audit_trail NONE background_dump_dest /oracle/app/oracle/admin/ibOnline/bdump cache_size_threshold 1638 checkpoint_process TRUE compatible 7.2.0.0 control_files /oracle/oradata/ibOnline/control01.ctl, core_dump_dest /oracle/app/oracle/admin/ibOnline/cdump cpu_count 2 db_block_buffers 16384 db_block_lru_latches 1 db_block_size 2048 db_file_multiblock_read_count 20 db_files 200 db_name ibOnline distributed_transactions 96 dml_locks 400 enqueue_resources 617 gc_db_locks 16384 gc_freelist_groups 50 gc_releasable_locks 16384 ifile /oracle/app/oracle/admin/ibOnline/pfile log_buffer 2560000 log_checkpoint_interval 3000000000 log_simultaneous_copies 0 max_dump_file_size 10240 mts_max_dispatchers 0 mts_max_servers 0 mts_servers 0 mts_service ibOnline optimizer_mode CHOOSE processes 260 remote_login_passwordfile NONE rollback_segments r01, r02, r03, r04, R01Jan12, R02Jan12, sequence_cache_entries 100 sequence_cache_hash_buckets 54 sessions 291 shared_pool_size 512000000 sort_area_retained_size 2048000 sort_area_size 2048000 sort_direct_writes AUTO spin_count 32 temporary_table_locks 291 timed_statistics TRUE transactions 384 transactions_per_rollback_segment 16 user_dump_dest
Charwidth 15 SVRMGR> set numwidth 8; Numwidth 8
2> where get_reqs != 0 or scan_reqs != 0 or mod_reqs != 0; NAME GET_REQS GET_MISS SCAN_REQ SCAN_MIS MOD_REQS COUNTCUR_USAG
dc_tablespaces 126 3 0 0 0 11 4 dc_free_extents 540 270 0 0 0 276 271 dc_segments 5925 338 0 0 0 375 374 dc_rollback_seg 36 0 0 0 0 13 10 dc_users 27009 6 0 0 0 10 7 dc_user_grants 17451 5 0 0 0 18 5 dc_objects 5810 376 0 0 0 471 467 dc_tables 7440 163 0 0 0 213 212 dc_columns 36610 924 4838 163 0 1389 1388 dc_table_grants 667 122 0 0 0 123 122 dc_indexes 3715 215 2377 150 0 263 241 dc_constraint_d 242 119 73 33 0 148 133 dc_constraint_d 17 11 36 29 0 12 11 dc_synonyms 286 46 0 0 0 51 46 dc_usernames 6266 3 0 0 0 12 4 dc_object_ids 7877 219 0 0 0 227 223 dc_sequences 147 3 0 0 142 9 3 dc_histogram_de 2926 163 0 0 0 171
163
18 rows selected.
SVRMGR> SVRMGR> SVRMGR> set charwidth 80; Charwidth 80 SVRMGR> set numwidth 10; Numwidth 10 SVRMGR> Rem Sum IO operations over tablespaces. SVRMGR> select 2> table_space||' ' 3> table_space, 4> sum(phys_reads) reads, sum(phys_blks_rd) blks_read, 5> sum(phys_rd_time) read_time, sum(phys_writes) writes, 6> sum(phys_blks_wr) blks_wrt, sum(phys_wrt_tim) write_time, 7> sum(megabytes_size) megabytes 8> from stats$files 9> group by table_space 10> order by table_space; TABLE_SPACE READS BLKS_READ READ_TIME WRITES BLKS_WRTWRITE_TIME MEGABYTES