Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Checkpoints - a bottleneck. Asking for tuning advise

Checkpoints - a bottleneck. Asking for tuning advise

From: Brian Rasmusson <brasmuss_at_bellesystems.com>
Date: 2000/03/12
Message-ID: <38CC014C.1ACD90E@bellesystems.com>

Dear all,

we're at the moment doing some benchmarking of a real-time application making heavy amounts of inserts and updates. We are talking hundreds of each per second, and we would do quite good if it wasn't for our problems
with checkpoints taking too long.

Brief application overview



A session is considered to contain a "state query" and a "session complete" event. There is no way to combine the two, as they are actually
two transactions in the system. The system is a real-time system, meaning
that requests must be handled instantly.

The "state query" event looks up data from a table consisting of 4 columns,
to simplify lets call it the STATE table in the rest of the mail.

The "session complete" event updates data in the STATE table, and inserts
data in the COMPLETE table and in the LOG table. The COMPLETE table contains approx. 15 columns, and the LOG table approx. 8 columns.

On a Sun Enterprise 4500 with 6 400Mhz CPUs, Solaris 2.7, Oracle 8.0.5, 26 18Gb disks and 3Gb RAM we have reached about 220 sessions per second (that is, 220*"state query" + 220*"session complete" events).

STATE table contains 5.000.000 rows.

COMPLETE and LOG tables contain 1 row per session. They cannot be combined
to 1 table.

So what's the problem?



"220 sessions sounds quite good!" I hear you say. Yes, it's pretty good, but
not good enough. Our application can handle much more, I/O wait on data disks
is low. The bottleneck seems to be checkpoints. I/O goes up heavily, and ofcourse this stalls the rest of the system for a few seconds. Not much, but
way too much when you are running a real-time system.

We are forced to stick with Oracle 8.0.5 for now.

What have we done so far?



Quite a lot. First we ofcourse optimised our application to use the database
as little as possible, meaning implementing our own caches etc. What we have
done more includes:
  1. The STATE table is kept in buffer pool (which btw. is 800Mb)
  2. Checkpoints are set to take place every 20 seconds
  3. STATE table is partitioned across 5 disks (1.000.000 rows per partition)
  4. STATE table has no primary key, but a UNIQUE LOCAL index partitioned across 5 disks.
  5. COMPLETE table is partitioned across 5 disks
  6. COMPLETE table has no primary key, but a UNIQUE LOCAL index partitioned across 5 disks
  7. LOG table stores data in partitions that each contains 1 months data. Must be kept like this. Partitioning of this table only benefits the database maintenance, not very much the performance.
  8. Each partition has its own tablespace
  9. The two rollback segments each have their own tablespace and own disk.
  10. The redo logs are placed on two disks each with the forcedirectio setting enabled in Solaris.
  11. One db_writer is enabled per CPU

Dumps from the system


initsid.ora settings:

rollback_segments=(r1,r2)
db_files = 100
open_cursors = 256
db_file_multiblock_read_count = 8

db_block_buffers = 200000			## shared mem 2 GIG
buffer_pool_keep = 100000			## 800M

db_writer_processes = 7
db_block_lru_latches = 7
shared_pool_size = 400M
log_checkpoint_interval = 200000	## every 50M do checkpoint
log_checkpoint_timeout = 20		## every 20sec 
processes = 300
dml_locks = 500
log_buffer = 262144
log_small_entry_max_size=32 		##
log_simultaneous_copies=7		##
sequence_cache_entries = 120

sequence_cache_hash_buckets = 120
max_dump_file_size = 204800
timed_statistics = TRUE
optimizer_mode = FIRST_ROWS
global_names = false
# Sort
sort_area_size 		= 1048576		# 1M
sort_area_retained_size = 131072		# 128k
sort_direct_writes 	= true
sort_write_buffer_size 	= 32768
sort_write_buffers 	= 1

# SNP background processes (job executers)
job_queue_processes = 2 job_queue_interval = 30

v$system_event:

EVENT                         
AVERAGE_WAIT                                                                                                                                                             
------------------------------
------------                                                                                                                                                             
Null event                             
401                                                                                                                                                             
SQL*Net break/reset to client   
.038461538                                                                                                                                                             
SQL*Net message from client     
.444446547                                                                                                                                                             
SQL*Net message to client       
.000453161                                                                                                                                                             
SQL*Net more data from client   
.003768747                                                                                                                                                             
SQL*Net more data to client              
0                                                                                                                                                             
buffer busy waits               
.285714286                                                                                                                                                             
control file parallel write     
.775605806                                                                                                                                                             
control file sequential read    
.077059827                                                                                                                                                             
db file parallel write          
12.0203044                                                                                                                                                             
db file scattered read          
.332503556                                                                                                                                                             
db file sequential read         
1.95008266                                                                                                                                                             
direct path read                
.437058446                                                                                                                                                             
direct path write               
.005569007                                                                                                                                                             
enqueue                                  
3                                                                                                                                                             
file identify                   
.012211669                                                                                                                                                             
file open                       
.007784855                                                                                                                                                             
instance state change                    
0                                                                                                                                                             
latch free                               
1                                                                                                                                                             
library cache pin                        
5                                                                                                                                                             
log buffer space                        
13                                                                                                                                                             
log file parallel write         
.273877557                                                                                                                                                             
log file sequential read        
.682857143                                                                                                                                                             
log file single write           
.178932179                                                                                                                                                             
log file switch completion      
6.97971014                                                                                                                                                             
log file sync                   
.188444236                                                                                                                                                             
pmon timer                      
301.000819                                                                                                                                                             
process startup                 
11.7857143                                                                                                                                                             
rdbms ipc message               
98.3071415                                                                                                                                                             
rdbms ipc reply                 
28.5929025                                                                                                                                                             
refresh controlfile command     
.044117647                                                                                                                                                             
smon timer                      
28708.3673                                                                                                                                                             
sort segment request                   
101                                                                                                                                                             
write complete waits            
12.1792846                                                                                                                                                             


v$waitstat:

CLASS                   COUNT      
TIME                                                                                                                                                                
------------------ ----------
----------                                                                                                                                                                
data block                  4         
1                                                                                                                                                                
sort block                  0         
0                                                                                                                                                                
save undo block             0         
0                                                                                                                                                                
segment header              0         
0                                                                                                                                                                
save undo header            0         
0                                                                                                                                                                
free list                   0         
0                                                                                                                                                                
extent map                  0         
0                                                                                                                                                                
bitmap block                0         
0                                                                                                                                                                
bitmap index block          0         
0                                                                                                                                                                
unused                      0         
0                                                                                                                                                                
system undo header          0         
0                                                                                                                                                                
system undo block           0         
0                                                                                                                                                                
undo header                 3         
1                                                                                                                                                                
undo block                  0         
0                                                                                                                                                                


We also tried the hint about letting another process keep the logfiles open, but it seemed to have no effect at all?

Any input on how to improve the checkpoint performance will be greatly appreciated. More info can be provided if required.

Thanks in advance,
Brian Received on Sun Mar 12 2000 - 00:00:00 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US