Re: gc buffer busy and rac testing

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Thu, 8 Jan 2009 13:05:20 +0000 (UTC)
Message-ID: <gk4tmg$nui$3_at_solani.org>



On Thu, 08 Jan 2009 12:35:52 +0100, helter skelter wrote:

> Hi,
>
> I'm testing a 2-node RAC but I haven't got much experience with this
> technology and I am a little puzzled by the test results. I used
> Swingbench (order_entry test), and when I ran this test on 2 nodes, RAC
> performance was much lower than on only one node of this cluster.
>
> TPS
> 2 nodes - 324
> 1 node - 473
>
> TPM
> 2 nodes - 17571
> 1 node - 25272
>
> When I look at the waits (2-node test) I see that 60% are "gc buffer
> busy". Isn't that too much? Any advice on what more I can check?
>
> The tablespace is locally managed, with automatic segment space
> management and system allocation type. The interconnect is a dedicated
> 1 Gb Ethernet link. Oracle 10.2.0.4 (Standard Edition) on RHEL5 with ASM.
>
> thanks

GC_BUFFER_BUSY is a RAC wait event meaning that your process has to wait for another instance to finish processing the requested block. When a user process reads a block from the database, it is cached in the SGA. When transactions query the table that contains the block, consistent versions of the block are created in the SGA. There can be many consistent versions, but there can be only one current version of the block for the entire database. That means, in particular, that no matter how many instances you have, only one of their SGAs will contain the current version. If other instances need the same block, they will wait on the "gc buffer busy" event.
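To see how much of the wait time this event accounts for on each instance, you can query GV$SYSTEM_EVENT. This is just a diagnostic sketch; the LIKE filter assumes the 10.2 event name, and TIME_WAITED is reported in centiseconds:

  -- Per-instance totals for the global cache buffer busy waits.
  SELECT inst_id, event, total_waits, time_waited
  FROM   gv$system_event
  WHERE  event LIKE 'gc buffer busy%'
  ORDER  BY time_waited DESC;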

The good news is that there is a cure. The bad news is that the cure involves restructuring the application. The cure is called "functional partitioning". Essentially, one ought to make sure that each object is updated from one node only. You can update 2 tables from 2 nodes, one can even update the same table from 2 nodes, but one should never update the same block from 2 nodes. The most common trick to achieve that is to create services by using DBMS_SERVICE and tie each of them to a particular instance. Those services are then used to connect to the database and update it.
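As a rough sketch of what that looks like on 10.2 (the database name RACDB, the instance names RACDB1/RACDB2 and the service name oe_update are all made up for the example), you create the service and then let the clusterware pin it to a preferred instance:

  -- Create the service inside the database.
  BEGIN
    DBMS_SERVICE.CREATE_SERVICE(service_name => 'oe_update',
                                network_name => 'oe_update');
  END;
  /

  # Register the service with the clusterware: preferred on instance 1,
  # available on instance 2 only for failover, then start it.
  srvctl add service -d RACDB -s oe_update -r RACDB1 -a RACDB2
  srvctl start service -d RACDB -s oe_update

Clients that do the OLTP updates then connect through the oe_update service, so the hot blocks stay current in a single SGA.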

There is a ground-breaking book about RAC by K. Gopalakrishnan, which explains all the gory details that you'd like to know and many that you wouldn't have dreamed of. The book is called "Oracle Database 10g Real Application Clusters" and can be found here: http://tinyurl.com/7thc96 The author also occasionally partakes in the discussions on this group.

The thing to try would be to restrict the Swingbench program to the first node and run queries and reports from the second node. That would be a crude version of functional partitioning (see the tnsnames.ora sketch below). Also, please remember that RAC is a redundancy and availability option, not a performance option. On paper, RAC on 20 Dell PCs with quad-CPU boards and 64 GB each will be much faster, but in reality one big blue P6 box, a small P-595 with 32 cores and 128 GB of RAM, will run circles around any Intel-based configuration for OLTP. You will also get a decent Unix, as opposed to Linux, and very nice deep-blue and black boxes quietly churning away without causing you any trouble at all. The secret is that HAL 9000 was an Intel box.
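That crude split can be done entirely in the client configuration. Something like the following tnsnames.ora entries would do it; the host names, port and service names here are made up, and they assume per-instance services (or VIPs) rather than the load-balanced cluster alias:

  SWINGBENCH =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = racnode1-vip)(PORT = 1521))
      (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = oe_update))
    )

  REPORTING =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = racnode2-vip)(PORT = 1521))
      (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = oe_report))
    )

Point Swingbench at the SWINGBENCH alias and the reporting users at REPORTING, and the two workloads stop fighting over the same current blocks.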

-- 
Mladen Gogala
http://mgogala.freehostia.com
Received on Thu Jan 08 2009 - 07:05:20 CST
