Re: Oracle RAC - Any formula to calculate/preempt the right number of LMS processes?

From: PD Malik <>
Date: Wed, 22 Jun 2011 12:52:24 +0100
Message-ID: <>

Riyaj & Gopal - Thanks for responding to my query.

Riyaj Bhai - you are an expert, no denying it. You may feel you still have an entire ocean to cover, but for people like me who are still learning to swim, you are an expert :-)

To answer your and Gopal's question on RT first: we are on 10gR2 on AIX 6, LMS is listed in _high_priority_processes, and _os_sched_high_priority is at its default of 1, so I assume LMS is running at high priority.

Gopal, do you know whether this can also be done on AIX, please? "One of the most common practices is to schedule the number of RT processes < # of CPUs to avoid spin/hang on CPUs." Also, how can we check whether these settings are working as intended?

A bit more explanation of our CPU behaviour (very broadly): over the past few months our CPU usage has risen much faster than the incremental user load on the system, and our capacity management department has highlighted LMS as one of the top consumers (amongst other candidates). It is not that LMS used to behave well earlier; apparently it has always been bad and has only come to our attention now because of this CPU issue.

While looking at other things, it also seems that this LMS setting is too high anyway, and it looks like Oracle is 'correcting' it in 11g, so this is something we should look at doing regardless.

I did a bit of Google digging on the CR fabrication lead that you provided (thanks to Gopal and an answer he posted in one of the forums), and this is probably what you meant:

SQL> select inst_id, cr_requests, light_works, data_requests, fairness_down_converts, (fairness_down_converts/data_requests)*100

   INST_ID CR_REQUESTS LIGHT_WORKS DATA_REQUESTS FAIRNESS_DOWN_CONVERTS
---------- ----------- ----------- ------------- ----------------------
         2    58831118    74411475      58831114                9720672
         1   350485652   884185071     350485608               59529981
         6    14175101    20025830      14175101                2019579
         4    13457882    19142318      13457880                1960022
         3   440964324   909842437     440964274               79782554

The ratio looks OK to me (less than 40%; the figures are aggregated, but still indicative I think), so it seems there isn't much CR fabrication going on. BUT I may be looking at totally the wrong thing, and if that's the case I'd be grateful if you could point me to any writeups/whitepapers etc. that I can go through to understand it a bit better - that would be a great help.
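
As a quick sanity check on the "less than 40%" reading, the per-instance ratio can be recomputed from the figures in the output above. This is only a sketch in Python: the tuples are copied from the listing, the 40% threshold is the rule of thumb mentioned above rather than a documented Oracle limit, and the names `rows` and `down_convert_pct` are made up for illustration.

```python
# Per-instance figures copied from the query output above:
# (inst_id, data_requests, fairness_down_converts)
rows = [
    (2, 58831114, 9720672),
    (1, 350485608, 59529981),
    (6, 14175101, 2019579),
    (4, 13457880, 1960022),
    (3, 440964274, 79782554),
]

# Rule-of-thumb limit quoted in this thread (assumption, not an Oracle-documented value).
THRESHOLD_PCT = 40.0


def down_convert_pct(data_requests, fairness_down_converts):
    """Return fairness_down_converts as a percentage of data_requests."""
    return fairness_down_converts / data_requests * 100


ratios = {inst: down_convert_pct(dr, fdc) for inst, dr, fdc in rows}

for inst, pct in sorted(ratios.items()):
    flag = "OK" if pct < THRESHOLD_PCT else "HIGH"
    print(f"inst {inst}: {pct:.1f}% {flag}")
```

On these numbers every instance comes out at roughly 14% to 18%, so all five sit well under the threshold.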

I know that to know RAC I'll need to read Gopal's book, which is already on my to-do list, but unfortunately that's not an overnight thing.

So our current approach is to take these new settings (5 most probably) through a performance test and see how they behave by looking at gc waits and stats. I know it's not the best approach, as it is 'trying to fix' without getting to the root cause (due to lack of total mastery in this area), but hopefully a performance test will give us good confidence. Any comments on our approach, please?
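
For reference, the figure of 5 is just the arithmetic from the formula Riyaj quotes below for the 11.2 default, 2 + (CPUs / 32), rounded up. A minimal sketch, assuming the formula exactly as stated in this thread (it has not been checked against Oracle documentation here); the function name `lms_from_cpu_count` is made up for illustration.

```python
import math


def lms_from_cpu_count(cpu_count):
    """LMS count per the formula quoted in this thread: 2 + cpu_count/32, rounded up.

    Assumption: this mirrors the 11.2 default as described in the quoted
    reply; it is not taken from Oracle documentation.
    """
    return math.ceil(2 + cpu_count / 32)


print(lms_from_cpu_count(80))  # 2 + 2.5 = 4.5, rounded up to 5
```

Since the nodes here have different CPU counts, the same function could be run per node to see whether the result differs from 5 anywhere.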

The only other thing I've been thinking is that in 11g, as you know, remastering is at file level instead of object level. Is it that file-level remastering is more 'efficient' than object remastering, and hence Oracle has massively downsized the formula for the default number of LMS processes? This, along with other optimizations in building/pinning/sending a block etc. in 11g, might have reduced the pressure on LMS, so the lower number may be OK in 11g but not yet a good idea in 10g? Just thinking ...

Sorry this email has gone a bit long ..


On Tue, Jun 21, 2011 at 5:14 PM, Riyaj Shamsudeen <> wrote:

> Hello Malik
> I am no RAC expert, seems like an ocean more to learn.
> I would agree with KG's comments with one enhancement. I would go with an
> odd number, 3 or 5, prior to 11g version. There is an internal bug (10.2)
> where the distribution of workload is non-uniform if # of LMS processes is
> an even number. ( I know, it is weird, but I don't have any more details,
> can't reproduce it, a RAC developer told me this few years ago, and seems
> like it is worth a risk to take making it a prime number. This comment does
> not apply to 11g and above. )
> As you have figured out, 11.2 by default, sets this parameter to more
> meaningful value. If you are looking for an algorithm, why not use 11.2
> default algorithm, then? In this case, for 80 CPUs, formula will be
> 2+(80/32)= 4.5, round that up to 5. So, set gcs_server_processes to 5.
> Remember that just because LMS is running in RT priority (Solaris lingo)
> doesn't mean that it consumes CPU. RT simply bumps the LMS process to the
> front of the CPU run queue for immediate scheduling, if LMS needs to use
> CPU. You should also review why LMS is consuming CPU: is it CR
> fabrication or Current mode buffer creation (most common reasons) or is it
> due to some other reason?
> Cheers
> Riyaj Shamsudeen
> Principal DBA,
> Ora!nternals - - Specialists in Performance,
> RAC and EBS11i
> Blog:
> OakTable member
> Co-author of the books: Expert Oracle Practices<>,
> Pro Oracle SQL, Expert PL/SQL Practices<>
> On Tue, Jun 21, 2011 at 10:02 AM, K Gopalakrishnan <>wrote:
>> Malik,
>> Are you running them in real time priority? If yes, you can reduce it to 4
>> or (# of instances-1) and monitor the latencies of the RAC wait events/CPU
>> usage of the LMS processes. AFAIK there is no magic formula for
>> calculating the # of LMSs.
>> One of the most common practices is to schedule the number of RT processes
>> < # of CPUs to avoid spin/hang on CPUs. If your current settings works fine
>> for your workload, I would stick to it.
>> -Gopal
>> On Tue, Jun 21, 2011 at 6:02 AM, PD Malik <> wrote:
>>> Hello,
>>> We are on 10gR2 5 node RAC (a huge and very busy DB) and also very high
>>> number of CPUs (80ish) hence the default number of LMS processes that Oracle
>>> generates for us (num_cpu dependant) is way too high and is causing CPU
>>> consumption issue (due to their high priority) for some time now so we need
>>> to look to reduce (and fix) them to a sensible number and going by RAC
>>> expert Riyaj's advice we should reduce them down to 4 (one for each 'other'
>>> RAC node) which makes sense to us particularly because if we were going 11g and
>>> left this parameter to Oracle's default calculation it'll come down to 3.
>>> (Current number of LMS processes : (Due to the varied number of CPUs ) - On
>>> 2 nodes, 10, on 1 12 and on the other 2 : 8).
>>> Moreover, we've done the analysis and there are no major RAC wait events
>>> (as in we are not much suffering from RAC) and we "think" we should be fine
>>> by reducing these and obviously we'll take them thru to Performance Test to
>>> gain some confidence.
>>> However the question is, is there any other mathematical way of getting
>>> to just the right number of LMS processes for a RAC instance pls? Is there
>>> any formula to predict them or even to predict if there will be any latency
>>> impact by reducing these?
>>> Thanks in advance.

Received on Wed Jun 22 2011 - 06:52:24 CDT
