Re: log file sync (now EBS, RAC, PARALLEL_MAX_SERVERS)

From: Kevin Closson <ora_kclosson_at_yahoo.com>
Date: Fri, 24 Aug 2012 09:26:08 -0700 (PDT)
Message-ID: <1345825568.63144.YahooMailNeo_at_web161706.mail.bf1.yahoo.com>


For what it is worth, application affinity is precisely the method Oracle uses on large NUMA configurations when running audited TPC-C (like the recent x4800 8S 5 million tpmC result). There is a listener per socket (or per small subset of sockets with one-hop memory), and Tuxedo is then used to funnel specific transactions to specific connected foregrounds. For instance, New Order might be serviced by a set of foregrounds with hard affinity to sockets 0-3, while Payment and Delivery are serviced on sockets 4 and 5 respectively. You can implement the same sort of thing with TNS services and numactl (or similar NUMA affinity OS tools on Unix).
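
A minimal sketch of the listener-per-socket idea, written in Python only to print the command lines. It assumes one listener per NUMA node and that foregrounds spawned by a listener inherit its binding. The listener names (LISTENER_S0 and so on) are hypothetical, and your node numbering may differ. The numactl options --cpunodebind and --membind are real; a softer memory policy such as --preferred may suit some systems better.

```python
import shlex

def listener_command(node, listener):
    """Build a numactl command line that starts one listener bound to one NUMA node."""
    argv = [
        "numactl",
        f"--cpunodebind={node}",  # run only on the CPUs of this node
        f"--membind={node}",      # allocate memory only from this node
        "lsnrctl", "start", listener,
    ]
    return " ".join(shlex.quote(a) for a in argv)

def per_socket_listeners(nodes):
    # Listener names are hypothetical; use whatever is defined in listener.ora.
    return [listener_command(n, f"LISTENER_S{n}") for n in nodes]

if __name__ == "__main__":
    for line in per_socket_listeners(range(4)):
        print(line)
```

Each TNS service would then point at the listener for the socket (or socket group) that should serve that transaction type.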

BTW, keep an eye out for the cgroups integration in 11.2.0.3, which will allow one to implement "instance caging" if one feels compelled to use these large multi-hop NUMA boxes (a la the Sun Server X2-8, formerly known as the x4800).

It's always better to keep processes that have significant resource intersection running on CPUs connected with the lowest level of latency on these multi-hop systems.



From: Mark W. Farnham <mwf_at_rsiz.com>
To: Amir.Hameed_at_xerox.com; ora_kclosson_at_yahoo.com
Cc: oracle-l_at_freelists.org
Sent: Friday, August 24, 2012 8:06 AM
Subject: RE: log file sync (now EBS, RAC, PARALLEL_MAX_SERVERS)  

Yes, that is what I mean by application affinity.

One way to honor application affinity for parallel query is PARALLEL_FORCE_LOCAL=TRUE. If you have established application affinity for an EBS RAC installation, that is likely a good setting with a modest number of nodes. For a large number of nodes you may need to use INSTANCE_GROUPS and PARALLEL_INSTANCE_GROUP on 10g, or refer to the workload management section on 11g and up. But you reported having only three nodes. I suppose you could have something like GL primarily executing on one node and have, say, manufacturing primarily on the other two nodes.

The reason I write something like "configure so that parallel queries do not cross node boundaries" is that the parameters and methods can change. The concept of application affinity does not change.

Please do not omit my "if" and "likely" lest this be interpreted as a silver bullet. Fortunately the parameter is available both as alter system and alter session, so you can control the default for EBS to be a single node yet run individual jobs on all the machines if the situation calls for that. This seems like a reasonably targeted solution to the problem you have reported. (ugh - I used a bullet metaphor.)
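
A minimal sketch of that default-plus-override pattern, in Python that only assembles the SQL text. PARALLEL_FORCE_LOCAL and the ALTER SYSTEM / ALTER SESSION forms are real; the SCOPE=BOTH SID='*' clause assumes an spfile on RAC, and the job text passed in is a hypothetical placeholder.

```python
def force_local_default():
    """Cluster-wide default: keep PX servers on the node that started the query."""
    # Assumes an spfile; adjust SCOPE/SID to your own conventions.
    return "ALTER SYSTEM SET parallel_force_local = TRUE SCOPE=BOTH SID='*'"

def session_override(force_local):
    """Per-session setting for an individual job."""
    value = "TRUE" if force_local else "FALSE"
    return f"ALTER SESSION SET parallel_force_local = {value}"

def job_script(job_sql, span_nodes):
    """Return the statements a job would run, in order."""
    stmts = []
    if span_nodes:
        # Only the rare job that should use all the machines opts out of the default.
        stmts.append(session_override(False))
    stmts.append(job_sql)
    return stmts

if __name__ == "__main__":
    print(force_local_default())
    for stmt in job_script("-- hypothetical batch job SQL goes here", span_nodes=True):
        print(stmt)
```

The point of the split is that the default stays single-node, and crossing nodes becomes an explicit per-job decision.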

Regards,

mwf

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Hameed, Amir
Sent: Friday, August 24, 2012 8:51 AM
To: Mark W. Farnham; ora_kclosson_at_yahoo.com
Cc: oracle-l_at_freelists.org
Subject: RE: log file sync (now EBS, RAC, PARALLEL_MAX_SERVERS)

The idea is to maintain affinity for parallel processes by limiting them to the node from which the (parallel) query was started. This will also help reduce traffic across the interconnect. The concurrent managers will be 'partitioned' so that all connections from a given Concurrent Processing node are routed to a specific RAC node and do not get load-balanced across multiple RAC nodes.

-----Original Message-----

From: Mark W. Farnham [mailto:mwf_at_rsiz.com]
Sent: Thursday, August 23, 2012 3:11 PM
To: ora_kclosson_at_yahoo.com; Hameed, Amir
Cc: oracle-l_at_freelists.org
Subject: RE: log file sync (now EBS, RAC, PARALLEL_MAX_SERVERS)

Youch!

If you really can't fit your processing need on a single node, then you usually need to set up application affinity for EBS so that it fights less with the RAC architecture. So you probably do not want to allow parallel queries to cross nodes. (You might, I suppose, if you have a blackout period to run certain jobs against nothing else, the archetype being open GL period, but even then it is not certain that a job will run faster on multiple nodes than on a single node.)

So unless you've got a really good reason you should configure so that parallel queries on EBS systems cannot cross node boundaries.

Next, with the concurrent manager being able to schedule jobs so that you can eat up pretty much all the resources on all the nodes, there is a real question, in terms of overall throughput, of whether you want any single job to run in parallel ever. (Please notice I wrote "real question": a real-world load mix and real-world demands for some individual job to complete more quickly may mean that allowing it to run in parallel justifies sacrificing overall throughput. Keeping some broad resource hog out of "prime time" [if in this increasingly global world you still have a "prime time"] might also justify running some job in parallel even if the consumer does not need it in a hurry. This tends to be true of large update batch jobs, so if you can "de-heat" the undo for when routine queries are heavier you can reduce the total work required to execute the overall load.)

Good luck.

mwf

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Kevin Closson
Sent: Thursday, August 23, 2012 1:49 PM
To: Amir.Hameed_at_xerox.com
Cc: oracle-l_at_freelists.org
Subject: Re: log file sync

Now I am working on addressing the "direct path read" event, which is coming from a standard EBS statement that contains a 'parallel' clause without any DOP and ends up invoking 144 parallel processes on three RAC nodes (PARALLEL_MAX_SERVERS is set to 48 on each RAC node).

... Eeek...taming the beast as it were.
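
The arithmetic behind the 144 can be sketched as below. This assumes the statement simply took every PX server allowed on every node; the function names are illustrative only and not part of any Oracle API.

```python
def cluster_px_ceiling(nodes, parallel_max_servers):
    """Upper bound on PX servers when a query may spread across all nodes."""
    return nodes * parallel_max_servers

def local_px_ceiling(parallel_max_servers):
    """Upper bound when PX servers are confined to the node that started the query."""
    return parallel_max_servers

if __name__ == "__main__":
    print(cluster_px_ceiling(3, 48))  # 144, the figure reported above
    print(local_px_ceiling(48))       # 48 if the query cannot cross node boundaries
```

Confining the query to its own node therefore cuts the worst case to a third on this three-node cluster, before any explicit DOP is set.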

--

http://www.freelists.org/webpage/oracle-l

Received on Fri Aug 24 2012 - 11:26:08 CDT
