Re: Controlling load
Date: Fri, 24 Feb 2023 02:00:34 +0530
Message-ID: <CAEzWdqe=wSGUsF7a7jAByqLuj3Kz_Pop77wBFK_rAd_eMprMtQ_at_mail.gmail.com>
*"Maybe the attribute you are looking for is CLIENT_OS_USER? This is the "osuser" column in v$session"*
The osuser is also exactly the same for both the batch and online program. As I mentioned in this thread earlier, the difference is only in the first part of the Program before "_at_". Not sure if by any means I would be able to create a separate consumer group based on that?
For batch the program in v$session showing up as --> "batch_prog_at_vm000123 (TNS V1-V3)" and for online --> "online_prog_at_vm000123 (TNS V1-V3)"
On Wed, Feb 22, 2023 at 6:24 PM Ilmar Kerm <ilmar.kerm_at_gmail.com> wrote:
> There are multiple attributes for mapping sessions to consumer groups:
>
> https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_RESOURCE_MANAGER.html#GUID-7B9B9B1D-53E3-4BF4-8BE6-858256702877
>
> https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_RESOURCE_MANAGER.html#GUID-8ACB30CB-3E37-41C2-801C-4976A6CFA904
>
> Maybe the attribute you are looking for is CLIENT_OS_USER? This is the
> "osuser" column in v$session
>
> Also don't forget the mapping priorities:
>
> https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_RESOURCE_MANAGER.html#GUID-32AA7BD1-5884-4C25-A67A-22F48983851B
>
> And privileges, the database users must have privileges to switch to the
> consumer groups:
>
> https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_RESOURCE_MANAGER_PRIVS.html#GUID-C3584C41-A8C5-4DBE-847D-DA990B1F8CD1
>
>
> On Wed, Feb 22, 2023 at 2:32 PM yudhi s <learnerdatabase99_at_gmail.com>
> wrote:
>
>> Thank you Ilmar.
>>
>> Going through the example which is in the doc - 848666.1, and its making
>> the consumer group based on the DB username. In our case, these two
>> programs are using exactly same osuser, Username, schemaname, service_name.
>> And the client_id,action is populated as null for both of these. The only
>> difference is program column i.e. for batch it looks something like
>> "batch_prog_at_vm000123 (TNS V1-V3)" and for online "online_prog_at_vm000123
>> (TNS V1-V3)", and ~40+ different "VM******" machines from which these are
>> coming from. So wondering, would we still be able to create different
>> resource consumer group , just based on this substring of program column?
>>
>> Additionally, to get the estimate of the CAP/Number of session which we
>> should put for the BATCH programs "resource consumer group". How we should
>> do that? We have total 48*2=96 cores per node. I tried getting the TOTAL
>> AAS aginst those two programs side by side as below. If we eliminate the
>> peak/issue period , the avg active sessions per minute is staying <10 for
>> both the cases. Should we keep it as 40 as the limit for prog_batch?
>>
>> https://gist.github.com/databasetech0073/dde4804a030814c93fc2111616cd2407
>>
>> Regards
>> Yudhi
>>
>>
>>
>> On Wed, 22 Feb, 2023, 3:12 pm Ilmar Kerm, <ilmar.kerm_at_gmail.com> wrote:
>>
>>> You could try:
>>>
>>> https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/DBMS_RESOURCE_MANAGER.html#GUID-8EC6C735-338D-46D4-B346-AD16D0622B30
>>> active_sess_pool_p1 - Specifies maximum number of sessions that can
>>> currently have an active call
>>> queueing_p1 - Specified time (in seconds) after which a call in the
>>> inactive session queue (waiting for execution) will time out. Default is
>>> NULL, which means unlimited.
>>>
>>> Example in: How to Limit max number of concurrently active sessions with
>>> Resource Manager (Doc ID 848666.1)
>>>
>>> On Wed, Feb 22, 2023 at 10:42 AM yudhi s <learnerdatabase99_at_gmail.com>
>>> wrote:
>>>
>>>> Thank you for the response.
>>>>
>>>>  As i mentioned we have 48 core CPU with 2 socket each for a node, so
>>>> Total ~96 core per node. And the OEM also its showing showing the red line
>>>> bar on AAS as 96 as Max limit. But during three different issue period we
>>>> see the below AAS shoot ups above 96 and the wait distribution,
>>>>  Basically its combination of wait event from classes like concurrency,
>>>> configuration and others but CPU part is minimum.
>>>>
>>>>  1) AAS shoot up to - 180, out of which 7% CPU and 34% (buffer busy
>>>> waits), ~20% Log file switch checkpoint incomplete, 16% enq:tx allocate ITL
>>>> entry, 12% Others.
>>>>
>>>>  2) AAS shoot up to - 515, out of which 4% CPU and 46% (library cache
>>>> lock), ~17% library cache mutex, 12% enq:tx allocate ITL entry entry, 12%
>>>> Others.
>>>>
>>>>  3) AAS shoot up to - 120, out of which 9% CPU and 30% (buffer busy
>>>> waits), ~13% Log file switch checkpoint incomplete, 31% enq:tx allocate ITL
>>>> entry entry, 8% BCT buffer space.
>>>>
>>>> But i think you are correct putting CAP on CPU using DB resource
>>>> manager may not going to help us in above situation as cpu portion is less
>>>> than 10%. However, i was thinking if any possible way to put cap on the AAS
>>>> or the number of session those were spawning from the program prog_batch as
>>>> because they are the one causing the chaos by submitting lot many session
>>>> simultaneously. So is that possible with the DB resource manager
>>>> configuration or by tweaking the existing services in anyway?
>>>>
>>>> Regards
>>>> Yudhi
>>>>
>>>>
>>>> On Wed, 22 Feb, 2023, 12:54 pm Ilmar Kerm, <ilmar.kerm_at_gmail.com>
>>>> wrote:
>>>>
>>>>> Resource manager can help you if you are seeing resource shortage (CPU
>>>>> shortage) during these periods and CPU shortage is causing the Concurrency
>>>>> waits.
>>>>> If you are not seeing CPU shortage, then you need traditional database
>>>>> troubleshooting methods and resource manager will not help you.
>>>>> For example if you are behind in 19c patching, then for example before
>>>>> 19.15 there was bug 32898678, 33541865 (fixed in 19.17) and multiple others
>>>>> that caused block change tracking (BCT) to freeze the database or just high
>>>>> BCT buffer space waits.
>>>>>
>>>>> On Wed, Feb 22, 2023 at 6:08 AM yudhi s <learnerdatabase99_at_gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thank you Andy.
>>>>>>
>>>>>> I have never used the DB resource consumer group in our databases
>>>>>> though. But going through the doc, it seems it can be set for many number
>>>>>> of input parameters and also on many resources. Yet to test it , however In
>>>>>> most cases I see it's set on amount of cpu. So in our case will it be good
>>>>>> idea to determine the cpu percentage and define the plan or we can do it on
>>>>>> number of average active sessions which will be again hardcoded value only.
>>>>>>
>>>>>>  We need to set it for multiple input program values in our case. If
>>>>>> you can pass on reference to some sample doc with its exact code
>>>>>> implementation, that would be great.
>>>>>>
>>>>>> On Wed, 22 Feb, 2023, 2:11 am Andy Sayer, <andysayer_at_gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> You can set up resource manager to map different programs to
>>>>>>> different resource consumer groups. You’ll need to make sure you order the
>>>>>>> priority of mapping to make sure program is checked before service name (if
>>>>>>> that’s not the default). It’s all in dbms_resource_manager
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Andy
>>>>>>>
>>>>>>> On Tue, Feb 21, 2023 at 12:19 PM, yudhi s <
>>>>>>> learnerdatabase99_at_gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello Listers,
>>>>>>>> On a normal day, we see sessions/AAS per ~1 minute interval as ~50
>>>>>>>> in this database. and they are mainly from two specific 'programs' (say
>>>>>>>> e.g. prog_batch, prog_online). But during certain periods of the day we saw
>>>>>>>> a large number of sessions with programs as 'prog_batch' (i.e. with
>>>>>>>> AAS>~500) flooding the database and we are seeing large concurrency waits,
>>>>>>>> bct buffer space waits all over the database. And its impacting critical
>>>>>>>> sessions of other programs i.e. prog_online. So considering 'prog_batch' is
>>>>>>>> meant for batch kind of processes and it's okay for those sessions to
>>>>>>>> queue-up and run a few minutes longer, but at the same time we can't afford
>>>>>>>> to impact the sessions from prog_online, as those are latency sensitive
>>>>>>>> online users. But we found both of these programs are pointing to the same
>>>>>>>> database services and running on the same node-1. So , apart from
>>>>>>>> controlling the number of sessions/connections from the application end,
>>>>>>>> can we do some easy fixes with regards to the service config level, which
>>>>>>>> will better control the incoming load? Or say putting some cap on the
>>>>>>>> incoming sessions from prog_batch?
>>>>>>>> This is a 19C database with 2 node RAC and its an Exadata machine.
>>>>>>>> Each node is a 48 core, 2 socket. Both the programs were using the same
>>>>>>>> service and having preferred nodes as node-1.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Yudhi
>>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> Ilmar Kerm
>>>>>
>>>>
>>>
>>> --
>>> Ilmar Kerm
>>>
>>
>
> --
> Ilmar Kerm
>
-- http://www.freelists.org/webpage/oracle-lReceived on Thu Feb 23 2023 - 21:30:34 CET
