Re: parallel recovery slaves waiting on undo reads

From: Rich J <rich242j_at_gmail.com>
Date: Wed, 26 Feb 2020 13:29:04 -0600
Message-ID: <CAANsBX2tP_5KOGDBotpewDaTDK8TZhtZQ4n+2bKcDtP-oQ7k0Q_at_mail.gmail.com>



Interesting...is this Active Data Guard? In my 12.1.0.2 ADG physical standby, one of my biggest pains is library cache locks by LGWR, occasionally lasting up to 7 minutes each. This is in max performance mode for a reporting database. No issues with PR threads that I can see. And neither of those problems in 11.2 before I upgraded, either.

Rich

On Wed, Feb 26, 2020 at 10:28 AM Andy Sayer <andysayer_at_gmail.com> wrote:

> It’s probably worth seeing if you can catch individual events in
> v$session_wait_history (shows the last 10 events per session). Importantly,
> see if the longer durations match up to more requests in the same event.
>
> P1 does not match up to the file#, it is the number of files. I don’t
> think you can get the individual reads requests from a parallel read from
> the standard wait interface. There may be an additional event you can set
> or you might need to use OS tracing to see what files are being accessed.
>
> Hope this helps,
> Andy
>
> On Wed, 26 Feb 2020 at 17:18, Noveljic Nenad <nenad.noveljic_at_vontobel.com>
> wrote:
>
>> The parallel recovery slaves on a 12.1 physical standby are spending 80%
>> of time in 'db file parallel read':
>>
>>
>>
>> select program,event,count(*) from v$active_session_history where program
>> like '%PR%' and event='db file parallel read' and sample_time between
>> sysdate-5/24/60 and sysdate group by program,event order by count(*) ;
>>
>>
>>
>> PROGRAM
>> EVENT COUNT(*)
>>
>> ------------------------------------------------
>> ---------------------------------------------------------------- ----------
>>
>> oracle_at_svavt02a (PR04) db file parallel
>> read 202
>>
>> oracle_at_svavt02a (PR03) db file parallel
>> read 217
>>
>> oracle_at_svavt02a (PR02) db file parallel
>> read 230
>>
>> oracle_at_svavt02a (PR01) db file parallel
>> read 237
>>
>>
>>
>> Most of the reads are on UNDO:
>>
>>
>>
>> select program,event,t.name, count(*)
>>
>> from v$active_session_history a, v$datafile f, v$tablespace t
>>
>> where program like '%PR%' and event='db file parallel read'
>>
>> and sample_time between sysdate-5/24/60 and sysdate
>>
>> and a.p1 = f.file# and f.ts#=t.ts#
>>
>> group by program,event,t.name order by count(*) desc ;
>>
>>
>>
>>
>> PROGRAM
>> EVENT
>> NAME COUNT(*)
>>
>> ------------------------------------------------
>> ----------------------------------------------------------------
>> ------------------------------ ----------
>>
>> oracle_at_svavt02a (PR02) db file parallel
>> read
>> UNDOTBS1 202
>>
>> oracle_at_svavt02a (PR01) db file parallel
>> read
>> UNDOTBS1 194
>>
>> oracle_at_svavt02a (PR04) db file parallel
>> read
>> UNDOTBS1 193
>>
>> oracle_at_svavt02a (PR03) db file parallel
>> read
>> UNDOTBS1 187
>>
>> oracle_at_svavt02a (PR01) db file parallel
>> read
>> U1 11
>>
>> oracle_at_svavt02a (PR04) db file parallel
>> read
>>
>> ...
>>
>>
>>
>> The average wait time strikes out - 18 ms (!):
>>
>>
>>
>> SQL> select event,total_waits,time_waited,average_wait,max_wait from
>> v$session_event where sid=545 order by time_waited desc
>> ;
>>
>>
>>
>> EVENT
>> TOTAL_WAITS TIME_WAITED AVERAGE_WAIT MAX_WAIT
>>
>> ----------------------------------------------------------------
>> ----------- ----------- ------------ ----------
>>
>> db file parallel read
>> 540946 972298 1.8 553
>>
>> parallel recovery slave next change
>> 172941 175034 1.01 325
>>
>> free buffer waits
>> 29682 29615 1 13
>>
>> write complete waits
>> 85 5005 58.88 548
>>
>> ...
>>
>>
>>
>> However, that's not an IO problem - most of the reads complete in less
>> than 0.5 ms. Below is the distribution of the pread elapsed times in ns:
>>
>>
>>
>>
>>
>> ns
>>
>> value ------------- Distribution ------------- count
>>
>> 4096 | 0
>>
>> 8192 | 302
>>
>> 16384 |_at__at_@ 1802
>>
>> 32768 |_at__at_ 1313
>>
>> 65536 | 99
>>
>> 131072 |_at_ 487
>>
>> 262144 |_at__at_@@@@@@@@@ 6761
>>
>> 524288 |_at__at_@@@@@@@@@@@@@ 9270
>>
>> 1048576 |_at__at_@@@@ 3539
>>
>> 2097152 |_at_ 756
>>
>> 4194304 | 45
>>
>> 8388608 | 2
>>
>> 16777216 | 4
>>
>> 33554432 | 0
>>
>>
>>
>>
>>
>> The output is, by the way, generated by the following script:
>>
>>
>>
>> #pragma D option quiet
>>
>>
>>
>> pid$target::pread:entry
>>
>> {
>>
>> self->started = timestamp;
>>
>> /* printf("%d\n",arg0); */
>>
>> }
>>
>>
>>
>> pid$target::pread:return
>>
>> / self->started /
>>
>> {
>>
>> this->duration = timestamp - self->started ;
>>
>> _at_times["ns"] = quantize(this->duration);
>>
>> self->started = 0 ;
>>
>> }
>>
>>
>>
>>
>>
>> Obviously, 'db file parallel read' measures much more than mere IO or
>> even worse, it might be completely broken.
>>
>>
>>
>> Any idea what that could be?
>>
>>
>>
>> I'll switch to async IO tomorrow - maybe its code path is better
>> instrumented.
>>
>>
>>
>> Any other ideas what to look for?
>>
>>
>>
>> Thanks in advance,
>>
>>
>>
>> Nenad
>>
>>
>>
>> https://nenadnoveljic.com/blog/
>>
>>
>>
>> ____________________________________________________
>>
>> Please consider the environment before printing this e-mail.
>>
>> Bitte denken Sie an die Umwelt, bevor Sie dieses E-Mail drucken.
>>
>>
>> Important Notice
>> This message is intended only for the individual named. It may contain
>> confidential or privileged information. If you are not the named addressee
>> you should in particular not disseminate, distribute, modify or copy this
>> e-mail. Please notify the sender immediately by e-mail, if you have
>> received this message by mistake and delete it from your system.
>> Without prejudice to any contractual agreements between you and us which
>> shall prevail in any case, we take it as your authorization to correspond
>> with you by e-mail if you send us messages by e-mail. However, we reserve
>> the right not to execute orders and instructions transmitted by e-mail at
>> any time and without further explanation.
>> E-mail transmission may not be secure or error-free as information could
>> be intercepted, corrupted, lost, destroyed, arrive late or incomplete. Also
>> processing of incoming e-mails cannot be guaranteed. All liability of
>> Vontobel Holding Ltd. and any of its affiliates (hereinafter collectively
>> referred to as "Vontobel Group") for any damages resulting from e-mail use
>> is excluded. You are advised that urgent and time sensitive messages should
>> not be sent by e-mail and if verification is required please request a
>> printed version. Please note that all e-mail communications to and from the
>> Vontobel Group are subject to electronic storage and review by Vontobel
>> Group. Unless stated to the contrary and without prejudice to any
>> contractual agreements between you and Vontobel Group which shall prevail
>> in any case, e-mail-communication is for informational purposes only and is
>> not intended as an offer or solicitation for the purchase or sale of any
>> financial instrument or as an official confirmation of any transaction.
>> The legal basis for the processing of your personal data is the
>> legitimate interest to develop a commercial relationship with you, as well
>> as your consent to forward you commercial communications. You can exercise,
>> at any time and under the terms established under current regulation, your
>> rights. If you prefer not to receive any further communications, please
>> contact your client relationship manager if you are a client of Vontobel
>> Group or notify the sender. Please note for an exact reference to the
>> affected group entity the corporate e-mail signature. For further
>> information about data privacy at Vontobel Group please consult
>> www.vontobel.com.
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Feb 26 2020 - 20:29:04 CET

Original text of this message