Re: Performance issue - high Free buff wait

From: Pap <oracle.developer35_at_gmail.com>
Date: Thu, 14 Jan 2021 01:23:16 +0530
Message-ID: <CAEjw_fiJ24phbuRZ23DjL6aXjQ5VbjCN0Ggu38WOkm_3y8qZmQ_at_mail.gmail.com>





Thank You so much.

Attached is the DASH results. I am not sure how I should interpret the WAIT_CHAIN column. But it shows Top ~338 sessions showing to be on ON CPU only and contributing to ~29% activity followed by free buffer waits which is ~17% of the overall activity.

Want to understand, from what figure you came to know this one *"you have 64 cores most probably you reached almost peak CPU usage during that period "?*

Does it mean that we are actually overloading the system during those ~15minutes duration and thus the contention and slowness is observed considering current ~64 cpu capacity and thus only option here should be to ask the application team to reduce the max connection at any point in time to the database or say reducing the max session limit at DB side?

Regarding the rollbacks , i need to check , if those are real "rollbacks" or some keep alive query from application is logging those.

Thanks And Regards
Pap

On Wed, Jan 13, 2021 at 11:53 PM Laurentiu Oprea < laurentiu.oprea06_at_gmail.com> wrote:

> Hello Pap,
>
> Is not just those wait events, there is clearly a significant
> difference in load between the 2 periods you mentioned.
>
> -> the moment you open the excel you can see DB time 237.29 (mins)
> vs 874.29 (mins)
> Based on the fact that you have 64 cores most probably you reached
> almost peak CPU usage during that period maybe you was in the position
> where processes competed to get time on cpu.
>
> going down
> -> db block changes 9720600 vs 53870404 looks like transactions do more
> work ( more than 5 times more)
> -> redo size 2448278000 vs 11624470400 correlated with above metric you
> generated 5 times more redo info
> -> user rollbacks / user commits -> both look high , you can check with
> dev guys why you commit so often and why you have the rollbacks (do you
> have any errors in alertlog, to they have any timeout at app level)
>
> As you might be cpu bound during this peak period you might be storage
> bound as well. It is possible for the whole system to be slowed down by
> LGWR.
> You can validate this using Tanel`s chain waits:
>
> https://github.com/tanelpoder/tpt-oracle/blob/master/ash/dash_wait_chains.sql
>
> you can use it like this:
> _at_dash_wait_chains event2 1=1 "TIMESTAMP'2021-01-12 20:00:00'"
> "TIMESTAMP'2021-01-12 20:15:00'"
> and post output
>
> Good luck.
>
>
> În mie., 13 ian. 2021 la 16:05, Pap <oracle.developer35_at_gmail.com> a
> scris:
>
>> Not sure why , but the first email I dropped on this question was bounced
>> back from oracle-l, so trying again.
>>
>> Hello experts,
>>
>> We have a database with Oracle version 11.2.0.4. We are getting
>> complaints of slowness in one of our jobs mainly for ~15minutes duration on
>> a daily basis. And there is no plan change or any Undo reads for the
>> underlying sqls and also there is not one sql which we can singled out for
>> that job duration and that job consists of many small queries(selects,
>> inserts running many thousand times in that duration). But while
>> seeing/comparing the AWR for that ~15minutes period VS another ~15minutes
>> just before that, we are seeing a few odd waits , they are a combination of
>> Configuration(free buffer waits) followed by concurrency(buffer busy, index
>> contention etc) foreground waits. We do see comparatively higher sessions
>> during this ~15minutes window as compared to normal time.
>>
>> In the AWR under section "IOStat by Function summary" the DB writer avg
>> response time is logged as ~103 ms vs in normal period it stays ~35ms. And
>> then checking v$iostat_file , we see it has ASYNCH_IO set as OFF for both
>> "data file" and "temp file" which I have seen in many databases normally
>> set to ON. Also in dba_hist_active_sess_history the top waits events are
>> showing as below. I have attached excel with specific sections of AWR
>> during the issue period and normal period in two different tabs.
>>
>> 1)So , is it correct that by turning this ASYNCH_IO "ON" for data/temp
>> file , will cater all these issues because these all seem to be triggering
>> from the slow DB writer performance?
>>
>> Or
>>
>> 2)Should we ask the application team to reduce the total number of
>> sessions(maybe by decreasing the max connection limit) which are submitting
>> to the Database at that point to reduce contention?
>>
>> *Below is count of waits from Dba_hist_active_sess_history for that
>> ~15minutes window:-*
>>
>> Top two(log file sync and db file async I/O submit) are not associated
>> with application user/session.
>>
>> EVENT COUNT(*)
>> log file sync 260
>> db file async I/O submit 188
>> free buffer waits 72
>> 66
>> 65
>> 63
>> 62
>> db file sequential read 56
>> write complete waits 54
>> 54
>> db file sequential read 51
>>
>> Regards
>>
>> Pap
>>
>>
>>
>>



--
http://www.freelists.org/webpage/oracle-l



  • application/vnd.openxmlformats-officedocument.spreadsheetml.sheet attachment: DASH_Results.xlsx
Received on Wed Jan 13 2021 - 20:53:16 CET

Original text of this message