Re: Exadata storage cell metrichistory for controlfile

From: Karl Arao <karlarao_at_gmail.com>
Date: Sun, 3 Apr 2022 17:24:09 -0400
Message-ID: <CACNsJncFtvEqKrcA40_zmXaRa1otOFAto5VLCB4417ChJ0fE5Q_at_mail.gmail.com>



On your enq: CF issue:
it may be related to this, or it could be something else - Bug 12904429 : RMAN DELETE ACHIVELOG SCANS FILENAME SECTIONS WHILE HOLDING CF ENQUEUE. I suggest opening a service request for this issue and uploading your diagnostic data (trace files generated, alert log, etc.). The key info in the trace files is the enqueue blocker and its call stack. You may find the blocker is stuck in some system call or held up by something else.
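While you wait on the SR, you can also hunt for the blocker yourself from ASH. A rough sketch against gv$active_session_history (adjust the filters to your environment; the enq: CF wait may show up under lgwr or a foreground):

```sql
-- who is stuck on enq: CF, and which session is blocking them
select sample_time, inst_id, session_id, program,
       blocking_inst_id, blocking_session, sql_id, event
from   gv$active_session_history
where  event like 'enq: CF%'
order  by sample_time;
```

Then pull the blocking_session's own ASH rows around the same sample_time to see what it was executing while it held the CF enqueue.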

On the instrumentation side, where the perf data in memory is not persisted when the instance crashes, there are a few ways you can approach this:

  1. The simplest is making your AWR snapshots shorter. You can do a 1-min AWR snap through a job, or make it even shorter through a script. If you can reproduce the issue at will, enable the AWR_1MIN_SNAP job and run your test; since your issue is an enqueue, it should take a while before the instance finally crashes, and when it does you will have enough data to see what is happening leading up to the crash. From there I would check the ASH wait chains: https://karlarao.github.io/karlaraowiki/index.html#%5B%5B1min%20AWR%20snapshot%5D%5D
  2. You can dump the data periodically as CSV, something like this: https://github.com/karlarao/gvash_to_csv - you'll see in the repo there's a doc that shows how to save the data using a metric extension (you probably don't need that in your case).
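For option 1, a minimal sketch of a 1-minute snapshot job (the job name is just a label; assumes you have EXECUTE on dbms_workload_repository and dbms_scheduler):

```sql
begin
  dbms_scheduler.create_job(
    job_name        => 'AWR_1MIN_SNAP',   -- arbitrary name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'begin dbms_workload_repository.create_snapshot; end;',
    repeat_interval => 'FREQ=MINUTELY;INTERVAL=1',
    enabled         => TRUE,
    comments        => 'temporary 1-min AWR snapshots for enq: CF diagnosis');
end;
/
```

Remember to dbms_scheduler.drop_job it after the test - 1-min snapshots will grow SYSAUX quickly if left running.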

-Karl

On Fri, Apr 1, 2022 at 11:41 PM Moustafa Ahmed <moustafa_dba_at_hotmail.com> wrote:

> Karl
>
> The issue in hand results in instance crash before persisting data into
> AWR (DBA_HIST*)
> So the usage of any related AWR view here won’t work.
> Simply, the control files on the data and reco disk groups get smashed with
> reads and writes, causing lgwr to get held in "enq: CF - contention".
> So basically I'm trying to find which controlfile, data vs reco, is the one
> that suffers the most. We already know it is the archive log delete that
> causes these massive cf writes and reads.
>
>
>
> On Apr 1, 2022, at 8:19 PM, Karl Arao <karlarao_at_gmail.com> wrote:
>
> 
> Hi Moustafa,
>
> LIST METRICHISTORY is categorized by cell, db, and cg (consumer group)
> then from here it could be split into CD (celldisk) and FD (flashdisk).
> You can confirm this with:
> LIST METRICDEFINITION ATTRIBUTES objectType, metricType, name, unit,
> description
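> For example, something like this pulls the celldisk IO request metrics
> (syntax from memory - the LIKE pattern is a regex in CellCLI - so verify
> against your cell software version):
>
> ```
> CellCLI> LIST METRICHISTORY WHERE objectType = 'CELLDISK' AND name LIKE 'CD_IO_RQ_.*' ATTRIBUTES name, metricObjectName, metricValue, collectionTime
> ```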
>
> the only way to get a by disk group breakdown is by using the views -
> gv$asm_disk_stat, gv$asm_disk_iostat, gv$asm_diskgroup_stat
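> A sketch of that breakdown, per disk group and per db (column names from
> memory, so sanity check against your version's view definitions):
>
> ```sql
> select g.name diskgroup, i.dbname,
>        sum(i.reads)  reads,  sum(i.bytes_read)    bytes_read,
>        sum(i.writes) writes, sum(i.bytes_written) bytes_written
> from   gv$asm_disk_iostat i
> join   gv$asm_diskgroup_stat g
>   on   g.inst_id = i.inst_id and g.group_number = i.group_number
> group  by g.name, i.dbname
> order  by 1, 2;
> ```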
>
> On the example image below, CD_00_vceladm01 is split into DATA and RECO
> disks, which then becomes {DATA,RECO}_CD_00_vceladm01
>
> CD_00_vceladm01 is what you'll see categorized into cell, db, and cg in
> LIST METRICHISTORY
> {DATA,RECO}_CD_00_vceladm01 is then abstracted by gv$asm_* views
>
>
>
> https://user-images.githubusercontent.com/3683046/161356058-ac43d058-afce-416b-9058-947f2219d4ec.png
>
>
>
> I use Bertrand's script to show by disk group and by db level IOPS
>
> script here: https://github.com/bdrouvot/asm_metrics
> usage:
> ./asm_metrics.pl -show=dbinst -display=snap,avg -interval=5
> -sort_field=iops
> ./asm_metrics.pl -show=dbinst,dg -display=snap,avg -interval=5
> -sort_field=iops
>
>
> -Karl
>
>
>
>
> On Fri, Apr 1, 2022 at 12:38 PM Moustafa Ahmed <moustafa_dba_at_hotmail.com>
> wrote:
>
>> Hello listers
>>
>> On Exadata is it possible to use (list metrichistory) to show stats for
>> an object like control file on specific diskgroup?
>> Also is it possible to use (list metrichistory) for a specific disk group
>> in general ?
>>
>> Thank you!
>> --
>> http://www.freelists.org/webpage/oracle-l

Received on Sun Apr 03 2022 - 23:24:09 CEST
