Re: IO latency due to disk scrub

From: Rajesh Aialavajjala <r.aialavajjala_at_gmail.com>
Date: Sun, 2 May 2021 09:31:04 -0400
Message-ID: <CAGvtKv4+ruAVGmYh4GfgFKqFarJnJt3L+fLVRUbYDU4PU7b7aQ_at_mail.gmail.com>



Pap,
It appears that the bug I had referenced is an unpublished one. The reference is "Disk Scrubbing causing I/O Performance issue on Exadata Cell Server (Doc ID 1684213.1)".

Do you have HC disks or HP disks in your X3 (presuming 2-socket) machine? The X3, as you know, offered both HC and HP disks (168 x 600 GB 15,000 RPM High Performance disks or 168 x 3 TB 7,200 RPM High Capacity disks).

I have a question about the Exadata image version. You mentioned "exadata image version is 19.0.0.0.0", but there was no 19.0 ESS; the first release of ESS 19.x was 19.1.0.0.0.

1) Is it okay to stop the disk scrubbing fully or, say, change the frequency to once a month? What would be the downsides of turning it off or reducing the suggested frequency of disk scrubbing?

>>> I would not suggest turning off scrubbing entirely. You may alter the
interval pursuant to the MOS note you referenced.
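
As a rough illustration of that interval change (a sketch, not taken from the MOS note): the cell attributes involved are hardDiskScrubInterval and hardDiskScrubStartTime, and the interval values named in this thread are daily, weekly, biweekly and none. The Python helper below only builds the CellCLI statement text and rejects any other value. The function names are made up for this example, and the ALTER CELL syntax is an assumption that should be checked against the documentation for your image version before anything is run on a cell.

```python
# Interval values named in this thread; verify them for your image version.
VALID_INTERVALS = ("daily", "weekly", "biweekly", "none")


def scrub_interval_command(interval):
    """Return the CellCLI statement text that sets hardDiskScrubInterval.

    Hypothetical helper: it only formats a string, it does not contact a cell.
    """
    value = interval.strip().lower()
    if value not in VALID_INTERVALS:
        raise ValueError(
            f"unsupported interval {interval!r}; expected one of {VALID_INTERVALS}"
        )
    return f"ALTER CELL hardDiskScrubInterval={value}"


def scrub_start_command(start_time):
    """Return the CellCLI statement text that sets hardDiskScrubStartTime.

    start_time is either the literal 'now' or a date/time string. The
    date/time string is quoted and passed through without validation.
    """
    value = start_time.strip()
    if value.lower() == "now":
        return "ALTER CELL hardDiskScrubStartTime=now"
    return f"ALTER CELL hardDiskScrubStartTime='{value}'"


if __name__ == "__main__":
    # Print the statements so they can be reviewed before use in CellCLI.
    print(scrub_interval_command("weekly"))
    print(scrub_start_command("now"))
```

A value such as "monthly" raises ValueError here simply because it is not among the options listed in this thread; whether a longer interval is available is something to confirm in the MOS note.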

2) One thing we observe during this period is that hard disk IO utilization goes up to ~80%, but no such impact/spike is observed on flash disk IO utilization. So does it mean that scrubbing only happens on the cell hard disks but not on the cell flash disks?

>>> To the best of my understanding, the "scrub" process only applies to
spinning disks.

3) I see the section below in the Oracle doc, which states that scrubbing is not required for EF disks or cells. So does it mean that our current X3 machine is not Extreme Flash but a combination of flash disks + spinning/hard disks, and thus only the hard disks go through disk scrubbing, impacting hard disk IO without any negative impact on flash IO?

>>> Disk scrubbing on your X3-2 HC cells will only function on the
spinning drives, not on the PCI flash cards.

Thanks,

--Rajesh

On Thu, Apr 29, 2021 at 2:49 PM Pap <oracle.developer35_at_gmail.com> wrote:

> The database version is 12.1.0.2.0 and the exadata image version is
> 19.0.0.0.0. Do you have some reference to the exact bug which can cause
> such disk latency during disk scrubbing?
>
> We see it's mainly the "cell multiblock physical read" response time
> which increased from ~10 ms to ~20 ms and impacted the query, as its
> response time doubled, i.e. increased from ~15 minutes to ~30 minutes.
>
> I have a few more questions:
>
> 1) Is it okay to stop the disk scrubbing fully or, say, change the
> frequency to once a month? What would be the downsides of turning it off
> or reducing the suggested frequency of disk scrubbing?
>
> 2) One thing we observe during this period is that hard disk IO
> utilization goes up to ~80%, but no such impact/spike is observed on
> flash disk IO utilization. So does it mean that scrubbing only happens on
> the cell hard disks but not on the cell flash disks?
>
> 3) I see the section below in the Oracle doc, which states that scrubbing
> is not required for EF disks or cells. So does it mean that our current X3
> machine is not Extreme Flash but a combination of flash disks +
> spinning/hard disks, and thus only the hard disks go through disk
> scrubbing, impacting hard disk IO without any negative impact on flash IO?
>
> EXADATA HardDisk Scrubbing (Doc ID 2094581.1)
>
> Is Disk Scrubbing needed on Extreme Flash Cells and Disks
>
> Scrubbing only is necessary on spinning disk cells (High Capacity or older
> High Performance models) and that it's not necessary to configure on EF
> disks or cells.
>
>
>
>
> On Thu, Apr 29, 2021 at 6:46 PM Rajesh Aialavajjala <
> r.aialavajjala_at_gmail.com> wrote:
>
>> Pap,
>> You did not mention the version of the Exadata software that is running
>> on the mentioned X3. What version are you running ?
>>
>> There was, to my understanding, an old bug that documented the disk
>> scrub impacting user workloads, which should be fixed by now. The
>> scrubbing is designed to only run when average I/O utilization is under 25%.
>>
>> Parameters controlling the disk scrubbing:
>>
>> hardDiskScrubInterval - sets the interval for proactive resilvering of
>> latent bad sectors. Valid options are daily, weekly, biweekly and none.
>> Using the none option stops all disk scrubbing.
>>
>> hardDiskScrubStartTime - sets the start time for proactive resilvering
>> of latent bad sectors. Valid options are a date/time combination or now.
>>
>> Thanks,
>>
>> --Rajesh
>>
>>
>>
>> On Tue, Apr 27, 2021 at 12:15 AM Pap <oracle.developer35_at_gmail.com>
>> wrote:
>>
>>> Hi Listers, We are using an Exadata X3 machine, a Full Rack, i.e. 14
>>> storage cell servers. We are seeing high disk IO utilization while disk
>>> scrubbing is in progress, and it is impacting the application jobs. Disk
>>> IO utilization spikes from ~20% in BAU time to ~80-85% while disk
>>> scrubbing is in progress. The process runs once every two weeks, which I
>>> believe is the default schedule.
>>>
>>> My question is whether it's okay to reschedule this process
>>> (harddiskscrubinterval) to run once a month. Does that have any
>>> negative impact? And is there any other possible way to keep it from
>>> degrading disk IO so much?
>>>
>>> Regards
>>> Pap
>>>
>>

--
http://www.freelists.org/webpage/oracle-l
Received on Sun May 02 2021 - 15:31:04 CEST
