Re: exadata write performance problems

From: k a <karlarao_at_gmail.com>
Date: Wed, 13 Feb 2019 14:22:21 -0500
Message-ID: <CACNsJncEnjjo5LLdPbZ7LVTRHzu1xHGu3vV4W5v4ZOkxFUJT1A_at_mail.gmail.com>



I think even if you upgrade to a 1/4 rack you will still experience high latency on the hard disks. At 5000+ hard disk IOPS with 60-200 ms latency you are already pushing the spindles beyond what they can comfortably service, and a 1/4 rack can only do about 7800 hard disk IOPS according to the X6 data sheet.

You would really need to evaluate enabling WBFC (write-back flash cache). Do you have any lower environment where you can test this and run a similar workload?
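
For reference, this is roughly what the switch looks like per cell in CellCLI. It is a rolling change (one cell at a time, only after confirming the grid disks can be deactivated safely), and the exact steps depend on the cell software version, so treat this as an outline to validate against the current documentation rather than the procedure itself:

  # check that ASM can tolerate this cell's grid disks going offline
  cellcli -e list griddisk attributes name, asmdeactivationoutcome

  # older cell versions require dropping and recreating the flash cache
  cellcli -e drop flashcache
  cellcli -e alter cell shutdown services cellsrv
  cellcli -e alter cell flashCacheMode=WriteBack
  cellcli -e alter cell startup services cellsrv
  cellcli -e create flashcache all

  # confirm the new mode
  cellcli -e list cell attributes flashCacheMode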

On top of enabling WBFC, you have to implement resource management. You can start with IORM because you have 10 databases. But to come up with a sound IORM plan you need to do some further validation, break down your workload, and assess which databases make sense to prioritize or limit.
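
For illustration only, an inter-database IORM plan set on the cells could look something like this; the database names, levels, allocations and limits below are made up, and you would derive the real ones from that workload breakdown:

  CellCLI> alter iormplan objective='auto'
  CellCLI> alter iormplan dbplan=((name=CRITDB, level=1, allocation=60), -
                                  (name=BATCHDB, level=2, allocation=80, limit=50), -
                                  (name=other, level=3, allocation=100))
  CellCLI> list iormplan detail

The same commands would be pushed to every cell (for example with dcli) so they all enforce the same plan.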

If you don't opt to enable WBFC and instead go with flash-based grid disks, then you have to move your hot objects onto the new flash-based tablespace. So you need to qualify which objects to move across the 10 databases, and at the end of the day you may still hit the limit of your hard disk IOPS because of some runaway process working on objects that are not on flash. In my opinion, this requires more maintenance.
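
If you do go down that route, the rough shape of it is below. The sizes, redundancy, disk group and object names are purely illustrative, and carving flash grid disks means shrinking the (write-through) flash cache first:

  -- CellCLI, on each cell: free some flash and carve grid disks on it
  drop flashcache
  create flashcache all size=2000G
  create griddisk all flashdisk prefix=FLASHGD

  -- ASM: disk group on the new flash grid disks
  CREATE DISKGROUP FLASHDG NORMAL REDUNDANCY
    DISK 'o/*/FLASHGD*'
    ATTRIBUTE 'compatible.asm'='12.1.0.2', 'compatible.rdbms'='12.1.0.2',
              'cell.smart_scan_capable'='TRUE', 'au_size'='4M';

  -- Database: tablespace on flash, then relocate the hot segments
  CREATE TABLESPACE hot_flash DATAFILE '+FLASHDG' SIZE 100G AUTOEXTEND ON;
  ALTER TABLE app.hot_table MOVE TABLESPACE hot_flash;
  ALTER INDEX app.hot_table_pk REBUILD TABLESPACE hot_flash;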

-Karl

On Wed, Feb 13, 2019 at 1:08 AM Ls Cheng <exriscer_at_gmail.com> wrote:

> Hi Gopal
>
> Yes, I understand we have too many write IOPS, because this only happens
> when some massive DML kicks in. The problem is that there are 10 databases
> running and any of them can leave users blocked on the enq: KO event (which
> is actually happening). So the problem is identified, but expanding the cells
> from 1/8 rack to 1/4 rack takes a few months due to the company's internal
> process, so in the meantime we have to mitigate these write problems.
>
> The flash is being used as write-through cache, so read performance is
> good; we only have problems with write performance, which happens probably
> 4 or 5 times during the day (when some batch runs). The cells are X6-2,
> running 18.1.5.0.0.
>
> I also observe something strange: a 128 KB table with 50 rows is being
> smart scanned in the cells even though the buffer cache is 25 GB. This
> increases checkpoint activity as well.
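>
> For what it's worth, an ASH query along these lines should show which SQL is
> driving the smart scans on that small table (the owner and table name below
> are just placeholders):
>
>   SELECT sql_id, event, COUNT(*) AS samples
>   FROM   gv$active_session_history
>   WHERE  event IN ('cell smart table scan', 'direct path read')
>   AND    current_obj# = (SELECT object_id
>                          FROM   dba_objects
>                          WHERE  owner = 'APP'
>                          AND    object_name = 'SMALL_TAB')
>   GROUP  BY sql_id, event
>   ORDER  BY samples DESC;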
>
> Thank you
>
>
>
> On Wed, Feb 13, 2019 at 2:38 AM K Gopalakrishnan <kaygopal_at_gmail.com>
> wrote:
>
>> Hi - looks like you are oversubscribing I/O (random I/O) by 300%. enq: KO
>> sometimes comes from a truncate, which induces a checkpoint (and a
>> controlfile update) on top of already overloaded I/O.
>> One option is to enlarge the buffer cache (thus reducing disk I/O), or to
>> try creating a disk group out of flash and placing the high-I/O tablespaces
>> (and undo as well) on it.
>> In short, you are short on IOPS. Can you make use of the flash cards in the
>> storage servers? BTW, which version / model of cell servers are you using?
>>
>> -KG
>>
>> On Tue, Feb 12, 2019 at 3:17 PM Ls Cheng <exriscer_at_gmail.com> wrote:
>>
>>> Hi
>>>
>>> IHAC (I have a customer) with a 1/8 Exadata X6-2 with High Capacity disks
>>> who is having heavy performance problems whenever some massive DML
>>> operation kicks in. Since this is a 1/8 configuration, the IOPS available
>>> for writes is not high, roughly 1200 IOPS, yet I am seeing as high as 4000
>>> physical writes per second at peak time. When this happens user sessions
>>> start suffering because they are blocked by "enq: KO - fast object
>>> checkpoint", which in turn is blocked by "control file parallel write"
>>> from CKPT. So the idea is to alleviate CKPT. This is from historical ASH:
>>>
>>> INSTANCE_NUMBER SAMPLE_TIME                EVENT                        TIME_WAITED SESSION  P1   P2  P3
>>> --------------- -------------------------- ---------------------------- ----------- ------- --- ---- ---
>>>               2 12-FEB-19 12.11.24.540 AM  control file parallel write      1110465 WAITING   2   41   2
>>>               2 12-FEB-19 12.16.34.754 AM  Disk file Mirror Read            1279827 WAITING   0    1   1
>>>               1 12-FEB-19 12.16.44.012 AM  control file parallel write      1820977 WAITING   2   39   2
>>>               2 12-FEB-19 12.20.34.927 AM  control file parallel write      1031042 WAITING   2  856   2
>>>               1 12-FEB-19 12.21.14.256 AM  control file parallel write      1905266 WAITING   2    3   2
>>>               2 12-FEB-19 12.21.14.977 AM  control file parallel write      1175924 WAITING   2   42   2
>>>               1 12-FEB-19 12.21.54.301 AM  control file parallel write      2164743 WAITING   2  855   2
>>>               2 12-FEB-19 12.22.35.036 AM  control file parallel write      1581684 WAITING   2    4   2
>>>               1 12-FEB-19 12.23.44.381 AM  control file parallel write      1117994 WAITING   2    3   2
>>>               1 12-FEB-19 12.23.54.404 AM  control file parallel write      4718841 WAITING   2    3   2
>>>
>>> When this happens we observe these cell metrics:
>>>
>>> CELL METRICS SUMMARY
>>>
>>> Cell Total Flash Cache: IOPS=13712.233 Space allocated=6083152MB
>>> == Flash Device ==
>>> Cell Total Utilization: Small=27.8% Large=14.2%
>>> Cell Total Throughput: MBPS=471.205
>>> Cell Total Small I/Os: IOPS=9960
>>> Cell Total Large I/Os: IOPS=6005
>>>
>>> == Hard Disk ==
>>> Cell Total Utilization: Small=69.5% Large=18.7%
>>> Cell Total Throughput: MBPS=161.05
>>> Cell Total Small I/Os: IOPS=5413.618
>>> Cell Total Large I/Os: IOPS=166.2
>>> Cell Avg small read latency: 245.67 ms
>>> Cell Avg small write latency: 62.64 ms
>>> Cell Avg large read latency: 308.99 ms
>>> Cell Avg large write latency: 24.65 ms
>>>
>>>
>>> We cannot enable write-back flash cache right now because that may cause
>>> other problems, and although we are in the process of upgrading the cells
>>> from 1/8 to 1/4 rack, that is going to take some months. I know it is not
>>> a best practice, but in the meantime I was thinking of carving out some
>>> flash space, creating grid disks on it, and storing the controlfiles in
>>> flash. Does anyone have experience with such a setup?
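>>>
>>> Roughly I mean something like the following, assuming a small flash-based
>>> disk group (say +CTLFLASH) has already been carved out of the cells; all
>>> names and paths below are placeholders:
>>>
>>>   -- point the instances at the new disk group for the controlfiles
>>>   ALTER SYSTEM SET control_files='+CTLFLASH','+RECO' SCOPE=SPFILE SID='*';
>>>
>>>   -- then, with the database restarted in NOMOUNT, from RMAN:
>>>   RESTORE CONTROLFILE FROM '+DATA/MYDB/CONTROLFILE/current.261.123456789';
>>>   ALTER DATABASE MOUNT;
>>>   ALTER DATABASE OPEN;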
>>>
>>> TIA
>>>
>>>
>>>
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Feb 13 2019 - 20:22:21 CET
