Re: exadata write performance problems

From: K Gopalakrishnan <kaygopal_at_gmail.com>
Date: Tue, 12 Feb 2019 19:37:37 -0600
Message-ID: <CAN5iexEbvKocrrkUcMyp+rh99BeNRGxpJ9igYJ9V++YpX0XF7g_at_mail.gmail.com>



Hi- Looks like you are oversubscribing IO (random IO) by 300%. KO waits sometimes come from truncate, which will induce a checkpoint (and a controlfile update) on an already overloaded IO subsystem. One option is to enlarge the buffer cache (thus reducing the disk IO), or try creating a DG out of flash and placing the high IO tablespaces (and undo as well) there.
In short, you are short on IOPS. Try to make use of the FLASH cards from the storage servers? BTW which version / model of cell servers are you using?
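
A rough sketch of the arithmetic behind the 300% figure, using only the numbers quoted below (roughly 1200 write IOPS available, peaks of about 4000 physical writes per second). Treating each physical write as one disk I/O is a simplifying assumption, and the function name is just illustrative:

```python
# Figures quoted in the original post (both approximate).
WRITE_IOPS_CAPACITY = 1200    # "roughly 1200 IOPS" for the 1/8 configuration
PEAK_PHYSICAL_WRITES = 4000   # "as high as 4000 Physical Writes Per Sec"


def oversubscription(demand: float, capacity: float) -> float:
    """Return demand expressed as a multiple of capacity."""
    return demand / capacity


# Assumption: one physical write corresponds to one disk I/O.
ratio = oversubscription(PEAK_PHYSICAL_WRITES, WRITE_IOPS_CAPACITY)
print(f"peak demand is {ratio:.1f}x capacity ({ratio:.0%})")
```

Anything well above 1.0x means the writes have to queue on the hard disks, which fits the high small-write latencies in the cell metrics.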

-KG

On Tue, Feb 12, 2019 at 3:17 PM Ls Cheng <exriscer_at_gmail.com> wrote:

> Hi
>
> IHAC who has a 1/8 Exadata X6-2 with High Capacity disks and is having
> heavy performance problems whenever some massive DML operation kicks in.
> Since this is a 1/8 configuration, the IOPS available for write operations
> is not high, roughly 1200 IOPS. I am seeing as high as 4000 Physical Writes
> Per Sec at peak time. When this happens the user sessions start suffering
> because they are blocked by enq: KO - fast object checkpoint, which is
> blocked by "control file parallel write" by CKPT. So the idea is to
> alleviate CKPT. This is from historical ASH:
>
> INSTANCE_NUMBER SAMPLE_TIME               EVENT                       TIME_WAITED SESSION P1  P2 P3
> --------------- ------------------------- --------------------------- ----------- ------- -- --- --
>               2 12-FEB-19 12.11.24.540 AM control file parallel write     1110465 WAITING  2  41  2
>               2 12-FEB-19 12.16.34.754 AM Disk file Mirror Read           1279827 WAITING  0   1  1
>               1 12-FEB-19 12.16.44.012 AM control file parallel write     1820977 WAITING  2  39  2
>               2 12-FEB-19 12.20.34.927 AM control file parallel write     1031042 WAITING  2 856  2
>               1 12-FEB-19 12.21.14.256 AM control file parallel write     1905266 WAITING  2   3  2
>               2 12-FEB-19 12.21.14.977 AM control file parallel write     1175924 WAITING  2  42  2
>               1 12-FEB-19 12.21.54.301 AM control file parallel write     2164743 WAITING  2 855  2
>               2 12-FEB-19 12.22.35.036 AM control file parallel write     1581684 WAITING  2   4  2
>               1 12-FEB-19 12.23.44.381 AM control file parallel write     1117994 WAITING  2   3  2
>               1 12-FEB-19 12.23.54.404 AM control file parallel write     4718841 WAITING  2   3  2
>
> When this happens we observe these cell metrics
>
> CELL METRICS SUMMARY
>
> Cell Total Flash Cache: IOPS=13712.233 Space allocated=6083152MB
> == Flash Device ==
> Cell Total Utilization: Small=27.8% Large=14.2%
> Cell Total Throughput: MBPS=471.205
> Cell Total Small I/Os: IOPS=9960
> Cell Total Large I/Os: IOPS=6005
>
> == Hard Disk ==
> Cell Total Utilization: Small=69.5% Large=18.7%
> Cell Total Throughput: MBPS=161.05
> Cell Total Small I/Os: IOPS=5413.618
> Cell Total Large I/Os: IOPS=166.2
> Cell Avg small read latency: 245.67 ms
> Cell Avg small write latency: 62.64 ms
> Cell Avg large read latency: 308.99 ms
> Cell Avg large write latency: 24.65 ms
>
>
> We cannot enable write-back flash cache right now because that may
> cause other problems, and although we are in the process of upgrading the
> 1/8 cells to 1/4 cells, it is going to take some months. I know it is not a
> best practice, but in the meantime I was thinking of carving out some flash
> space, creating grid disks from it, and storing the controlfiles in flash.
> Does anyone have experience with such a setup?
>
> TIA
>
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Feb 13 2019 - 02:37:37 CET
