Oracle 12.1.0.2, RHEL 7 and XFS issue

From: Uwe Küchler <uwe_at_kuechler.org>
Date: Thu, 30 Jul 2015 21:14:48 +0200
Message-ID: <61b5e45bbd2789160f5f62c265701433.webmail_at_mx1bln1.prossl.de>



Dear fellows of the Oracle,

From Red Hat and Oracle Linux 7 onwards, XFS is the default file system of the OS.
At a customer site, XFS was already the preferred file system, so the customer chose to stick to it for a new VM with OL7 and Oracle 12.1.0.2.

But while testing the migrated database against the old one, most of the batch jobs took at least twice as long as in the 11.1 environment.

Both statspack reports showed clearly that "db file sequential read" was by far the main wait event.
Top SQL and their explain plans did not differ between the environments.

Research took a while, but to get to the point: It boiled down to the I/O response times, as shown in the wait event histogram excerpts below:

With

- 24 GiB RAM
-  5 GiB sga_target
-    Buffer Cache:     4,592M
- "filesystemio_options=NONE":

                           Total ----------------- % of Waits ------------------
Event                      Waits  <1ms  <2ms  <4ms  <8ms <16ms <32ms  <=1s   >1s
-------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
db file scattered read       836  89.7   1.6    .8   1.8   3.1   1.9   1.1
db file sequential read     121K  83.8    .7    .8   3.9   6.1   2.8   1.9    .0

80-90% of those waits < 1ms?
This can most certainly be attributed to file system caching (no Flash Cache, SSD or other smart stuff in place here).

With

- 24 GiB RAM
-  8 GiB sga_target
-    Buffer Cache:     5,856M
- "filesystemio_options=SETALL":

                           Total ----------------- % of Waits ------------------
Event                      Waits  <1ms  <2ms  <4ms  <8ms <16ms <32ms  <=1s   >1s
-------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
db file scattered read       63K  42.6   1.9   3.2  19.9  26.0   4.0   2.4
db file sequential read     208K  47.1   1.6   3.7  19.0  21.9   4.5   2.3    .0

In other batch job runs, the share of waits < 1 ms was even lower (some 30%).
As you can see, I made the SGA / the buffer cache bigger in the 12c environment, to allow for more buffering within the SGA.
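
Put as absolute numbers, the "db file sequential read" rows of the two histograms above compare roughly as follows. This is only a back-of-the-envelope sketch: the wait totals are the rounded figures Statspack printed (121K and 208K), the two runs did not issue the same number of reads, and the helper name `slow_waits` is made up for illustration.

```python
# Figures copied from the two "db file sequential read" histogram rows above.
# Wait totals are the rounded values Statspack printed (121K, 208K).
runs = {
    "filesystemio_options=NONE": {"waits": 121_000, "pct_under_1ms": 83.8},
    "filesystemio_options=SETALL": {"waits": 208_000, "pct_under_1ms": 47.1},
}


def slow_waits(waits, pct_under_1ms):
    """Approximate number of waits that took 1 ms or longer."""
    return round(waits * (100.0 - pct_under_1ms) / 100.0)


for label, run in runs.items():
    print(label, slow_waits(run["waits"], run["pct_under_1ms"]))
```

So roughly 20K single-block reads took 1 ms or longer in the first run, against roughly 110K in the second.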

Of course, I checked MOS for any known issues with Direct I/O on XFS in this configuration, but haven't found anything so far. Just the usual recommendations to avoid double buffering and also the confirmation that XFS is capable of doing direct I/O.

And now for my QUESTION(s):

- Do you know of any issues with XFS on Linux 7 with direct I/O?
- Do you have any suggestions on how to further track down the issue? E.g., how could I prove there's something wrong with the O_DIRECT calls?
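
One possible way to approach the last question is to time O_DIRECT reads outside of Oracle and bucket them the same way Statspack does. The following is a minimal sketch, not a verified diagnostic: it assumes Linux, Python 3.7 or later, an 8 KiB block size, and a readable test file on the XFS mount in question. All function names are invented for illustration. It measures the file system and storage path only, not the calls Oracle itself issues.

```python
import mmap
import os
import random
import time

# Upper bucket bounds in milliseconds, mirroring the Statspack histogram columns.
BUCKETS_MS = [1, 2, 4, 8, 16, 32, 1000]
LABELS = ["<1ms", "<2ms", "<4ms", "<8ms", "<16ms", "<32ms", "<=1s", ">1s"]


def bucket_label(latency_ms):
    """Return the histogram column a single read latency falls into."""
    for bound, label in zip(BUCKETS_MS[:-1], LABELS):
        if latency_ms < bound:
            return label
    return "<=1s" if latency_ms <= 1000 else ">1s"


def histogram(latencies_ms):
    """Percentage of reads per bucket, like the '% of Waits' columns."""
    counts = dict.fromkeys(LABELS, 0)
    for ms in latencies_ms:
        counts[bucket_label(ms)] += 1
    total = len(latencies_ms) or 1
    return {label: 100.0 * n / total for label, n in counts.items()}


def time_direct_reads(path, block_size=8192, samples=1000):
    """Time random single-block reads opened with O_DIRECT (Linux only).

    block_size=8192 is an assumption; the file must hold at least one block.
    """
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    # An anonymous mmap is page-aligned, which O_DIRECT buffers need to be.
    buf = mmap.mmap(-1, block_size)
    try:
        blocks = os.fstat(fd).st_size // block_size
        latencies = []
        for _ in range(samples):
            offset = random.randrange(blocks) * block_size
            start = time.perf_counter()
            os.preadv(fd, [buf], offset)
            latencies.append((time.perf_counter() - start) * 1000.0)
        return latencies
    finally:
        buf.close()
        os.close(fd)
```

Calling `histogram(time_direct_reads("/path/to/testfile"))` (the path is a placeholder) and repeating the run without `os.O_DIRECT` should show whether the sub-millisecond share drops in the same way outside the database.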

Thanks for your time.
Uwe

P.S.: On (a hundred-and-) second thought I could try to enlarge the buffer cache even more, as there's enough RAM left. At least for the tests.

P.P.S.: In case Kevin Closson reads this: I am eagerly awaiting your upcoming blog article on XFS!

---
http://oraculix.com


--
http://www.freelists.org/webpage/oracle-l
Received on Thu Jul 30 2015 - 21:14:48 CEST
