Re: Old linux lore no longer applies

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Thu, 22 Mar 2012 21:05:41 +0000 (UTC)
Message-ID: <pan.2012.03.22.21.05.41_at_gmail.com>



On Thu, 22 Mar 2012 00:50:10 +0000, Mladen Gogala wrote:

> There is an interesting discussion on Oracle Support Community:
> http://tinyurl.com/7jfdet9. The interesting part is about the long
> standing rule that db_file_multiblock_read_count should not exceed the
> size of a single IO. On the old Linux systems, the IO limit used to be
> 1M. There were Linux kernel parameters like MAXPHYS and MAX-IO-SIZE that
> used to determine the maximum IO size.
> Those parameters are gone in the newer Linux kernels (tried with CentOS
> 6.2 and Fedora 15) as well as the 1MB limit. The limit no longer
> applies:
>
> I executed the following:
>
> strace -o /tmp/out dd if=/dev/zero of=/tmp/ttt bs=4M count=2
>
> Linux didn't break the writes into multiple system calls, it executed
> the 4MB write calls without breaking the sweat:
>
> pen("/tmp/ttt", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3 dup2(3,
> 1) = 1 close(3)
> = 0 clock_gettime(CLOCK_MONOTONIC, {1186, 956603209}) = 0 mmap2(NULL,
> 4206592, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0xb717c000 read(0,
> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0 \0"...,
> 4194304) = 4194304 write(1,
> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0 \0"...,
> 4194304) = 4194304 read(0,
> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0 \0"...,
> 4194304) = 4194304 write(1,
> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0 \0"...,
> 4194304) = 4194304 close(0) = 0 close(1)
>
> With the new FC HBA adapters, which contain significant amount of
> memory,
> it makes sense to use much larger db_file_multiblock_read_count. I saw
> performance improvement with setting it up to 16M on the machine in the
> office (8MB HBA on the DW machine).
>
> The old lore definitely doesn't apply any longer.

I tested 64bit Oracle 11.2 on Linux and db_file_multiblock_read count was set to 32,128 and 512. Oracle always reads 128 blocks. This parameter is no longer modifiable, it is hard-wired to 1MB, and it even says something to that effect in the reference manual, although it isn't at all clear what does it mean. The reference manual must have been written by a professional politician:

"DB_FILE_MULTIBLOCK_READ_COUNT is one of the parameters you can use to minimize I/O during table scans. It specifies the maximum number of blocks read in one I/O operation during a sequential scan. The total number of I/Os needed to perform a full table scan depends on such factors as the size of the table, the multiblock read count, and whether parallel execution is being utilized for the operation.

As of Oracle Database 10g release 2, the default value of this parameter is a value that corresponds to the maximum I/O size that can be performed efficiently. This value is platform-dependent and is 1MB for most platforms.Because the parameter is expressed in blocks, it will be set to a value that is equal to the maximum I/O size that can be performed efficiently divided by the standard block size. Note that if the number of sessions is extremely large the multiblock read count value is decreased to avoid the buffer cache getting flooded with too many table scan buffers.

Even though the default value may be a large value, the optimizer will not favor large plans if you do not set this parameter. It would do so only if you explicitly set this parameter to a large value.

Online transaction processing (OLTP) and batch environments typically have values in the range of 4 to 16 for this parameter. DSS and data warehouse environments tend to benefit most from maximizing the value of this parameter. The optimizer is more likely to choose a full table scan over an index if the value of this parameter is high.

The maximum value is the operating system's maximum I/O size expressed as Oracle blocks ((max I/O size)/DB_BLOCK_SIZE). If you set this parameter to a value greater than the maximum, Oracle uses the maximum."

So, it is 1MB for most of platforms, on platform basis. Unsurprisingly, Oracle isn't using the same IO method as dd, they're using io_submit to submit IO and io_getevents to poll for the results:

 io_getevents(47666528870400, 1, 128, {{0x2b5a3a9b7828, 0x2b5a3a9b7828, 1048576, 0}}, {600, 0}) = 1

io_submit(47666528870400, 1, {{0x2b5a3a9b7828, 0, 0, 0, 257}}) = 1
io_submit(47666528870400, 1, {{0x2b5a3a9b7e00, 0, 0, 0, 257}}) = 1
io_getevents(47666528870400, 1, 128, {{0x2b5a3a9b7828, 0x2b5a3a9b7828, 
1048576, 0}}, {600, 0}) = 1
io_submit(47666528870400, 1, {{0x2b5a3a9b7828, 0, 0, 0, 257}}) = 1 io_getevents(47666528870400, 1, 128, {{0x2b5a3a9b7e00, 0x2b5a3a9b7e00, 1048576, 0}}, {600, 0}) = 1
io_submit(47666528870400, 1, {{0x2b5a3a9b7e00, 0, 0, 0, 257}}) = 1 io_getevents(47666528870400, 1, 128, {{0x2b5a3a9b7828, 0x2b5a3a9b7828, 1048576, 0}}, {600, 0}) = 1
io_submit(47666528870400, 1, {{0x2b5a3a9b7828, 0, 0, 0, 256}}) = 1 io_getevents(47666528870400, 1, 128, {{0x2b5a3a9b7e00, 0x2b5a3a9b7e00, 1048576, 0}}, {600, 0}) = 1
io_submit(47666528870400, 1, {{0x2b5a3a9b7e00, 0, 0, 0, 256}}) = 1 io_getevents(47666528870400, 1, 128, {{0x2b5a3a9b7828, 0x2b5a3a9b7828, 1048576, 0}}, {600, 0}) = 1
io_submit(47666528870400, 1, {{0x2b5a3a9b7828, 0, 0, 0, 256}}) = 1 io_getevents(47666528870400, 1, 128, {{0x2b5a3a9b7e00, 0x2b5a3a9b7e00, 1048576,

The result was the same, regardless of whether I set the db_file_multiblock_read_count to 32,128 or 512. My brothers in arms, this is no longer a parameter. This brings back memories, since the time I have become the consultant of swing.

-- 
http://mgogala.byethost5.com
Received on Thu Mar 22 2012 - 16:05:41 CDT

Original text of this message