Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Re: os io unit , db block size relation

From: FM <fabrizio.magni_at_mycontinent.com>
Date: Thu, 16 Sep 2004 10:30:28 +0200
Message-ID: <41494F24.2070004@mycontinent.com>


Well, we seem to be interested in the same topic: Oracle and direct I/O. I don't have all the answers to your questions, and the best person to solve the mystery could be Sébastien Godard (whom I'm going to contact soon).

As far as I can see, you are using buffered reads (and you are not really writing anywhere), so some of the I/O is probably not physical but only logical. Moreover, the values are per second.

iostat doesn't give you absolute values, and I doubt it is the right instrument for what you want to discover.
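
To make the per-second point concrete, here is a rough sketch in Python that turns the quoted iostat rates into totals. It assumes the line shown is the report for one 20-second interval (not the first since-boot report), that a sector is 512 bytes, and that nothing else was reading from sdm1 at the time. The function and variable names are my own, not part of any tool.

```python
SECTOR_BYTES = 512   # sector unit assumed for iostat's rsec/s and wsec/s
INTERVAL_S = 20      # from "iostat -x /dev/sdm1 20"; assumed to be the sampled interval

# Figures copied from the quoted iostat line for sdm1.
r_s, w_s = 100.05, 0.40          # read/write requests per second
rsec_s, wsec_s = 5526.00, 3.60   # sectors read/written per second


def rate_to_total_bytes(sectors_per_s, interval_s, sector_bytes=SECTOR_BYTES):
    """Convert a per-second sector rate into total bytes over one interval."""
    return sectors_per_s * sector_bytes * interval_s


def avg_request_sectors(rsec_s, wsec_s, r_s, w_s):
    """Average request size in sectors: sectors moved divided by requests issued."""
    return (rsec_s + wsec_s) / (r_s + w_s)


read_total = rate_to_total_bytes(rsec_s, INTERVAL_S)
dd_total = 5 * 10485760          # bs=10485760 count=5

print(f"iostat, whole interval: {read_total / 2**20:.2f} MiB")
print(f"dd request:             {dd_total / 2**20:.2f} MiB")
print(f"avg request size:       {avg_request_sectors(rsec_s, wsec_s, r_s, w_s):.2f} sectors")
```

Under those assumptions, 5526 sectors/s over 20 seconds is 56,586,240 bytes, the same order as the 52,428,800 bytes dd asked for. The average request size computed this way also reproduces the avgrq-sz column (55.05), which suggests r/s counts requests sent to the device rather than read() calls made by the program.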

I'm as interested as you are, so I'm going to contact the sysstat developers for answers.

And thanks to you I have a new, big question for SUSE. Maybe the sysstat version shipped with SUSE 9 (and SLES9) is too old and doesn't support kernel 2.6 (the default on those distributions).

New information soon (or I hope so).

Fabrizio

utkanbir wrote:

> Hi Fabrizio,
>
> Thank you very much for your help.
>
> I have made a small test:
>
> [oracle_at_tanidw1 tolga]$ time strace -T dd if=/oracle/ett/data/current/tanduyurusure.txt of=/dev/null bs=10485760 count=5
>
> Here I try to read five 10 MB blocks from a file stored on an ext3
> file system. The file system is part of a RAID 10 array on EMC
> storage. We use a 960 KB stripe size and have one I/O channel.
>
> Before running this command, I execute iostat -x /dev/sdm1 20 on
> another terminal.
>
> Here is the strace output :
>
> read(0, "DUYURU_KOD|UYE_ISYERI_KOD|SUBE_K"..., 10485760) = 10485760 <0.594360>
> write(1, "DUYURU_KOD|UYE_ISYERI_KOD|SUBE_K"..., 10485760) = 10485760 <0.000008>
> read(0, "42\n3133|2|33|485|20040914000000|"..., 10485760) = 10485760 <0.409270>
> write(1, "42\n3133|2|33|485|20040914000000|"..., 10485760) = 10485760 <0.000012>
> read(0, "1508|20040914000000|094029|8|1|1"..., 10485760) = 10485760 <0.462430>
> write(1, "1508|20040914000000|094029|8|1|1"..., 10485760) = 10485760 <0.000012>
> read(0, "124457\n896|497|1|3352|2004091400"..., 10485760) = 10485760 <0.437568>
> write(1, "124457\n896|497|1|3352|2004091400"..., 10485760) = 10485760 <0.000013>
> read(0, "1|2|99|1054|20040914000000|17451"..., 10485760) = 10485760 <0.317013>
> write(1, "1|2|99|1054|20040914000000|17451"..., 10485760) = 10485760 <0.000008>
> munmap(0x2000000000308000, 10534912) = 0 <0.000611>
>
> real 0m3.075s
> user 0m0.006s
> sys 0m0.088s
>
> And the iostat values:
>
> avg-cpu:  %user   %nice    %sys   %idle
>          253.93    0.00   12.25  757.83
>
> Device:  rrqm/s  wrqm/s     r/s    w/s   rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
> sdm1     590.70    0.05  100.05   0.40  5526.00    3.60     55.05      6.81  67.80   1.51  15.53
>
>
> It is clear that the dd command reads 50 MB. I want to see this in
> the iostat output:
>
> iostat says 5526.00 sectors are read, so this is 2763 KB
> (5526 * 512 bytes).
>
> I don't know which part I missed. (Maybe I forgot the simple math!)
>
> Also, iostat says the system makes 100 read calls per second. That
> is 55 sectors per read. These 100 read calls are not what I issued.
> Since the strace output shows read function calls, I assumed the OS
> counts those read calls as I/O, but I think that is not
> correct. What exactly does iostat count as a read call?
>
>
> I would appreciate it if you could help me with this issue.
>
> Kind Regards,
> tolga
>
> FM <fabrizio.magni_at_mycontinent.com> wrote in message news:<4146fd2d$1_at_x-privat.org>...
>

>>utkanbir wrote:
>>
>>
>>
>>>statfs call against the filesystems which oracle data resides returns
>>>512 as the optimal block size.
>>>
>>
>>On ocfs you are not buffering so, I believe, the value you read in 
>>f_bsize is the physical block (512 byte).
>>(You are accessing the DMA "layer" directly).
>>
>>
>>>When i create a directory in one of the ocfs file systems , its size
>>>is 32K.
>>>
>>>sar -d reports some values in blks/s . 
>>>
>>>I want to connect all these info but i cant. Here is my questions:
>>>
>>>1. How can i find the os io unit ? How much data does (not only in blocks ,
>>>but also in bytes) the os transfer in each read ?
>>>
>>
>>You are stuck to PAGE_SIZE for buffered i/o: 4k for x86 architectures 
>>(even x86-64), 16k (by default) on ia64.
>>
>>
>>>2. Since i use ocfs which use direct io (no os buffering) , can i say
>>>that all database ios are 16kb? (Since this is db block size) Or are
>>>all ios issued by db server 512 bytes?(Since statfs returns 512 as the
>>>optimal block size)Or are all ios issued by db server 32kb. (this is
>>>the ocfs block size)
>>>
>>
>>Since it is not buffered you are not using the virtual memory machine 
>>(or so I read) so every oracle i/o is made in a single operation.
>>(Even the read-ahead is not triggered).
>>
>>
>>>3. I want to monitor the RAID subsystem by comparing the disk I/O done
>>>by the db server and the OS. I use the v$filestat Oracle table to get the number of
>>>disk reads in blocks, then sar -d or iostat, but how can I understand
>>>what an OS block exactly means in sar and iostat outputs? 16 KB, 512 bytes
>>>or 32 KB?
>>>
>>>
>>
>>sar and iostat show you reads and writes as sector units so 512 bytes 
>>(kernel from 2.2 on).
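
For comparing the two sides, a minimal sketch of the unit conversion in Python follows. It assumes 512-byte sectors on the OS side and the 16 KB db block size mentioned in the question. The function names and the sample count of 1000 blocks are hypothetical, chosen only for illustration.

```python
SECTOR_BYTES = 512            # unit reported by sar -d and iostat
DB_BLOCK_BYTES = 16 * 1024    # db block size stated in the question


def db_blocks_to_sectors(blocks, db_block_bytes=DB_BLOCK_BYTES,
                         sector_bytes=SECTOR_BYTES):
    """Express a count of Oracle blocks as the equivalent number of OS sectors."""
    return blocks * db_block_bytes // sector_bytes


def sectors_to_db_blocks(sectors, db_block_bytes=DB_BLOCK_BYTES,
                         sector_bytes=SECTOR_BYTES):
    """Express a count of OS sectors as (possibly fractional) Oracle blocks."""
    return sectors * sector_bytes / db_block_bytes


# Hypothetical sample: 1000 blocks read according to the database.
print(db_blocks_to_sectors(1000))   # sectors the OS tools should report for the same data
print(sectors_to_db_blocks(5526))   # one second of the quoted rsec/s, in db blocks
```

With these sizes one db block corresponds to 32 sectors, so the two counters should differ by roughly that factor when they are measuring the same traffic over the same period.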


-- 
Fabrizio Magni

fabrizio.magni_at_mycontinent.com

replace mycontinent with europe
Received on Thu Sep 16 2004 - 03:30:28 CDT
