Re: What block size are you using for your new 9i data

From: Danisment Gazi Unal <dunal_at_ubTools.com>
Date: Fri, 26 Apr 2002 13:48:47 -0800
Message-ID: <F001.004511C6.20020426134847@fatcity.com>


Hi Gaja,

Once again, I have not tested this, but I have some questions about your comments on "physically contiguous" and "Keep DB_BLOCK_SIZE = FS(or OS) Block Size".

"physically contiguous":

We know that disk sectors are read and then transferred to the bus. Transferring a sector that has just been read takes time, and this happens before the next sector is read. Because the disk keeps rotating during the transfer, some or all of the next physically contiguous sector may pass under the head and be missed. If logically consecutive sectors really were physically contiguous, the OS would have to wait for the next rotation of the disk to read the whole of that next sector. So the OS does not place logically contiguous sectors physically next to each other. Depending on the disk rotation speed and the transfer speed to the bus, they are spread around the disk. An optimized disk management system finds the next sector immediately after the current sector has been transferred to the bus. This may be done by leaving gaps between logically contiguous sectors, and these gaps may be used for other data. Of course, implementations may differ, but there will always be a transfer delay to the bus and therefore always a risk of missing the next sector(s).

I have not tested this, but if Oracle's sequential data were stored physically contiguously, it would be a real problem for the I/O subsystem. My guess is that it is only logically contiguous.
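The gap idea described above is usually called sector interleaving. The following is only a minimal sketch of how such a layout could be computed. The function name `interleave_layout` and its parameters are made up for illustration, and it does not claim to describe what any particular OS, controller or disk actually does.

```python
def interleave_layout(sectors_per_track: int, interleave: int) -> list[int]:
    """Return a list where track[physical_slot] = logical sector number.

    With interleave k, logical sector n+1 is placed k physical slots after
    logical sector n (wrapping around the track). If that slot is already
    taken, the next free slot is used instead.
    """
    track = [-1] * sectors_per_track  # -1 marks a free physical slot
    slot = 0
    for logical in range(sectors_per_track):
        # Slide forward until a free physical slot is found.
        while track[slot] != -1:
            slot = (slot + 1) % sectors_per_track
        track[slot] = logical
        slot = (slot + interleave) % sectors_per_track
    return track


# Interleave 1: logically and physically contiguous.
print(interleave_layout(8, 1))  # [0, 1, 2, 3, 4, 5, 6, 7]
# Interleave 2: one foreign sector sits between logical neighbours.
print(interleave_layout(8, 2))  # [0, 4, 1, 5, 2, 6, 3, 7]
```

With an interleave of 2, the head has one sector time to finish the bus transfer before the next logical sector arrives, so no extra rotation is needed under that assumption.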

"Keep DB_BLOCK_SIZE = FS(or OS) Block Size":

As far as I remember (???), Oracle specifies sizes in bytes in its I/O system calls. Now suppose we created a database with DB_BLOCK_SIZE = FS/OS block size. Is it guaranteed that each new Oracle block will be written to a new OS block? At the OS level every file is identified by a file handle, and there should also be a value, held in a register, that points to the last offset of the file. I mean that the next insert may be appended to the current OS block if it has free space, with new block(s) allocated for the remainder. Here is a sample:

Assuming 2K OS blocks: 1K is appended to the last OS block, a new OS block is allocated for the next 2K, and another new OS block is allocated for the remaining 1K. In this sample, 4K is scattered over 3 blocks, not 2.
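The arithmetic of this sample can be checked with a few lines of code. This is a minimal sketch; the helper name `os_blocks_touched` is hypothetical, and the 2K OS block size is the assumption under which 4K would normally fit into 2 blocks.

```python
def os_blocks_touched(offset: int, length: int, os_block_size: int) -> int:
    """Number of OS blocks covered by a write of `length` bytes at `offset`."""
    if length <= 0:
        return 0
    first = offset // os_block_size                 # block holding the first byte
    last = (offset + length - 1) // os_block_size   # block holding the last byte
    return last - first + 1


KB = 1024
# Unaligned: the file currently ends 1K into a 2K OS block.
print(os_blocks_touched(offset=1 * KB, length=4 * KB, os_block_size=2 * KB))  # 3
# Aligned: the write starts exactly on an OS block boundary.
print(os_blocks_touched(offset=2 * KB, length=4 * KB, os_block_size=2 * KB))  # 2
```

The extra block appears only when the starting offset is not a multiple of the OS block size. If every write started on such a boundary, the unaligned case could not arise.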

I think this will not be a problem for Oracle, because Oracle uses its own block format. The consistency check between the block header and the block tail should catch any corruption of data that is scattered across physically different blocks.
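To illustrate the kind of header/tail check meant here, the sketch below uses a toy block format. The layout (a 4-byte version stamp repeated at the start and end of a 4K block) and the names `make_block` and `is_consistent` are assumptions for illustration only, not Oracle's actual on-disk format.

```python
import struct

BLOCK_SIZE = 4096
STAMP = struct.Struct("<I")  # hypothetical 4-byte little-endian version stamp


def make_block(version: int, payload: bytes) -> bytes:
    """Build a block as: stamp + zero-padded payload + the same stamp."""
    body_len = BLOCK_SIZE - 2 * STAMP.size
    if len(payload) > body_len:
        raise ValueError("payload too large for one block")
    stamp = STAMP.pack(version)
    return stamp + payload.ljust(body_len, b"\x00") + stamp


def is_consistent(block: bytes) -> bool:
    """A block is consistent when its header and tail stamps match."""
    if len(block) != BLOCK_SIZE:
        return False
    return block[:STAMP.size] == block[-STAMP.size:]


old = make_block(1, b"old row data")
new = make_block(2, b"new row data")
# Simulate a partial write: only the first 2K of the new image reached disk.
torn = new[:2048] + old[2048:]

print(is_consistent(new))   # True
print(is_consistent(torn))  # False
```

Such a check detects a block whose pieces were not all written together. It does not by itself repair the block.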

I have not tested any of this, and I may be wrong. I look forward to hearing a confirmation.

regards....

Gaja Krishna Vaidyanatha wrote:

> Hi Bill & list,
>
> The main function of a "read ahead algorithm" is to
> anticipate the nature of I/O requests on a given track
> of a disk's platter and see whether it is beneficial
> to "pre-fetch" some of the blocks, so that subsequent
> requests can be serviced from either the
> controller's or the file system's cache, without
> having to "go to disk" multiple times.
>
> The OS (or the sub-system) should normally return only
> the same number of blocks as requested. But, if there
> are multiple "read requests" from Oracle that are
> physically contiguous on disk and they also occur in a
> rapid succession, the OS or the I/O sub-system (as the
> case may be), "second guesses" the requestor's intent
> and assumes that more of the other blocks in the same
> track will also be requested, in the near future.
>
> For a real "sequential scan", like in a full-table
> scan or an index fast-full scan, this is beneficial.
> But in the case of a range scan where only "a few
> contiguous blocks" are requested, pre-fetching 128K or
> 256K worth of data is a wasteful use of a system's I/O
> resources. This is because not all the blocks that are
> pre-fetched will be consumed.
>
> The issue of an 8K DB_BLOCK_SIZE with say a 512-byte
> File System (or OS) Block Size, is that there is a
> 1-is-to-16 ratio between logical and physical blocks.
> So, for example if 4 Oracle blocks are requested, they
> translate into 64 FS (or OS) blocks. If these blocks
> are contiguous (and chances are good that leaf blocks
> in an index can be contiguous ), it becomes an "ideal
> condition" for the read-ahead algorithm to engage. So
> instead of servicing 32K of data, the sub-system
> retrieves 128K or 256K worth of data.
>
> And, even if you have a 1-is-to-2 ratio between
> logical and physical blocks (DB_BLOCK_SIZE is 8K and
> FS Block Size is 4K), under the "right conditions",
> the read-ahead algorithm will engage and pre-fetch in
> a wasteful manner. So the bottom line is as follows:
>
> Keep DB_BLOCK_SIZE = FS(or OS) Block Size
>
> This way, if Oracle requests a few blocks in a
> track, the OS does not pre-fetch all of the blocks in
> the track. As mentioned before, in case of a "real
> sequential scan", the pre-fetch stands you in good stead.
>
> Hope that helps,
>
> Gaja
>
> --- Bill Buchan <wbuchan_at_uk.intasys.com> wrote:
> >
> > Sorry, I'm a bit non-clued up on this "read ahead algorithm". Could I be a
> > pain and ask for more details? Does the OS return one OS block if exactly
> > one is requested, but if 2 are requested it thinks "aha! sequential scan"
> > and goes and gets 4 or 8 or something?
> >
> > The follow on is, does this mean you should use a (minimal) 2k block size
> > on UFS, 512 bytes blocks, or is this read-ahead overhead a smaller
> > performance hit than that of using a database block size which is too small
> > for the application?
> >
> > Thanks
> > - Bill.
> >
> >
> > At 08:48 26/04/02 -0800, you wrote:
> > >All,
> > >
> > >You always want to ensure that your DB_BLOCK_SIZE =
> > >File System Block Size. This is to avoid wasted I/O
> > >and also the case where the "read ahead algorithm" is
> > >triggered accidentally, when 1 Database Block results
> > >in multiple file system blocks being read from disk.
> > >
> > >If your application performs range scans, there is a
> > >high possibility that multiple "single database block"
> > >read requests to a set of contiguous blocks, may
> > >result in the "read ahead algorithm" performing 128K
> > >or 256K pre-fetches, even though your application may
> > >have not required all 128K or 256K.
> > >
> > >This problem is rampant on ufs file systems where the
> > >default block size is 512 bytes, and with a 8K
> > >DB_BLOCK_SIZE, it takes 16 file system blocks to store
> > >1 DB block on disk. However, even if you have advanced
> > >file systems and have a 1-is-to-2 ratio of DB block
> > >is-to FS blocks, you are still in danger of
> > >overloading your I/O sub-system, "under the right
> > >conditions".
> >
> > --
> > Please see the official ORACLE-L FAQ:
> > http://www.orafaq.com
> > --
> > Author: Bill Buchan
> > INET: wbuchan_at_uk.intasys.com
> >
> > Fat City Network Services -- (858) 538-5051 FAX:
> > (858) 538-5051
> > San Diego, California -- Public Internet
> > access / Mailing Lists
> >
> --------------------------------------------------------------------
> > To REMOVE yourself from this mailing list, send an
> > E-Mail message
> > to: ListGuru_at_fatcity.com (note EXACT spelling of
> > 'ListGuru') and in
> > the message BODY, include a line containing: UNSUB
> > ORACLE-L
> > (or the name of mailing list you want to be removed
> > from). You may
> > also send the HELP command for other information
> > (like subscribing).
>
> =====
> Gaja Krishna Vaidyanatha
> Director, Storage Management Products,
> Quest Software, Inc.
> Co-author - Oracle Performance Tuning 101
> http://www.osborne.com/database_erp/0072131454/0072131454.shtml
>
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.com
> --
> Author: Gaja Krishna Vaidyanatha
> INET: oraperfman_at_yahoo.com

--
Danisment Gazi Unal
http://www.ubTools.com



Received on Fri Apr 26 2002 - 16:48:47 CDT
