Re: physical addresses of data blocks

From: Noons <nsouto_at_optusnet.com.au.nospam>
Date: 25 Feb 2003 12:24:37 GMT
Message-ID: <Xns932DEB9BC3EA8Tokenthis@210.49.20.254>

Following up on Rachel Wilson, 25 Feb 2003:

> contigeous - aren't - because of how they are allocated by the OS.
> (you cruel cruel people for dropping this bombshell :) )

sorry... ;)

>
> It made me wonder if there is a store of the actual pysical address of
> each block on the disk controler/disk/partition etc held by Oracle.
> If so, where? If the lookup is done using the rowid and the OS
> resolves the datafile number etc and knows where to look for it, is
> there an index of all the rowid's and which segment they relate to?

OK, I'll give it a go. I'm sure others will chip-in as well. I'll keep it as simple as possible.

A few mixed up concepts right there, that's why it appears confusing. The "relating" you talk about above is not necessarily needed. It really can be quite simple.

First of all:

Oracle stores rows in datafiles. These are nothing more than adressable areas of disk(s). As far as the rowid is concerned, it tells Oracle which file to read from and which block (yes, the Oracle block defined in db_block_size) in that datafile to read the row from.

That's it as far as Oracle is concerned. Nothing more, nothing less. (well from 8i onwards a little bit more, but we won't go there now)

Second:

A datafile used by Oracle may be of two types. A "raw" datafile or a "cooked" datafile. These terms have their origins in Unix and mean respectively a datafile that the OS doesn't manage anything at all about and a datafile whose disk space is managed by the OS as part of a thing called a "file system".

Typically, Oracle will request an I/O operation of the OS, consisting of a read or write order for a given number of blocks from a datafile.

Third:

If the datafile is a raw one, the OS will give up straight away and pass the request to the logical volume manager or the device driver and let it worry where on the disk to go to to write or read the data. That "where on the disk" is called a partition and the LVM and the device driver know how to find it in the middle of all those platters.

Bottom line: very little overhead from the OS. But a little more work needed from the DBA and the sysadmin to replace the work done on their behalf by the OS.
This work used to involve all sorts of weird magic techo stuff, like working out disk partition maps and stuff like that. With a Logical Volume Manager - LVM for short - (a layer of software between the OS and the device driver) this is tremendously simplified: literally, only the most "button-challenged" sysadmin/dba will find it hard to work out how to keep things organized. Of course, if they are using Windows where a LVM doesn't exist, things WILL go "clunk". But that's expected.

Fourth:

If the datafile is "cooked", it means the OS (through a layer of software called the "file system") will be able to figure out exactly where on the disk(s) to go to. In order to do this, the OS keeps in the same disk (or in a central directory) tables that describe where cooked datafiles are and how long they are and their makeup and how many there are in a file system and where, which are in use, yadda yadda. But it also does a lot more than this. It actually does cache some of the blocks of the datafile. And those file system tables.

And it will do things like try to optimize the block size based on the usage pattern and the type of datafiles involved. It will also organize many datafiles into what are called directories and keep many of the former in few of the latter, in what is called a "file system". The file system in turn will be kept in one or more partitions of the disk. Yes, you guessed it: a partition is one of those disk areas used by Oracle directly when using raw datafiles.

IOW, the file system is doing very much the same thing Oracle already does in terms of organizing data into clumps and clusters (tables in tablespaces in Oracle) that can be found and read/written a lot faster.

It also does things that Oracle doesn't necessarily need. Like, it groups together blocks from many datafiles into the same area(s) of disk for convenience of access. This is highly desirable if you are dealing with Unix datafiles, but is lethal for proper database performance. It can cause an apparently linear scan to suddenly go all over the disk looking for data.

That's why you hear people saying that file systems add hidden overhead to Oracle. They do, they are doing a lot of the work that Oracle does. And a lot more that Oracle doesn't need or want.

Nothing wrong with that, if you are not using a database. If you are and you want to get top performance in your I/O, you have to take some of that responsibility away from the OS and assume it yourself.

That's where and why you get into the raw datafile and LVM areas.

And that's why you can get unexpected I/O patterns on a seemingly flat landscape.

HTH.

-- 
Cheers
Nuno Souto
nsouto_at_optusnet.com.au.nospam

Received on Tue Feb 25 2003 - 06:24:37 CST