Re: One Big Extent vs Multiple Extents

From: Bruce Pihlamae <pihlab_at_cbr.hhcs.gov.au>
Date: 16 Nov 94 09:59:23 +1000
Message-ID: <1994Nov16.095923.1_at_cbr.hhcs.gov.au>


In article <brosalesCz9unG.Er8_at_netcom.com>, brosales_at_netcom.com (Bob Rosales) writes:
> We have always tried to compute the first extent of
> our tables to fit all data, if possible.
>
> But now with striping, is there any advantage in doing
> so at all? I have heard that multiple
> extents are actually better? Can someone suggest any
> papers, or share your thoughts on these issues?

I can feel a FLAME fest coming on ... what the heck ...

On extents:


We also manage our databases to keep objects (all objects) to ONE extent.

Every 6 months we review each database, determine growth rates and so on, and do a complete resize of every object that looks like it will extend itself in the next 6 months.
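
For what it's worth, a crude way to capture those growth figures is to snapshot DBA_SEGMENTS at each review and compare the snapshots. Something along these lines (the seg_growth table name is my own invention; the dictionary view and its columns are standard):

    -- One-off: create the snapshot table.
    CREATE TABLE seg_growth AS
    SELECT SYSDATE snapped, owner, segment_name, segment_type, bytes, extents
    FROM   dba_segments;

    -- At each review: take another snapshot, then compare rows six
    -- months apart to estimate each object's growth rate.
    INSERT INTO seg_growth
    SELECT SYSDATE, owner, segment_name, segment_type, bytes, extents
    FROM   dba_segments;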

I now believe that this work we have generated for ourselves is largely a waste of time: NOT because keeping objects as 'tight' as possible doesn't help overall database performance, BUT simply because it doesn't help enough to make the extra work cost-effective.

Set a limit for your site/database: if object ABC reaches XYZ extents, resize it to 1 extent and let it start growing again. If it also seems to be extending too often, increase the NEXT setting when you next resize it.
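
Checking for offenders is a one-liner against the dictionary; something like this (the threshold of 10 is pulled out of the air):

    SELECT owner, segment_name, segment_type, extents
    FROM   dba_segments
    WHERE  extents > 10        -- whatever limit suits your site
    ORDER  BY extents DESC;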

If you have the option of building a regular 'culling' mechanism into your applications, then your objects tend to stabilise in size, and your INITIAL extent should be sized to hold that volume of data.
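
By 'culling' I mean nothing fancier than a periodic delete of aged rows. For a hypothetical audit table (both names invented) it might be no more than:

    DELETE FROM audit_trail
    WHERE  entry_date < ADD_MONTHS(SYSDATE, -6);   -- drop anything over 6 months old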

I am now working towards setting a 'reasonable' INITIAL extent and setting the NEXT extent so that the object doesn't need to extend more than once a month.
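
In storage-clause terms that policy looks something like the following; the sizes are invented, so plug in your own growth figures:

    CREATE TABLE abc
      ( id     NUMBER,
        filler VARCHAR2(100) )
      STORAGE ( INITIAL     50M     -- current data plus some headroom
                NEXT        5M      -- roughly one month's growth
                PCTINCREASE 0 );    -- keep every NEXT extent the same size

For an existing object, ALTER TABLE abc STORAGE (NEXT 10M) changes only the extents allocated from then on; already-allocated extents are untouched.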

On the performance side:


Under version 6 we had the Data Dictionary Cache as a separate SGA area, and we did balancing acts to ensure minimal misses for this cache. Under version 7 this cache lives in the shared pool, is dynamically extendable, and is not directly 'tweakable', so Oracle does its own dynamic tuning.
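
You can still watch how the version 7 cache is doing via V$ROWCACHE; a miss-ratio query along these lines does the job:

    SELECT parameter, gets, getmisses,
           ROUND(getmisses / gets * 100, 2) pct_miss
    FROM   v$rowcache
    WHERE  gets > 0
    ORDER  BY pct_miss DESC;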

By having an object in ONE extent you have one entry in this cache indicating which blocks the object resides in. With MULTIPLE extents you have multiple entries. The expense is a little extra SGA space and a little extra CPU to step through those entries to find the right block set.
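
You can see those entries for yourself by listing an object's extent map from the dictionary (owner and table name here are examples only):

    SELECT extent_id, file_id, block_id, blocks
    FROM   dba_extents
    WHERE  owner = 'SCOTT'
    AND    segment_name = 'ABC'
    ORDER  BY extent_id;

One row comes back per extent; that is roughly what has to be stepped through.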

This performance overhead is really only a problem on FULL TABLE SCANS, and even there I doubt you would be able to measure the impact.

Normal access via an index is effectively random, and which extent the row actually lives in is irrelevant to the performance of your 2 I/Os (index block plus data block).

On the disk I/O side:


If you are the only person using the database and you access an object via a full table scan, then you MAY notice the slight delay as the disk heads reposition to the next extent's location.

If you are just one of 20 requests to that disk, then between your first read and your second the device will be jittering its heads all over the place serving other users' requests. It won't matter that your second read was right next to your first, because you're now 500 tracks away from there by the time the disk gets back to serving your request.

Some of the blocks you want could already be in the SGA, so even WITHIN a single extent you may incur head movement to jump over blocks you don't need to read.

There is no guarantee where your tablespaces are stored on a disk, so two users hitting two different tablespaces on the same disk could be causing massive head movements anyway.

On a high-activity disk you encounter different performance characteristics (and scheduling algorithms) depending upon the load. There are a number of head-movement or I/O-scheduling algorithms used to satisfy a set of I/O requests to a disk, and the choice depends upon how busy the device is. If it is not very busy, it will service each request in the order in which it arrives. Once the disk gets busy it can go into 'elevator' mode: simply sweep the heads back and forth across the disk surface and satisfy requests as it passes over the relevant tracks.

There are many other methods in use, and I am not involved in this sort of work, so my explanations may be an extreme oversimplification (or totally wrong); someone else can correct me appropriately.
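
To make the 'elevator' idea concrete, here is a toy sketch in SQL: PENDING_IO (request_id, track_no) is a made-up table of outstanding requests, and :head_pos is the current head position. On the upward sweep the disk serves everything at or above the head in ascending track order, then sweeps back down through the rest:

    SELECT request_id, track_no
    FROM   pending_io
    ORDER  BY CASE WHEN track_no >= :head_pos THEN 0 ELSE 1 END,   -- up-sweep first
              CASE WHEN track_no >= :head_pos THEN track_no        -- ascending going up,
                   ELSE -track_no END;                             -- descending coming back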

On striping:


Very useful, and we do use it. We use the hardware version, where we create a logical disk out of 2+ physical disks and let the disk controller interleave the blocks across the devices for maximum throughput.

We haven't bothered with the example in the Oracle texts of creating a tablespace from many files and striping the data manually across them, mainly because our users tend to do most of their work in the most recently entered rows. The manual method simply means that ONE file of the tablespace (and the disk it resides on) gets hit the most; with the hardware method the load is automatically spread over all devices involved.

We are using VAX/VMS's disk striping.

How do multiple extents affect striping? They don't. The manual method requires you to have multiple extents; that's how you implement it. The hardware method is transparent; it's just a bigger disk to us.
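
For completeness, the manual method from the Oracle texts goes roughly like this (paths and sizes are invented): build the tablespace from one file per disk, then size the object's extents so each extent is forced into a different file.

    CREATE TABLESPACE stripe_ts
      DATAFILE '/disk1/oracle/stripe01.dbf' SIZE 101M,
               '/disk2/oracle/stripe02.dbf' SIZE 101M,
               '/disk3/oracle/stripe03.dbf' SIZE 101M;

    CREATE TABLE big_table
      ( id     NUMBER,
        filler VARCHAR2(100) )
      TABLESPACE stripe_ts
      STORAGE ( INITIAL 100M  NEXT 100M     -- only one 100M extent fits per file
                MINEXTENTS 3  PCTINCREASE 0 );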

On disk caching:


Install disk caching on your software and application-program disks.

If you have caching on your disk controller card, then enable it to reduce I/O as much as possible. Always be aware, though, that load moved off your disks will more than likely start to cause processes to queue for the CPU instead.

If you have caching software that uses main memory, then use it only on your software and application-program disks. For database disks, increase your database buffers and your SGA instead; the SGA's database buffer cache is a much better use of main memory for caching database blocks.
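
The usual sanity check on whether those buffers are earning their keep is the buffer cache hit ratio from V$SYSSTAT; the statistic names below are standard:

    SELECT ROUND( (1 - phy.value / (db.value + con.value)) * 100, 2 ) hit_pct
    FROM   v$sysstat db, v$sysstat con, v$sysstat phy
    WHERE  db.name  = 'db block gets'      -- run on a warmed-up instance
    AND    con.name = 'consistent gets'
    AND    phy.name = 'physical reads';

If the ratio stays low after you add buffers (db_block_buffers in init.ora), the memory may be better spent elsewhere.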

On disk defragmenters:


I know you didn't mention them but I noticed something interesting that I thought I would mention anyway.

At the last convention I attended, they were talking about disk defragmenters that not only defragment your disk by moving files around to give maximum contiguous free space, BUT also relocate your 'hot' files on the disk so that you incur minimal head movement moving between them.

This sort of 'hot file' defragmenter is not very useful to Oracle databases built from a few large tablespaces with lots of data in each. It does benefit databases with MANY small tablespaces, and the best possible benefit is gained by having each tablespace made up of MANY files.

When you have a large tablespace, 80% of it is rarely touched and 20% is 'hot'. If you build that tablespace as 10 files, then perhaps 7 see low activity, 2 high, and 1 very high activity; such a defragmenter would then be able to relocate the 'hot' files of your tablespace for optimal throughput. There is of course a maintenance overhead in having multiple-file tablespaces, but you again need to weigh the cost against the benefit.
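
Mechanically, that just means building the tablespace from many smaller files rather than one big one; the names and sizes below are invented:

    CREATE TABLESPACE user_data
      DATAFILE '/disk1/oracle/user_data01.dbf' SIZE 50M,
               '/disk1/oracle/user_data02.dbf' SIZE 50M,
               '/disk1/oracle/user_data03.dbf' SIZE 50M;

    -- More files can be added later as the data grows or the 'hot' 20% moves:
    ALTER TABLESPACE user_data
      ADD DATAFILE '/disk1/oracle/user_data04.dbf' SIZE 50M;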

I think I've droned on enough.

I would like to hear from others on these topics ....  

-- 

Bruce...        pihlab_at_cbr.hhcs.gov.au

*******************************************************************
* Bruce Pihlamae  --  Database Administration                     *
* Commonwealth Department of Human Services and Health            *
* Canberra, ACT, Australia                        (W) 06-289-7056 *
*=================================================================*
* These are my own thoughts and opinions, few that I have.        *
*******************************************************************

"The more complex the argument gets, the easier it is to refute."
"Killing is wrong!"  -- Trent 'The Uncatchable' Castanaveras