Re: Index for very small tables needed ?

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Wed, 2 Jan 2002 11:40:30 -0000
Message-ID: <1009980560.1155.0.nnrp-13.9e984b29@news.demon.co.uk>

I think I've got to disagree with you, Howard.

I think the 'small table'/index issue exists only as a rule-based problem where an index could be used at the wrong time simply because it existed.

Under cost based, you really ought to tell the optimizer the truth if at all possible. Which means that indexes (particularly unique ones, of course) should be created if they are logically relevant.

Bear in mind, too, that the calculation of cost shouldn't be restricted to numbers of blocks. (Typically 2 for a small table by index, as the root tends to be the leaf).

In your example of a 48K table - which could be just 3 blocks - it would be perfectly reasonable to have a reference table of 100 byte rows - hence ca. 500 rows in such a table.

A query that joins a data table to that reference table could end up doing a nested loop full table scan against that table - at a surprisingly high CPU cost (and I have systems from time to time where this has been a real issue).

Interestingly - when you start examining the behaviour of Oracle 9, which introduced CPU costing - you find that the cost, and access paths of simple queries such as:

select * from t1 where col1 = 99;

change dramatically because of the inclusion of CPU costs. Simple tests show that the "break point" from an indexed path to a scan path can occur at dramatically different volumes of data because of the way that the MULTIPLE cheap cost of finding and testing lots of rows compares with the rare, but expensive cost of a block access.

For testing purposes, the hints CPU_COSTING and NOCPU_COSTING can be used with v9
to demonstrate the point.

--
Jonathan Lewis
http://www.jlcomp.demon.co.uk

Now running 3-day intensive seminars
http://www.jlcomp.demon.co.uk/seminar.html

Host to The Co-Operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

Author of:
Practical Oracle 8i: Building Efficient Databases


Howard J. Rogers wrote in message
<3c2f7d62$0$2599$afc38c87_at_news.optusnet.com.au>...


>An index on a small table is likely to be a complete waste of time.  That's

>because the optimizer is likely to conclude, whenever access to a piece of

>that table's data is requested, that it would be just as quick to scan the

>whole table in and get the record from there.

>

>The reason *why* it might so conclude is that full table scans are

performed


>not one block at a time, but by grabbing multiple blocks of the disk in one

>pass.  Exactly how many blocks can be read like that depends on your

>hardware and O/S, and what your db_file_multiblock_read_count init.ora

>parameter is set to.  Usually, it's something like the equivalent of 64K or

>128K (but again, what it's set to ought to depend very much on what your

>hardware can actually perform).

>

>In contrast, if we were to use an index, we would be reading single blocks.

>And there would be at least 3 such single blocks to read -the root node of

>the index, the index leaf node itself, and the table block containing the

>row requested (assuming the smallest and simplest index structure

possible).


>Those three single blocks require 3 separate I/O operations.

>

>Assume your "very small table" is 48K in size, and your

>db_file_multiblock_read_count is set to 64K -that means the table can be

>read in its entirety in one multi-block I/O operation (a full scan).  Which

>do you think the optimizer would prefer to do: 1 I/O or 3?

>

>It's not true, either, to say that data read via a full tablescan is not

>read into the buffer cache.  All data is read into that cache.  The

question


>is more one of 'how quickly does it get flushed out again'?  Blocks read

via


>a full tablescan are placed around half-way down the LRU list, so that the

>moment they are read, they are well on their way to being candidates for

>being flushed out of the cache again, assuming we need the space.

>

>Your small table blocks, being read via a full tablescan, would likewise be

>candidates for an early flush.  But think about it: if any of those blocks

>are re-touched quite soon after initially being loaded into the cache, they

>will rapidly migrate up to the MRU end of the LRU list, and thus be

retained


>in the cache.  If they are not re-touched by a subsequent query or DML,

then


>fair enough -they stay toward the LRU end of the list, and are vulnerable

to


>flushing.  But their only at risk of being flushed because no-one is

>touching them -in other words, no-one else needs that data.  So why not

>flush them if they are of no use to anyone?

>

>If you are that concerned that someone else doing a large scan might flush

>your small scan blocks before you get a chance to re-use them, then you can

>consider the CACHE clause if you've got Oracle 7, or the rather more useful

>KEEP buffer if you've got Oracle 8.0 and above.  There's a couple of

>articles on both techniques on my web site.

>

>But the short story is: full tablescans do not bypass the buffer cache at

>all, so there's no intrinsic reason why your small table should need to be

>re-read with additional I/O.  However, the presence of an index on your

>'very small' table is likely to be a matter of supreme irrelevance to the

>optimizer, but will always slow down any DML that you perform to the table

>(because the index has to be maintained as the base table data changes).

>Short answer, therefore: don't index small tables.

>

>Regards

>HJR

>--

>----------------------------------------------

>Resources for Oracle: http://www.hjrdba.com

>===============================

>

>

>"Andreas Wizemann" <AWizemann_at_fvvag.de> wrote in message

>news:haml2ukh1eere5p9mcrp3k8jule26k6hn2_at_4ax.com...

>> Hi,

>> do i need to define an index on small tables ?

>> I'm confused after reading different articles about this.

>> One tells me i should define an index the other says there is no need,

>> as a full table scan is fast enough.

>> Also other articles say, oracle does not read rows in the buffer cache

>> while doing a Full table scan. So i think, next time this table will

>> be accessed there is another need of doing io.

>> So now, whats best ?

>> No Index with Full table scan and io's

>> or

>> Having an Index, so rows may stay in buffer ?

>>

>> Any Ideas ?

>> Thanks a lot.

>>

>> Andreas Wizemann

>

>

Received on Wed Jan 02 2002 - 05:40:30 CST