Re: wildcard search and full tablescans

From: hpuxrac <johnbhurley_at_sbcglobal.net>
Date: 12 Dec 2006 09:47:13 -0800
Message-ID: <1165945633.626633.244360@j72g2000cwa.googlegroups.com>

thomas.kyte_at_oracle.com wrote:
> DA Morgan wrote:
> > Charles Hooper wrote:
> >
> > > Paraphrasing from "Expert Oracle Database Architecture" by Tom Kyte:
> > > B*Tree indexes should be used only when retrieving a small portion of
> > > the rows in a table - less than 20% of the rows.
> >
> > Oh I know Tom wrote that and I suspect he regrets having done so. Not
> > because it isn't often correct ... but because often it is not correct.
> >
>
> Not at all - as was said "paraphrase" which could also be equated with
> "out of context"
>
> This was in a section where I was talking about indexes in general -
> and I actually used "somewhere between 1 to 20%" - but caveatted it
> heavily. You might use an index to retrieve EVERY row from a table
> (first rows optimization for example), but in general, you are using
> indexes to retrieve a relatively small percentage of the tables row.
>
> I never said this was a hard and fast number - never said "never use it
> for more than 20%". I don't really "regret" writing it as I did
>
> ....
> Here, only the index was used to answer the query-it would not matter
> now what percentage of rows we were accessing, as we would use the
> index only. We can see from the plan that the underlying table was
> never accessed; we simply scanned the index structure itself.
>
> It is important to understand the difference between the two concepts.
> When we have to do a TABLE ACCESS BY INDEX ROWID, we must ensure we are
> accessing only a small percentage of the total blocks in the table,
> which typically equates to a small percentage of the rows, or that we
> need the first rows to be retrieved as fast as possible (the end user
> is waiting for them impatiently). If we access too high a percentage of
> the rows (larger than somewhere between 1 and 20 percent of the rows),
> then it will generally take longer to access them via a B*Tree than by
> just full scanning the table.
>
> With the second type of query, where the answer is found entirely in
> the index, we have a different story. We read an index block and pick
> up many "rows" to process, then we go on to the next index block,
> and so on-we never go to the table. There is also a fast full scan we
> can perform on indexes to make this even faster in certain cases. A
> fast full scan is when the database reads the index blocks in no
> particular order; it just starts reading them. It is no longer using
> the index as an index, but even more like a table at that point. Rows
> do not come out ordered by index entries from a fast full scan.
>
> In general, a B*Tree index would be placed on columns that we use
> frequently in the predicate of a query, and we would expect some small
> fraction of the data from the table to be returned or the end user
> demands immediate feedback. On a thin table (i.e., a table with few or
> small columns), this fraction may be very small. A query that uses this
> index should expect to retrieve 2 to 3 percent or less of the rows to
> be accessed in the table. On a fat table (i.e., a table with many
> columns or very wide columns), this fraction might go all the way up to
> 20 to 25 percent of the table. This advice doesn't always seem to
> make sense to everyone immediately; it is not intuitive, but it is
> accurate. An index is stored sorted by index key. The index will be
> accessed in sorted order by key. The blocks that are pointed to are
> stored randomly in a heap. Therefore, as we read through an index to
> access the table, we will perform lots of scattered, random I/O. By
> "scattered," I mean that the index will tell us to read block 1,
> block 1,000, block 205, block 321, block 1, block 1,032, block 1, and
> so on-it won't ask us to read block 1, then block 2, and then block
> 3 in a consecutive manner. We will tend to read and reread blocks in a
> very haphazard fashion. This single block I/O can be very slow.
> ...............................
>
>
> > I can, for example, fiddle with optimizer_index_cost_adj and
> > optimizer_index_caching and make that number come out just about
> > anywhere I want.
> >
> > Remember Connor's demo of how he could dial in just about any hit ratio
> > he wanted? Somewhere I have one that does the same thing with B*Trees.
> > I'll see if I can find it.
> >
> > Regards
> > --
> > Daniel A. Morgan
> > University of Washington
> > damorgan_at_x.washington.edu
> > (replace x with u to respond)
> > Puget Sound Oracle Users Group
> > www.psoug.org

Here's the question as posed by the OP ...

Oracle10g 10.2.0.1

select * from <table_name> where <column_name> like '%abc%';

does not use the index and does a full tablescan (guess this is the expected behaviour as per explain plan).

Any suggestions as to how to use the index (index hint didn't help), IF

possible or other alternatives.

thanks Received on Tue Dec 12 2006 - 11:47:13 CST