Re: separate data/inidex

From: Daniel Morgan <damorgan_at_exesolutions.com>
Date: Mon, 22 Apr 2002 23:03:08 GMT
Message-ID: <3CC4969D.73F557F0@exesolutions.com>

I should have commented on that fact ... that you were requesting more than 30% of the data in the table. That has a huge potential impact.

Obviously in a table with 100,000,000+ rows only someone crazier than I would try a stunt like that. And if they do I hope they use the +ALL_ROWS hint.

In my experience queries like that are generally scheduled for 2:00am on a Sunday. The more general query that returns all rows, or a large percentage of rows, is generally one from a look-up table.

So at the risk of rubbing salt into a wound ... it appears that everyone counseling not separating tables and indexes is avoiding providing a replacement rule of thumb. Why no takers? ;-) What is the verbiage you think would be valid at technet and tahiti as advice for developers and DBAs?

Daniel Morgan

Connor McDonald wrote:

> Daniel Morgan wrote:
> >
> > With 10,000 rows I'd not argue with you. In fact below 40,000 rows I'd not argue
> > with you. At 10,000 rows the only index I would create would be one created by a
> > required constraint. But try that same experiment with the 1.4 million parts in a
> > 747, understanding that every part may be in there multiple times (for example at
> > one time there was a choice 21 different pilot clipboards). Then add the sub-models
> > (different seating capacity, different galleys, etc.) then add 717, 737, 757, 767,
> > and 777 models.
> >
> > You are correct with the example you gave. But I can't believe that with
> > 100,000,000+ rows you would ever find this to be the case. Am I wrong?
> >
> > Daniel Morgan
> >
> > Connor McDonald wrote:
> >
> > > Daniel Morgan wrote:
> > > >
> > > > I agree. I thought he was saying that most of the time we do full table scans,
> > > > etc. And given that many of my application tables have been in the
> > > > multi-gigabyte range that would be a complete non-starter.
> > > >
> > > > When one is looking for a single part of a Boeing 747 ... one does not want to
> > > > scan every part for every plane ever made.
> > > >
> > > > Daniel Morgan
> > > >
> > > > Jonathan Lewis wrote:
> > > >
> > > > > Comments in-line
> > > > >
> > > > > --
> > > > > Jonathan Lewis
> > > > > http://www.jlcomp.demon.co.uk
> > > > >
> > > > > Author of:
> > > > > Practical Oracle 8i: Building Efficient Databases
> > > > >
> > > > > Next Seminar - Australia - July/August
> > > > > http://www.jlcomp.demon.co.uk/seminar.html
> > > > >
> > > > > Host to The Co-Operative Oracle Users' FAQ
> > > > > http://www.jlcomp.demon.co.uk/faq/ind_faq.html
> > > > >
> > > > > Daniel A. Morgan wrote in message <3CC2CB08.64C747AF_at_exesolutions.com>...
> > > > > >Please help me here with two things.
> > > > > >
> > > > > >1. "Nowadays to satisfy queries, you're full scanning tables, full scanning
> > > > > indexes,
> > > > > >not using indexes at all,...."
> > > > > >
> > > > >
> > > > > It is very easy to misunderstand the intent of a comment
> > > > > on the newsgroup, but I think if you re-read Connor's post
> > > > > carefully you will appreciate that he is not saying "you must
> > > > > full scan tables"; he is simply listing a number of the mechanism
> > > > > that are commonly (and necessarily) activated in a modern system.
> > > > >
> > > > > >Since when? I work very hard to not do these things.
> > > > > >
> > > > >
> > > > > Don't work too hard at it. Sometimes the biggest
> > > > > performance problems are introduced by trying
> > > > > to push Oracle too hard into paths which use
> > > > > index range scans and index unique scans to
> > > > > access tables by rowid; resulting in excessive
> > > > > CPU usage and CBC latch contention.
> > >
> > > I'll give an example of what I was trying to convey in terms of "full
> > > scans good sometimes, full scans bad sometimes" using an example
> > > somewhat poached from JL's excellent seminar
> > >
> > > create table BOEING1 ( x number, y char(100));
> > > create table BOEING2 ( x number, y char(100));
> > > create index BOEING1 on BOEING1 ( x);
> > > create index BOEING2IX on BOEING2IX ( x);
> > >
> > > 10000 rows inserted into each table, 200 rows for each value of x from 0
> > > to 49. So after running an analyze...
> > >
> > > analyze table BOEING1 compute statistics;
> > > analyze table BOEING2 compute statistics;
> > >
> > > we can see that they are virtually identical..
> > >
> > > select table_name, num_rows, blocks, empty_blocks
> > > from user_tables;
> > >
> > > TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS
> > > ------------------------------ ---------- ---------- ------------
> > > BOEING1 9999 153 6
> > > BOEING2 9999 153 6
> > >
> > > select table_name, column_name, NUM_DISTINCT,DENSITY
> > > from user_tab_columns
> > > where TABLE_NAME like 'TE%';
> > >
> > > TABLE_NAME COLUMN_NAME
> > > NUM_DISTINCT DENSITY
> > > ------------------------------ ------------------------------
> > > ------------ ----------
> > > BOEING1 X
> > > 50 .02
> > > BOEING2 X
> > > 50 .02
> > >
> > > select max(x), min(x), avg(x) from BOEING1;
> > >
> > > MAX(X) MIN(X) AVG(X)
> > > ---------- ---------- ----------
> > > 49 0 24.5024502
> > >
> > > select max(x), min(x), avg(x) from BOEING2;
> > >
> > > MAX(X) MIN(X) AVG(X)
> > > ---------- ---------- ----------
> > > 49 0 24.5024502
> > >
> > > But they exhibit spectacularly different performance characteristics...
> > >
> > > select /*+ FULL(t) */ count(y) from BOEING2 t
> > > where x < 20;
> > >
> > > took 156 consistent gets
> > >
> > > select /*+ INDEX(t) */ count(y) from BOEING2 t
> > > where x < 20;
> > >
> > > took 70 consistent gets,
> > >
> > > thus you get the "conventional" result of "we should be using an index"
> > > when examing this particular BOEING. But the same queries against
> > > BOEING1 give
> > >
> > > select /*+ FULL(t) */ count(y) from BOEING1 t
> > > where x < 20;
> > >
> > > took 156 consistent gets
> > >
> > > select /*+ INDEX(t) */ count(y) from BOEING1 t
> > > where x < 20;
> > >
> > > took 3120 consistent gets (20 times worse!)
> > >
> > > So - sometimes checking every Boeing ever made is the best option :-)
> > >
> > > Cheers
> > > Connor
> > >
> > > --
> > > ==============================
> > > Connor McDonald
> > >
> > > http://www.oracledba.co.uk
> > >
> > > "Some days you're the pigeon, some days you're the statue..."

> The example I displayed is somewhat table size independent - that is,
> the queries are requesting approx 40% of the table rows. The example
> shows that in some cases, the index is easily the best option, in other
> cases, the table scan is easily the best.

> If you had 100,000,000 rows it wouldn't matter - if the question
> requires probing 40% of the rows, then the index vs scan decision still
> needs to be evaulated

> Cheers
> Connor
> --
> ==============================
> Connor McDonald

> http://www.oracledba.co.uk

>
> "Some days you're the pigeon, some days you're the statue..."
Received on Mon Apr 22 2002 - 18:03:08 CDT