Re: separate data/inidex

From: Daniel A. Morgan <damorgan_at_exesolutions.com>
Date: Mon, 22 Apr 2002 22:29:05 +0100
Message-ID: <3CC480A0.F03F5ACA@exesolutions.com>

Great ... we are 70% of the way there. Lets see if we can but the top on the box and tie a ribbon around it.

I am looking at at Kevin Loney and Marlene Theriault's book Oracle 8i DBA Handbook, page 97, section title I/O Contention Among Datafiles. And the next page where Table 4-2 contains estimated I/O weights. This is one of the books I use to teach my classes at the U.

Based on this material my conclusion, and the one I teach my students, is that putting indexes into the same tablespace with data is a bad thing. In fact it seems to me that a reasonable conclusion would be that putting anything else with data is a bad thing. If Loney and Theriault are incorrect ... and we all agree that the goal is I/O distribution. Could you please provide an corrected table that would serve as a guideline as to where the I/O is? I don't mean a table that will be used to put your feet to the fire ... but something that, like Loney and Theriault's, can be used as a guideline.

Thanks. I hate the thought that I am teaching the next generation of Oracle developers and DBAs things that are not correct.

Daniel Morgan

Howard Rogers wrote:

> I took, several posts ago actually. Connor took too.
>
> The replacement advice is: even out the I/O distribution across all devices.
> If you want it more wordy, the Oracle 9i doco you quoted earlier seems to
> have got it about right: avoid placing contending datafiles on the same
> device. That covers everything, really, on the assumption that users don't
> leap to the erroneous conclusion that indexes and their tables inevitably
> contend.
>
> Regards
> HJR
>
> "Daniel Morgan" <damorgan_at_exesolutions.com> wrote in message
> news:3CC4969D.73F557F0_at_exesolutions.com...
> > I should have commented on that fact ... that you were requesting more
> than 30% of the
> > data in the table. That has a huge potential impact.
> >
> > Obviously in a table with 100,000,000+ rows only someone crazier than I
> would try a stunt
> > like that. And if they do I hope they use the +ALL_ROWS hint.
> >
> > In my experience queries like that are generally scheduled for 2:00am on a
> Sunday. The
> > more general query that returns all rows, or a large percentage of rows,
> is generally one
> > from a look-up table.
> >
> > So at the risk of rubbing salt into a wound ... it appears that everyone
> counseling not
> > separating tables and indexes is avoiding providing a replacement rule of
> thumb. Why no
> > takers? ;-) What is the verbiage you think would be valid at technet and
> tahiti as advice
> > for developers and DBAs?
> >
> > Daniel Morgan
> >
> >
> >
> > Connor McDonald wrote:
> >
> > > Daniel Morgan wrote:
> > > >
> > > > With 10,000 rows I'd not argue with you. In fact below 40,000 rows I'd
> not argue
> > > > with you. At 10,000 rows the only index I would create would be one
> created by a
> > > > required constraint. But try that same experiment with the 1.4 million
> parts in a
> > > > 747, understanding that every part may be in there multiple times (for
> example at
> > > > one time there was a choice 21 different pilot clipboards). Then add
> the sub-models
> > > > (different seating capacity, different galleys, etc.) then add 717,
> 737, 757, 767,
> > > > and 777 models.
> > > >
> > > > You are correct with the example you gave. But I can't believe that
> with
> > > > 100,000,000+ rows you would ever find this to be the case. Am I wrong?
> > > >
> > > > Daniel Morgan
> > > >
> > > > Connor McDonald wrote:
> > > >
> > > > > Daniel Morgan wrote:
> > > > > >
> > > > > > I agree. I thought he was saying that most of the time we do full
> table scans,
> > > > > > etc. And given that many of my application tables have been in the
> > > > > > multi-gigabyte range that would be a complete non-starter.
> > > > > >
> > > > > > When one is looking for a single part of a Boeing 747 ... one does
> not want to
> > > > > > scan every part for every plane ever made.
> > > > > >
> > > > > > Daniel Morgan
> > > > > >
> > > > > > Jonathan Lewis wrote:
> > > > > >
> > > > > > > Comments in-line
> > > > > > >
> > > > > > > --
> > > > > > > Jonathan Lewis
> > > > > > > http://www.jlcomp.demon.co.uk
> > > > > > >
> > > > > > > Author of:
> > > > > > > Practical Oracle 8i: Building Efficient Databases
> > > > > > >
> > > > > > > Next Seminar - Australia - July/August
> > > > > > > http://www.jlcomp.demon.co.uk/seminar.html
> > > > > > >
> > > > > > > Host to The Co-Operative Oracle Users' FAQ
> > > > > > > http://www.jlcomp.demon.co.uk/faq/ind_faq.html
> > > > > > >
> > > > > > > Daniel A. Morgan wrote in message
> <3CC2CB08.64C747AF_at_exesolutions.com>...
> > > > > > > >Please help me here with two things.
> > > > > > > >
> > > > > > > >1. "Nowadays to satisfy queries, you're full scanning tables,
> full scanning
> > > > > > > indexes,
> > > > > > > >not using indexes at all,...."
> > > > > > > >
> > > > > > >
> > > > > > > It is very easy to misunderstand the intent of a comment
> > > > > > > on the newsgroup, but I think if you re-read Connor's post
> > > > > > > carefully you will appreciate that he is not saying "you
> must
> > > > > > > full scan tables"; he is simply listing a number of the
> mechanism
> > > > > > > that are commonly (and necessarily) activated in a modern
> system.
> > > > > > >
> > > > > > > >Since when? I work very hard to not do these things.
> > > > > > > >
> > > > > > >
> > > > > > > Don't work too hard at it. Sometimes the biggest
> > > > > > > performance problems are introduced by trying
> > > > > > > to push Oracle too hard into paths which use
> > > > > > > index range scans and index unique scans to
> > > > > > > access tables by rowid; resulting in excessive
> > > > > > > CPU usage and CBC latch contention.
> > > > >
> > > > > I'll give an example of what I was trying to convey in terms of
> "full
> > > > > scans good sometimes, full scans bad sometimes" using an example
> > > > > somewhat poached from JL's excellent seminar
> > > > >
> > > > > create table BOEING1 ( x number, y char(100));
> > > > > create table BOEING2 ( x number, y char(100));
> > > > > create index BOEING1 on BOEING1 ( x);
> > > > > create index BOEING2IX on BOEING2IX ( x);
> > > > >
> > > > > 10000 rows inserted into each table, 200 rows for each value of x
> from 0
> > > > > to 49. So after running an analyze...
> > > > >
> > > > > analyze table BOEING1 compute statistics;
> > > > > analyze table BOEING2 compute statistics;
> > > > >
> > > > > we can see that they are virtually identical..
> > > > >
> > > > > select table_name, num_rows, blocks, empty_blocks
> > > > > from user_tables;
> > > > >
> > > > > TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS
> > > > > ------------------------------ ---------- ---------- ------------
> > > > > BOEING1 9999 153 6
> > > > > BOEING2 9999 153 6
> > > > >
> > > > > select table_name, column_name, NUM_DISTINCT,DENSITY
> > > > > from user_tab_columns
> > > > > where TABLE_NAME like 'TE%';
> > > > >
> > > > > TABLE_NAME COLUMN_NAME
> > > > > NUM_DISTINCT DENSITY
> > > > > ------------------------------ ------------------------------
> > > > > ------------ ----------
> > > > > BOEING1 X
> > > > > 50 .02
> > > > > BOEING2 X
> > > > > 50 .02
> > > > >
> > > > > select max(x), min(x), avg(x) from BOEING1;
> > > > >
> > > > > MAX(X) MIN(X) AVG(X)
> > > > > ---------- ---------- ----------
> > > > > 49 0 24.5024502
> > > > >
> > > > > select max(x), min(x), avg(x) from BOEING2;
> > > > >
> > > > > MAX(X) MIN(X) AVG(X)
> > > > > ---------- ---------- ----------
> > > > > 49 0 24.5024502
> > > > >
> > > > > But they exhibit spectacularly different performance
> characteristics...
> > > > >
> > > > > select /*+ FULL(t) */ count(y) from BOEING2 t
> > > > > where x < 20;
> > > > >
> > > > > took 156 consistent gets
> > > > >
> > > > > select /*+ INDEX(t) */ count(y) from BOEING2 t
> > > > > where x < 20;
> > > > >
> > > > > took 70 consistent gets,
> > > > >
> > > > > thus you get the "conventional" result of "we should be using an
> index"
> > > > > when examing this particular BOEING. But the same queries against
> > > > > BOEING1 give
> > > > >
> > > > > select /*+ FULL(t) */ count(y) from BOEING1 t
> > > > > where x < 20;
> > > > >
> > > > > took 156 consistent gets
> > > > >
> > > > > select /*+ INDEX(t) */ count(y) from BOEING1 t
> > > > > where x < 20;
> > > > >
> > > > > took 3120 consistent gets (20 times worse!)
> > > > >
> > > > > So - sometimes checking every Boeing ever made is the best option
> :-)
> > > > >
> > > > > Cheers
> > > > > Connor
> > > > >
> > > > > --
> > > > > ==============================
> > > > > Connor McDonald
> > > > >
> > > > > http://www.oracledba.co.uk
> > > > >
> > > > > "Some days you're the pigeon, some days you're the statue..."
> > >
> > > The example I displayed is somewhat table size independent - that is,
> > > the queries are requesting approx 40% of the table rows. The example
> > > shows that in some cases, the index is easily the best option, in other
> > > cases, the table scan is easily the best.
> > >
> > > If you had 100,000,000 rows it wouldn't matter - if the question
> > > requires probing 40% of the rows, then the index vs scan decision still
> > > needs to be evaulated
> > >
> > > Cheers
> > > Connor
> > > --
> > > ==============================
> > > Connor McDonald
> > >
> > > http://www.oracledba.co.uk
> > >
> > > "Some days you're the pigeon, some days you're the statue..."
> >
Received on Mon Apr 22 2002 - 16:29:05 CDT