Re: separate data/inidex

From: D.Y. <dyou98_at_aol.com>
Date: 30 Apr 2002 11:07:27 -0700
Message-ID: <f369a0eb.0204301007.5ac55602@posting.google.com>

Thomas Kyte <tkyte_at_oracle.com> wrote in message news:<aam0i3091q_at_drn.newsguy.com>...
> In article <f369a0eb.0204292044.4fad1e36_at_posting.google.com>, dyou98_at_aol.com
> says...
> >
>
> ...
>
> >>
> >> Bong! Nope, though I plead guilty to having recommended frequent rebuilds as
> >> recently as last year.
> >
> >You need not to.
> >
>
> I find people waste a HUGE amount of time and resources rebuilding indexes. It
> is good to recant.
>
> >>
> >> First: when was the last time you ever saw an Oracle index with a height
> >> greater than 3? It's extremely hard to achieve, actually, because Oracle's
> >> re-balancing mechanism is pretty good. So 'reducing the depth of the B tree'
> >> is not an issue for 99.5% of all indexes out there.
> >
> >Ah, we are getting into another interesting discussion. I say "possibly" not
> >"always". Your quote of 99.5% sounds a little too high to me but you could
> >be correct. I heard more than one Senior DBAs say it always takes 3 disk reads
> >to retrieve the rowid. See, there are quite a few people out there who think
> >all the indexes have three levels in their B trees!
> >
>
> Sr, smeener. It might take zero (indexes are pretty cachable).

Possible in some systems. My indexes are routinely GBs to dozens of GBs in size so I can't always expect that to happen.

>
> >>
> >> "Half empty blocks"? I hope you're not implying that empty space in a leaf
> >> node can never be re-used, because it can. If you delete 'Bob' from the
> >> table, the index on the name column has a deleted leaf node entry for 'Bob',
> >> true enough, taking up space and being a waste of space. But if you were to
> >> insert a record for 'Bruce', that deleted entry is cleared out, and the
> >> space allocated to Bruce's index entry. Given a reasonably random pattern of
> >> insert activity, it is extemely unlikely that you'll have half-empty blocks
> >> sitting there for very long.
> >>
> >> Third, a rebuild causes the index to be re-constructed nice and compactly,
> >> with no spare space, true enough. That might give you a bit of a lift in
> >> performance when reading that index immediately after the rebuild. But the
> >> next piece of DML you perform is going to cause the index to start keeling
> >> over again soon enough. At which point the index has to start re-acquiring
> >> all the extents it lost as a result of the rebuild, but which it had already
> >> gone to the trouble of acquiring before the rebuild. Extent acquisition is
> >> not cheap. In fact, in dictionary managed tablespace, it's bloody expensive!
> >> Therefore, the index actually has to repeat a lot of extremely expensive
> >> dictionary-management activity that, had you not rebuilt it, it would not
> >> have had to do. Performance can therefore actually be worse after a rebuild
> >> than before. Of course, in locally-managed tablespace, that argument is
> >> rather less strong than it used to be.
> >
> >In gereral, I find index rebuild make performance somewhat better if not much
> >better.
>
> Interesting, how do you measure this? what approach do you use? what metrics
> do you provide?
>
> I've always asked people if they do this, no one ever seems to. How have you
> quantified this in a measurable way?

In a TB database, when your nightly batch only takes 4 hours now instead of 6 or 7 hours and the only change you made was rebuilding some huge indexes, people naturally think defrag made a different. Your batch process gets slow again in a month or so and the cycle starts over ... Certainly wouldn't rule out any side effects of index rebuild that could have contributed to performance gain. But I didn't give it enough thought.

>
> > Performance can be worse but those are rare exceptions and I haven't
> >seen one. If you set your parameters appropriately it normally won't happen.
> >Almost all the DBAs have rebuilt indexes.
>
> almost all dba's separate tables from indexes on difference devices.
> many dba's think 1 extent is really cool.

Wouldn't bother with making everything fit within 1 extent. But I sure don't like to have too many big holes in my index blocks.

> tons of them think the most discriminating field needs to go first in an index
> and so on....
>

That has been how I create indexes until compression became available. Not that I needed to do it for performance reasons. I just didn't see anything wrong with it. I've tried hard to come up with a reason why many people use the most selective field as the leading column. The only thing I can think of is the search algorithm. If binary search is used to retrieve an index key, it may require fewer CPU cycles when you hit the most selective column first. If Oracle uses a clever algorithm, it may not make a difference at all. I hope someone with the knowledge will explain.

>
> > If they find it help performance
> >they keep doing it. Otherwise they stop.
>
> I find NEITHER to be true.
>
> A) they never measure it
> B) they *never* stop.

If this is true then they need to stop to think.

>
>
> > Don't take the phrase "half empty"
> >at its face value. I really meant to say fragmentation. As you explained,
> >index space can be reused. But it's not ALWAYS reused, nor is it NEVER reused
> >as is the impression one might get from reading the documentation. How
> >fragmented an index can be depends on the data. If the density of your index
> >key is more or less uniform you won't have much fragmentation. If your data is
> >highly skewed both ways, whether you use a regular or reversed key index you
> >are going to have significant fragmentation over time and rebuilding this type
> >of indexes will certainly help.
> >
> >To find out what's the best thing to do you need to understand the access
> >pattern and data distribution of your application. That will often point you
> >in the right direction (assuming you understand the storage and access
> >mechanism of your database). This approach has helped me much more than simply
> >following some rules of thumb, or what we call "myths".
> >
> >>
> >> So, there's a balance to be struck. Tom Kyte once told me that he could
> >> count the number of times he's rebuilt an index in his entire life on the
> >> fingers (not thumbs) of both hands. That strikes me as a slightly
> >> too-Puritan approach! But Jonathan Lewis has a line in his book which reads
> >> (paraphrasing wildly) 'don't strive too officiously to achieve a mythical
> >> 100% efficiency for your indexes, because they'll never stay 100% efficient
> >> for long' -meaning, regular rebuilds are a waste of time. So somewhere
> >> between never and always is an appropriate rebuild frequency, but it's
> >> definitely on the 'occasional' end of the spectrum.
> >
> >That's the point. It's about BALANCE. And it's driven by your application.
> >
> >>
> >> Regards
> >> HJR
Received on Tue Apr 30 2002 - 13:07:27 CDT