RE: Index Rebuilds (aka scan a smaller index)

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Tue, 1 Aug 2006 10:01:18 -0400
Message-ID: <KNEIIDHFLNJDHOOCFCDKEELCIFAA.mwf@rsiz.com>

The bitmap would be a new index on just the type.

I believe index fragmentation has an operational order greater than n. If that is true, then partitioning would be a superlinear improvement.

Your concerns were valid and I should have been clearer about why I thought this might help.

Now, let's move on to what l_m_d is being used for.

Usually, l_m_d usage falls into two categories:

Documentation for research.
Operational trigger to cause additional processing.

Usually usage #1 doesn't cause too much of a problem because the need is relatively infrequent compared to operational drivers. In general the volatility of change can be reduced in this case by quantizing the time keeping unit. (Date truncated to year, month, week, day).
The coarser the quantization, the less the volatility (presuming the quantization crosses the mean of the rate of change curve - it is possible to not win, but you can't lose; for example if you already only change an individual record on average once a day and you quantize from seconds to minutes you do not reduce the volatility. If the average change is once a second and you quantize to day you reduce volatility by a factor of 86400.

Usage #2, however often represents a huge opportunity. If the meaning of l_m_d is "Hey, I changed! Post process me!" Then the index you actually want is date_I_changed_since_full_post_processing, which gets set to NULL when post processing is complete. You then pull from this index to get candidates to work upon. Since it is highly volatile, it is perversely not a good candidate for bitmap, but punching to NULL keeps it small. This is a reasonably neat trick in Oracle, what with the all NULL index values being dropped, so the total size stays pretty small (again this is subject to the rates of flux and rate of processing to the point you punch it to NULL.) A frequent objection is that you lose the datestamp upon which final processing was complete, but that is a case for an additional column that is used for documentation, not as an operational trigger. The architecture of the database structures to maintain these queues is largely a matter of taste. If I'm not building a replayable, batch contol totalled hard queue structure partially outside Oracle, I have a marginal preference for a separate table with just the primary key of the main table and the nullable column to indicate that work is needed, and possibly a priority rank (though the priority rank is a relational slippery slope to a whole array of columns used to decide fine grained processing priority such as promise date, penalty date, etc. If this is an issue, it is usually best to just have the key and the date (or an even smaller flag), and leave the AI about who gets processed first from the candidates thus identified to a sufficiently complex structure designed for the purpose that you can consider for a small subset of all rows.)

Additionally there are whole systems for handling operational processing including at least Oracle triggers, job scheduler, workflow, the e-Business suite concurrent manger, BPEL, and custom hard queues.

None of these suggestions for working around the problem actually deals with the problem you are currently observing. I'm not sure there is a reasonable solution to keeping a highly volatile index as small as possible. Roughly 11 to 64, about a factor of 6, seems a little high to me compared to Bayer tree mathematics, but Oracle has considerably "fancied up" from a bare bones Bayer tree, and l_m_d has that old, hard to handle monotonically increasing problem. Still, Oracle might be interested in your exact data with a way to replay it from "good" (11K) to "bad" (64K) so they can see whether they are missing an opportunity to do even better. Trying to stamp out every excuse to rebuild as a solution is a useful quest.

There is also a very real possibility that you're using the l_m_d for something entirely different that none of these suggestions are appropriate for your case.

Regards,

mwf

-----Original Message-----
From: Stalin [mailto:stalinsk_at_gmail.com] Sent: Monday, July 31, 2006 2:45 AM
To: mwf_at_rsiz.com
Subject: Re: Index Rebuilds (aka scan a smaller index)

Hi Mark,

On 7/30/06, Mark W. Farnham <mwf_at_rsiz.com> wrote:
> Others have already suggested the rownum=1 (or rownum<2) trick if you're
> really just checking for existence.

As you have already seen my earlier replies, they are used for both processing and to check existence.

> Barring that, isn't this pretty much what bit map indexes were created
for?
> True, you'll have another index, but IF this query is important and
> frequently issued AND response time is important to the business, that is
> one way to make it a lot quicker.
>

Yep i agree that bitmap index would do the trick, but this the most busiest table in terms of inserts/updates and bitmap index won't fit in here.

> Alternatively, you could partition by type and at least get some pruning.
>

hmm i think the problem would still be there as every partition of type would still have l_m_d index fragmented no?

> If it gets into the level of importance of extreme solutions, you could
map
> your types to numbers putting 'live' low and index "type_id" by itself.
Then
> type_id <= 1 (ie. the lookup of 'live' in the table types table) should be
> pretty quick.
>

yep. separting type from the multi column index would solve this query as the type is not modified that often as l_m_d. However that leaves another question of how to slove issues with indexes like l_m_d that are being updated more often. Is rebuild the only option?

> This is partly presuming that changes to type are less volatile than
changes
> to l_m_d, but even so it should be much smaller. The type being volatile
> would mitigate against the bitmap and partitioning solutions.
>
> Now if this isn't either frequent or very important to the business that
it
> is quick when the question is asked, then you've identified a case of CTD
> (complusive tuning disorder) versus a case for an extreme solution.
>
> Of course you could be exploring this for academic reasons to add to your
> kit bag of solutions. That's always my excuse when I realize I've
succumbed
> to CTD. Oh, wait, I don't have CTD, I'm doing research!
>

Nope, this is not CTD excercise. Looking for solution on a real production problem.

> Regards,
>
> mwf
>
> -----Original Message-----
> From: oracle-l-bounce_at_freelists.org

[mailto:oracle-l-bounce_at_freelists.org]On
> Behalf Of Stalin
> Sent: Friday, July 28, 2006 7:08 PM
> To: Oracle Discussion List
> Subject: Index Rebuilds

<snip, the list has seen the rest>

--
http://www.freelists.org/webpage/oracle-l

Received on Tue Aug 01 2006 - 09:01:18 CDT