Re: All bitmap indexes not being used in a query where they should be

From: Steve B <BigBoote66_at_hotmail.com>
Date: 23 Jan 2004 15:00:40 -0800
Message-ID: <67bcf80a.0401231500.2bbffce0@posting.google.com>

Thanks for your input. Your idea about manually mucking with the index stats has me intrigued. Following your lead, responses in-line below:

"Jonathan Lewis" <jonathan_at_jlcomp.demon.co.uk> wrote in message news:<burjbt$gg4$1_at_sparta.btinternet.com>...
> Notes in-line:
>
> --
> Regards
>
> Jonathan Lewis
> http://www.jlcomp.demon.co.uk
>
> The educated person is not the person
> who can answer the questions, but the
> person who can question the answers -- T. Schick Jr
>
>
> Next public appearances:
> Jan 29th 2004 UKOUG Unix SIG - v$ and x$
> March 2004 Hotsos Symposium - The Burden of Proof
> March 2004 Charlotte NC OUG - CBO Tutorial
> April 2004 Iceland
>
>
> One-day tutorials:
> http://www.jlcomp.demon.co.uk/tutorial.html
>
>
> Three-day seminar:
> see http://www.jlcomp.demon.co.uk/seminar.html
> ____UK___February
>
>
> The Co-operative Oracle Users' FAQ
> http://www.jlcomp.demon.co.uk/faq/ind_faq.html
>
>
> "Steve B" <BigBoote66_at_hotmail.com> wrote in message
> news:67bcf80a.0401230646.3a740c56_at_posting.google.com...
> > The statistics are up to date, and the two indexes it is choosing are
> > the most restrictive, but the fact remains that the plan the optimizer
> > is choosing is, in fact, wrong (or more specifically, the cost the
> > optimizer is computing is wrong). This particular table has about 3.3
> > million rows, and the two indexes in question have about 2000 key
> > entries each, so "in theory", constraining on the two index should
> > divide the table up in to 4 million compartments. However, I sure
> > hope Oracle isn't using this method of naive math in order to choose
> > its plan, since it would be make the assumption that the columns in a
> > fact table are completely independent from one another (a very big
> > assumption, and almost certainly false for nearly all data
> > warehouses).
> >
>
> Hope and truth often diverge - that's really the way it is.
> Oracle 9 introduces dynamic sampling which, at one
> level, will check 32 blocks of the table to estimate the
> correlation between columns, but until you can use that
> feature, Oracle assumes that columns in the same table
> are independent variables.
>
> > Somehow the optimizer is thinking that the two indexes adequately
> > "cover" the data I want (even though they only reduce the result set
> > to 16,000 rows), so it views the inclusion of the other indexes as a
> > needless expense.
>
> Correct; that is what I explained.
>
> > But when it executes the count(*) query, it sees
> > that it can be resolved by never hitting the base table, so it uses
> > the rest of the indexes (the trace file shows the count(*) query is
> > "index only"). But these assumptions by the optimizer are incorrect.
>
> That is rather more interesting. There are several obvious
> bugs in the optimizer's calculations (or reports of results)
> when using bitmap indexes. Some of the oddities can only
> be seen through the 10053 trace, as explain plan doesn't
> report cost or cardinality on the bitmap index lines.
>
> > I'm wondering if there's some way of getting it to use all the bitmap
> > indexes, whether it thinks its the right idea or not.
> >
>
> Hard for me to say, as you're on a version of Oracle
> that I don't have. But one option is to lie to it. Use
> the dbms_stats.set_index_stats and set_column_stats
> procedures to change the statistics on those two columns
> so that Oracle thinks that those two indexes return far
> more rows in combination than Oracle currently thinks.
> (Try changing the num_distinct (and density) and
> distinct_keys to about 300 instead of 4,000).
> One of the problems with 8.1.6 (if I've got the versions
> right) is that it assumes that once it's calculated the
> number of rows you want, they're all very tightly
> packed - and this means it is biased against using
> 'just one more' index to reduce the result set.

Since I posted the original message, I managed to import the table into a 9.2.0.4.0 database, and the behavior is the same. I'm experimenting with computing stats with histograms, but even if it does work, I probably can't use it as it is taking prohibitively long to perform (I started the compute at 9am today, and 8 hours later it is still running). But the dbms_stats stuff might actually work, especially if I tweak some of my other indexes in the other direction - I may try for a "flat" stat plan that causes the optimizer to use as much as possible.

However, I'd be happier using a more "kosher" method of tuning.

> > Also, I think you may be misinterpreting the "Card" statistic in the
> > query plan. I have always understood it to mean that this is how many
> > times the optimizer anticipates executing that step in the plan. For
> > example, a full scan of a table can still be Card=1, even though it
> > may be returning millions of rows,
>
> In the absence of bugs, the card= line tells you how many rows
> Oracle expects to return from that line.

Well, you learn something new every day. I guess I was confused, probably because back when I was using Sybase, their iostat output did express cardinality in terms of the number of executions of a subplan (I still miss the way Sybase would show you i/o statistics for each object in your query, instead of lumping it all together like Oracle does).

> > but a join from say an Invoice
> > table to an LineItem table will often have a Card=N, where N are the
> > number of invoices you expect to match in the query (not the average
> > or total number of LineItems you expect to receive). The Cost
> > statistic reflects how expensive a particular step is (I'm not sure
> > whether the Cost already includes the Card multiplier or not).
> >
> > "Jonathan Lewis" <jonathan_at_jlcomp.demon.co.uk> wrote in message
> news:<buo4d6$r5f$1_at_sparta.btinternet.com>...
> > > Note the line in the execution plan that reads:
> > > 3 2 TABLE ACCESS (BY INDEX ROWID) OF 'EXPOSUREANALYSIS'
> > > (Cost=7 Card=1 Bytes=312)
> > >
> > > Oracle has calculated that the number of rows
> > > that will be identified by the combination of just
> > > these two indexes is 1 - so it would be counter-
> > > productive to visit any more indexes.
> > >
> > > I don't have 8.1.6 around, so I can't tell you
> > > if this is a bug, or a result of your statistics;
> > > but check
> > > user_tab_columns.num_distinct
> > > for the two used indexes. Oracle should have
> > > worked out the number of rows that will be
> > > returned by multiplying the two num_distinct
> > > together and dividing into the total number of
> > > rows in the table. (Technically, it should have
> > > used the density column - but unless you have
> > > histograms, that will be 1/num_distinct).
> > >
> > >
Received on Fri Jan 23 2004 - 17:00:40 CST