Re: Better cardinality estimate when dialing optimizer_features_enable back
Date: Tue, 24 Nov 2009 15:12:01 -0500
Sorry, I've fixed the permissions on the two Google spreadsheets.
Here's the histogram
and the table/index/column stats
To answer your initial question, I think index access does in fact make sense. I'm retrieving anywhere from one to a thousand rows from letter_bin per call. This query is executed thousands of times as part of a batch job.
As a frame of reference, the table represents letters being mailed to an employer (prty_id), one for every employee. There are many small businesses, each having a few rows. There are medium and large businesses that have many rows. And finally there are employment agencies, which have a LOT of rows. Hopefully this explains the distribution of the data.
As far as the transitive closure issue goes, there are also other queries that access letter_bin by prty_id without the join to parties; those are affected as well. The cardinality estimate is always incorrect for any query by letter_bin.prty_id.
On Tue, Nov 24, 2009 at 2:56 PM, Randolf Geist <info_at_sqltools-plusplus.org>wrote:
> > Here is the histogram on letter_bin.prty_id
> > http://spreadsheets.google.com/ccc?key=0ArUTHdqN0J--dDVXZ1ZPTGlROUxfU0FXV3RNdFVINlE&hl=en
> > I've never manually crafted a histogram before.
> I can't access that Google spreadsheet - it tells me I lack privileges to
> do so.
> How many distinct values are in the column, and how is the actual data
> distributed?
> Are there values that are so rare that the index access actually makes
> sense? If not, consider simply removing the histogram, or consider removing
> the index.
> Instead of fiddling with the histogram itself, you can simply try to manually
> set the DENSITY of the column using DBMS_STATS.SET_COLUMN_STATS to the
> value currently shown in DENSITY. According to Alberto's post this should
> revert to the old density calculation using the DENSITY value (the new
> calculation ignores the value from the dictionary), but I haven't tested
> this recently myself. That should modify the cardinality estimate of
> unpopular column values.
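> A minimal sketch of what I mean (untested here; the table and column names are
> taken from your post, everything else is a placeholder you'd adapt):
>
> declare
>   l_distcnt number;
>   l_density number;
>   l_nullcnt number;
>   l_avgclen number;
>   l_srec    dbms_stats.statrec;
> begin
>   -- read the current statistics of LETTER_BIN.PRTY_ID, including DENSITY
>   dbms_stats.get_column_stats(
>     ownname => user, tabname => 'LETTER_BIN', colname => 'PRTY_ID',
>     distcnt => l_distcnt, density => l_density, nullcnt => l_nullcnt,
>     srec    => l_srec,    avgclen => l_avgclen);
>
>   -- write the same values back, setting DENSITY explicitly so it counts
>   -- as a user-defined statistic
>   dbms_stats.set_column_stats(
>     ownname => user, tabname => 'LETTER_BIN', colname => 'PRTY_ID',
>     distcnt => l_distcnt, density => l_density, nullcnt => l_nullcnt,
>     srec    => l_srec,    avgclen => l_avgclen);
> end;
> /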
> The plans that you've posted: there is transitive closure taking place on
> PRTY_ID between LETTER_BIN and PARTIES, which has the side effect that the
> optimizer removes the join condition between LETTER_BIN and PARTIES. With
> that join condition in place, the cardinality estimate might be lower and
> closer to reality. It's just a workaround and does not address the root
> cause, but what do you get if you manually add "PARTIES.PRTY_ID = <val>" to
> the queries? This workaround should retain the join predicate and thus lower
> the cardinality estimate.
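> Something along these lines (just a sketch; the select list, aliases and the
> bind variable are made up, the point is the extra redundant predicate):
>
> select lb.*
>   from letter_bin lb,
>        parties    p
>  where lb.prty_id = p.prty_id
>    and lb.prty_id = :prty_id
>    -- the redundant predicate, added manually to retain the join condition
>    and p.prty_id  = :prty_id;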
> Oracle related stuff blog:
> Co-author of the forthcoming "OakTable Expert Oracle Practices" book: