Antwort: is SIZE AUTO ever wrong? or am I wrong?

From: Martin Klier <Martin.Klier_at_klug-is.de>
Date: Fri, 20 Nov 2009 18:14:01 +0100
Message-ID: <OFF202F016.4620045B-ONC1257674.005E9B19-C1257674.005EAAE6_at_klug-is.de>



Hi,

Richard Foote did some really interesting things to prove that this feature is dangerous.
http://richardfoote.files.wordpress.com/2008/01/dbms_stats-method_opt-auto-dangers-demo-version-3.txt

--
Mit freundlichem Gruß


Martin Klier
Senior Oracle Database Administrator
------------------------------------------------------------------------------

Klug GmbH integrierte Systeme
Lindenweg 13, D-92552 Teunz
Tel.:  +49 9671/9216-245
Fax.: +49 9671/9216-112
mailto: martin.klier_at_klug-is.de
www.klug-is.de
------------------------------------------------------------------------------

Geschäftsführer: Johann Klug, Roman Sorgenfrei
Sitz der Gesellschaft: Teunz, USt-ID-Nr. DE175481608,
HRB Nr. 2037, Amtsgericht Amberg



|------------>
| Von: |
|------------>
>--------------------------------------------------------------------------------------------------------------------------------------------------| |neil kodner <nkodner_at_gmail.com> | >--------------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| An: |
|------------>
>--------------------------------------------------------------------------------------------------------------------------------------------------| |oracle-l List <oracle-l_at_freelists.org> | >--------------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Datum: |
|------------>
>--------------------------------------------------------------------------------------------------------------------------------------------------| |20.11.2009 18:08 | >--------------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Betreff: |
|------------>
>--------------------------------------------------------------------------------------------------------------------------------------------------| |is SIZE AUTO ever wrong? or am I wrong? | >--------------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Gesendet | | von: |
|------------>
>--------------------------------------------------------------------------------------------------------------------------------------------------| |oracle-l-bounce_at_freelists.org | >--------------------------------------------------------------------------------------------------------------------------------------------------| Does Oracle ever make the wrong decision when using FOR COLUMN column_name SIZE AUTO? I ask because I have table letter_bin with 4778159 rows, claims with 487842 rows. letter_bin is indexed by both clm_id and prty_id Here are the distributions of the counts of the columns: table claims column clmt_prty_id count min max avg stddev ---------- ---------- ---------- ---------- ---------- 319156 1 29 1.52855657 .939775607 table letter_bin column prty_id count min max avg stddev ---------- ---------- ---------- ---------- ---------- 234326 1 1712371 20.3912157 3542.44185 table letter_bin column clm_id count min max avg stddev ---------- ---------- ---------- ---------- ---------- 470855 1 1424 10.147916 7.88601772 I just now realized that the numbers for letter_bin include NULLs. excluding NULLs, I get table letter_bin column clm_id count min max avg stddev ---------- ---------- ---------- ---------- ---------- 470859 1 1424 10.1464345 7.82825864 table letter_bin column prty_id count min max avg stddev ---------- ---------- ---------- ---------- ---------- 234326 1 42314 13.0835759 188.911964 letter_bin is typically queried by both clm_id and prty_id at the same time. although both columns are NULL able, when the table is queried in a certain fashion, both values are used, one as a literal and one as a result of a join. prty_id is supplied, clm_id is the result of querying by clmt_prty_id I analyzed letter bin using for columns prty_id size auto, and for columns clm_id size auto. both 100% samples, execute separately. When I compute statistics on column prty_id, I get 254 buckets.which I expect. When I compute statistics on clm_id, I get no histogram although I think I should be expecting one. To me, the concept of 'skewed' is somewhat nebulous but I think I ought to get a histogram here. I prefer letter_bin to by clm_id and not by prty_id. I say this because the average number of rows in letter_bin for a clm_id should be 10 with a lower SD an example query would be along the lines of select xyz from letter_bin a,claims b where a.prty_id = 123 and a.clm_id=b.clm_id and b.clmt_prty_id=456. The query typically uses access path of letter_bin-party_id and then join to claims. I would expect it to be the other way around. select count(*) "count",count(distinct &2)"distinct",min(cnt)"min", max (cnt)"max", avg(cnt)"avg", stddev(cnt)"stddev" from ( select /*+ PARALLEL (a 6) */ &2, count(*) cnt from &1 a group by &2 ); Now here's where the plot thickens: This query is executed thousands of times in a batch job. I think it would be more efficient for letter_bin to be accessed by clm_id, and its not. It's getting accessed by prty_id and HJ to claims. Am I being unreasonable, or are my stats no good? -- http://www.freelists.org/webpage/oracle-l -- http://www.freelists.org/webpage/oracle-l
Received on Fri Nov 20 2009 - 11:14:01 CST

Original text of this message