RE: null values

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Tue, 19 Nov 2013 01:21:06 +0000
Message-ID: <CE70217733273F49A8A162EE074F64D901DCA722_at_exmbx05.thus.corp>


Martijn,

I haven't had time to look at this, but it wouldn't surprise me in the slightest that you've managed to find an optimizer error with an index fast full scan - it wouldn't be the first time that the optimizer has used a table-based strategy to handle an index-based execution path.

Note particularly that you have no predicate on any of the columns, so the selectivity of your query is necessarily 1 - at which point Oracle can detect that there are 1,000 rows in the table (which you would HAVE to visit if you did a full tablescan), but fails to allow for the fact that if you use an index fast full scan the number of index entries is smaller than the number of index entries.

I recall, though that there was a "magic" 10% relating to null-related calculations that appeared some time ago, so possibly your example is on the boundary of a change in strategy. It would be interesting to see what happens in a more extreme case - e.g. create the table and index with 50% of the rows set to null (and create the table with a much larger volume of data anyway (small number often introduce special cases).

Obviously a 10053 trace might give you some clue that Oracle is, very specifically, failing to consider the number of rows in the index when doing its calculations.

Regards
Jonathan Lewis



From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on behalf of Martijn Bos [maboc_at_maboc.nl] Sent: 17 November 2013 21:50
To: Oracle-L
Subject: null values

Hi List,

On a play-around sytem I do the following: (This is a 11.2.0.4 RDBMS on linux x86 (32 bits))

drop table cbo_tab_1;
create table cbo_tab_1 (id number, sca number(10,0), clu number(10,0), null_col number(10,0), filler char(500)) pctfree 99 pctused 1;

  • Following statement makes sure that ther actually are nulls in column null_col insert into cbo_tab_1 select level, mod(level,10), round(level/100), decode(mod(level, 10), 0, null, level), 'filler' from dual connect by level<=1000; commit;

create unique index CBO_TAB_1_ID on CBO_TAB_1(id);
-- create index cbo_tab_1_id on cbo_tab_1(id);

create index cbo_tab_1_sca on cbo_tab_1(sca);
create index cbo_tab_1_clu on cbo_tab_1(clu);
create index cbo_tab_1_null_col on cbo_tab_1(null_col);

exec dbms_stats.gather_table_stats('MARTIJN','CBO_TAB_1', block_sample=>false, cascade=>true, method_opt=>'for all columns size 1'); alter system flush shared_pool; --Since I try it a lot I need it to parse over and over again.

explain plan for select count(null_col) from cbo_tab_1; --HERE IS THE SQL OF WHICH I DON'T UNDERSTAND THE EXPLAIN PLAN

SQL> _at_xpln -- A script I created just to get a little more understanding of explaining statement.

ID         OPERATION                                OBJECT                                   ALIAS                CARDINALITY       COST OPTIMIZER            BYTES

---------- ---------------------------------------- ---------------------------------------- -------------------- ----------- ---------- --------------- ----------
--- 0 SELECT STATEMENT() 1 2 ALL_ROWS 4
--P 1 SORT(AGGREGATE) 1 4
--P 2 INDEX(FAST FULL SCAN) (INDEX) MARTIJN.CBO_TAB_1_NULL_COL CBO_TAB_1_at_SEL$1 1000 2 ANALYZED 4000

    ID PROJECTION
------ ---------------------------------------------------------------------------

     1 (#keys=0) COUNT("NULL_COL")[22]
     2 "NULL_COL"[NUMBER,22]

SQL> (if you find that suspect, I also added a more documented way to display explain plans)

SQL> select * from table(dbms_xplan.display(format=>'all'));

PLAN_TABLE_OUTPUT



Plan hash value: 2862360477

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
|   0 | SELECT STATEMENT      |                    |     1 |     4 |     2   (0)| 00:02:01 |
|   1 |  SORT AGGREGATE       |                    |     1 |     4 |            |          |
|   2 |   INDEX FAST FULL SCAN| CBO_TAB_1_NULL_COL |  1000 |  4000 |     2   (0)| 00:02:01 |

--------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):


   1 - SEL$1
   2 - SEL$1 / CBO_TAB_1_at_SEL$1

Column Projection Information (identified by operation id):


   1 - (#keys=0) COUNT("NULL_COL")[22]
   2 - "NULL_COL"[NUMBER,22]
SQL> Now the thing that is troubling me is the cardinality estimate at id 2 (1000). I would say that the CBO could esitmate 900 rows. Where does the CBO get's that idea of 1000?

SQL> select TABLE_NAME, STATUS, PCT_FREE, PCT_USED, NUM_ROWS, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED, AVG_ROW_LEN from user_tables
where table_name='CBO_TAB_1';

TABLE_NAME STATUS PCT_FREE PCT_USED NUM_ROWS BLOCKS EMPTY_BLOCKS LAST_ANALYZED AVG_ROW_LEN
--------------- -------- ---------- ---------- ---------- ---------- ------------ -------------------- -----------

CBO_TAB_1       VALID            99                  1000       1000            0 2013-NOV-17 22:27:10         515

SQL> select index_name, index_type, uniqueness, blevel, leaf_blocks, distinct_keys, clustering_factor, num_rows, last_analyzed from user_indexes
where table_name='CBO_TAB_1';

INDEX_NAME                     INDEX_TYPE                  UNIQUENES     BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR   NUM_ROWS LAST_ANALYZED

------------------------------ --------------------------- --------- ---------- ----------- ------------- ----------------- ---------- --------------------
CBO_TAB_1_ID NORMAL UNIQUE 1 2 1000 1000 1000 2013-NOV-17 22:27:10 CBO_TAB_1_SCA NORMAL NONUNIQUE 1 2 10 1000 1000 2013-NOV-17 22:27:10 CBO_TAB_1_CLU NORMAL NONUNIQUE 1 2 11 1000 1000 2013-NOV-17 22:27:10 CBO_TAB_1_NULL_COL NORMAL NONUNIQUE 1 2 900 900 900 2013-NOV-17 22:27:10
SQL> As far as I'm concerned/understand the CBO chooses the right index (CBO_TAB_1_NULL_COL). CBO_TAB_1_NULL_COL has 900 num rows, since the other 100 rows are null. I would say that in this specific case it would be easy for the CBO to estimate 900 rows since it only visits this index (all information is in the index).

Now I'm thinking that I might oversee something very simple (or that i'm flat out wrong), so would anyone be so kind as to point me in the right direction for some guidance, or (better yet :-)) explain where my thinking is wrong.

Best Regards,
Martijn--
http://www.freelists.org/webpage/oracle-l Received on Tue Nov 19 2013 - 02:21:06 CET

Original text of this message