null values

From: Martijn Bos <maboc_at_maboc.nl>
Date: Sun, 17 Nov 2013 22:50:33 +0100
Message-ID: <20131117215032.GA14656_at_app01.bos>



Hi List,

On a play-around sytem I do the following: (This is a 11.2.0.4 RDBMS on linux x86 (32 bits))

drop table cbo_tab_1;
create table cbo_tab_1 (id number, sca number(10,0), clu number(10,0), null_col number(10,0), filler char(500)) pctfree 99 pctused 1;

  • Following statement makes sure that ther actually are nulls in column null_col insert into cbo_tab_1 select level, mod(level,10), round(level/100), decode(mod(level, 10), 0, null, level), 'filler' from dual connect by level<=1000; commit;

create unique index CBO_TAB_1_ID on CBO_TAB_1(id);
-- create index cbo_tab_1_id on cbo_tab_1(id);

create index cbo_tab_1_sca on cbo_tab_1(sca);
create index cbo_tab_1_clu on cbo_tab_1(clu);
create index cbo_tab_1_null_col on cbo_tab_1(null_col);

exec dbms_stats.gather_table_stats('MARTIJN','CBO_TAB_1', block_sample=>false, cascade=>true, method_opt=>'for all columns size 1'); alter system flush shared_pool; --Since I try it a lot I need it to parse over and over again.

explain plan for select count(null_col) from cbo_tab_1; --HERE IS THE SQL OF WHICH I DON'T UNDERSTAND THE EXPLAIN PLAN

SQL> _at_xpln -- A script I created just to get a little more understanding of explaining statement.

ID         OPERATION                                OBJECT                                   ALIAS                CARDINALITY       COST OPTIMIZER            BYTES

---------- ---------------------------------------- ---------------------------------------- -------------------- ----------- ---------- --------------- ----------
--- 0 SELECT STATEMENT() 1 2 ALL_ROWS 4
--P 1 SORT(AGGREGATE) 1 4
--P 2 INDEX(FAST FULL SCAN) (INDEX) MARTIJN.CBO_TAB_1_NULL_COL CBO_TAB_1_at_SEL$1 1000 2 ANALYZED 4000

    ID PROJECTION
------ ---------------------------------------------------------------------------

     1 (#keys=0) COUNT("NULL_COL")[22]
     2 "NULL_COL"[NUMBER,22]

SQL> (if you find that suspect, I also added a more documented way to display explain plans)

SQL> select * from table(dbms_xplan.display(format=>'all'));

PLAN_TABLE_OUTPUT



Plan hash value: 2862360477

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
|   0 | SELECT STATEMENT      |                    |     1 |     4 |     2   (0)| 00:02:01 |
|   1 |  SORT AGGREGATE       |                    |     1 |     4 |            |          |
|   2 |   INDEX FAST FULL SCAN| CBO_TAB_1_NULL_COL |  1000 |  4000 |     2   (0)| 00:02:01 |

--------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):


   1 - SEL$1
   2 - SEL$1 / CBO_TAB_1_at_SEL$1

Column Projection Information (identified by operation id):


   1 - (#keys=0) COUNT("NULL_COL")[22]
   2 - "NULL_COL"[NUMBER,22]
SQL> Now the thing that is troubling me is the cardinality estimate at id 2 (1000). I would say that the CBO could esitmate 900 rows. Where does the CBO get's that idea of 1000?

SQL> select TABLE_NAME, STATUS, PCT_FREE, PCT_USED, NUM_ROWS, BLOCKS, EMPTY_BLOCKS, LAST_ANALYZED, AVG_ROW_LEN from user_tables
where table_name='CBO_TAB_1';

TABLE_NAME STATUS PCT_FREE PCT_USED NUM_ROWS BLOCKS EMPTY_BLOCKS LAST_ANALYZED AVG_ROW_LEN
--------------- -------- ---------- ---------- ---------- ---------- ------------ -------------------- -----------

CBO_TAB_1       VALID            99                  1000       1000            0 2013-NOV-17 22:27:10         515

SQL> select index_name, index_type, uniqueness, blevel, leaf_blocks, distinct_keys, clustering_factor, num_rows, last_analyzed from user_indexes
where table_name='CBO_TAB_1';

INDEX_NAME                     INDEX_TYPE                  UNIQUENES     BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR   NUM_ROWS LAST_ANALYZED

------------------------------ --------------------------- --------- ---------- ----------- ------------- ----------------- ---------- --------------------
CBO_TAB_1_ID NORMAL UNIQUE 1 2 1000 1000 1000 2013-NOV-17 22:27:10 CBO_TAB_1_SCA NORMAL NONUNIQUE 1 2 10 1000 1000 2013-NOV-17 22:27:10 CBO_TAB_1_CLU NORMAL NONUNIQUE 1 2 11 1000 1000 2013-NOV-17 22:27:10 CBO_TAB_1_NULL_COL NORMAL NONUNIQUE 1 2 900 900 900 2013-NOV-17 22:27:10
SQL> As far as I'm concerned/understand the CBO chooses the right index (CBO_TAB_1_NULL_COL). CBO_TAB_1_NULL_COL has 900 num rows, since the other 100 rows are null. I would say that in this specific case it would be easy for the CBO to estimate 900 rows since it only visits this index (all information is in the index).

Now I'm thinking that I might oversee something very simple (or that i'm flat out wrong), so would anyone be so kind as to point me in the right direction for some guidance, or (better yet :-)) explain where my thinking is wrong.

Best Regards,
Martijn



--

http://www.freelists.org/webpage/oracle-l Received on Sun Nov 17 2013 - 22:50:33 CET

Original text of this message