indexing null values curious case?

From: Vishnu Potukanuma <vishnupotukanuma_at_gmail.com>
Date: Thu, 21 Nov 2019 13:47:09 +0530
Message-ID: <CAP-RywwruTs8Sa1wC4YJKZsq+TX-qkX6HS+woAuvpfRu-WcxnQ_at_mail.gmail.com>



Hi,

If all the columns in the index are null then the entries are not stored in the index. This is a known fact and empty strings are stored as null values. combining both breaks the consistency part where the query returns wrong results.

this is really a worst case as no one creates such a index as (column_name,null) or (column_name,'') as this really doesn't make sense but this breaks the consistency aspect of database as the query gives wrong result as the optimizer consider index even when the leading column is null.

the scenario is basically as following:
Create table students (student_id number, name varchar2(20); insert into students select rownum, dbms_random.string(0,20) form dual connect by level < 1000000;
insert into students (name) select dbms_random.string(0,20) from dual connect by level < 100;
commit;
create index idx on student(student_id,null); exec dbms_stats.gather_Table_stats('VISHNU','students',cascade=>True); select * from students where student_id is null; in this case it goes with the full table scan which is correct since both nulls are not stored.

so now we drop the index idx;
Here we create the index as this.
create index idx on student(student_id,''); exec dbms_stats.gather_Table_stats('VISHNU','students',cascade=>True); select * from students where student_id is null; now Oracle retrives the results as 0, with the index scan.

the leaf blocks store the details as following since the empty strings are treated as nulls
row#0[8019] flag: -------, lock: 0, len=13

col 0; len 2; (2):  c1 02
col 1; NULL
col 2; len 6; (6):  01 c0 21 dc 00 00

row#1[8006] flag: -------, lock: 2, len=13
col 0; len 2; (2):  c1 03
col 1; NULL
col 2; len 6; (6):  01 c0 21 dc 00 02

 but things as expected, index doesn't store entries when the leading column is null.

here the oracle goes with the index scan and returns zero results. which is inconsistent result.
optimizer trace considers that index with (student_id, '');

  Access Path: TableScan
    Cost: 21017.655599 Resp: 21017.655599 Degree: 0

      Cost_io: 20995.000000  Cost_cpu: 807173669
      Resp_io: 20995.000000  Resp_cpu: 807173669
  Access Path: index (index (FFS))

    Index: IDX
    resc_io: 1039.000000 resc_cpu: 237414695     ix_sel: 0.000000 ix_sel_with_filters: 1.000000   Access Path: index (FFS)
    Cost: 1045.663711 Resp: 1045.663711 Degree: 1
      Cost_io: 1039.000000  Cost_cpu: 237414695
      Resp_io: 1039.000000  Resp_cpu: 237414695

probably we are hitting an undeveloped section of optimizer? producing a wrong plan is ok, but producing wrong results?

Thanks,
vishnu

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Nov 21 2019 - 09:17:09 CET

Original text of this message