simple query with strange cardinality estimate

From: Redacted sender <dmarc-noreply_at_freelists.org>
Date: Tue, 19 Jan 2021 18:09:56 +0000
Message-ID: <CY4PR0701MB370002756380A31D529F51A283A30_at_CY4PR0701MB3700.namprd07.prod.outlook.com>



Hi,

I'm puzzled with some strange cardinality estimates that the optimizer is giving on a 19c database (v$version shows 19.4.0.0.0, optimizer_features_enable is not set and shows 19.1.0 as its default). I created a test case using a simple table with two columns. The table has no rows but I've set the statistics to tell the optimizer that there about 45 million rows.

CREATE TABLE T1 (ID NUMBER(20), V VARCHAR2(20 CHAR)); CREATE UNIQUE INDEX PK_ID ON T1(ID);
ALTER TABLE T1 ADD (CONSTRAINT PK_ID PRIMARY KEY (ID) USING INDEX PK_ID ENABLE VALIDATE); EXEC DBMS_STATS.GATHER_TABLE_STATS(user, 'T1');

DECLARE

   srec               DBMS_STATS.STATREC;
   numvals            DBMS_STATS.NUMARRAY;
   charvals           DBMS_STATS.CHARARRAY;
BEGIN    DBMS_STATS.set_table_stats(ownname => user, tabname => 'T1', numrows => 45262481, numblks => 1938304, avgrlen => 206);    numvals := DBMS_STATS.numarray (1, 45262481);    srec.epc:=2;
   DBMS_STATS.prepare_column_values (srec, numvals);    DBMS_STATS.set_column_stats (ownname => user, tabname => 'T1', colname => 'ID', distcnt => 45262481, density => .0000000220933536542109,
                                nullcnt => 0, srec => srec, avgclen => 6);
   charvals := DBMS_STATS.chararray ('', '');    DBMS_STATS.prepare_column_values (srec, charvals);    DBMS_STATS.set_column_stats (ownname => user, tabname => 'T1', colname => 'V', distcnt => 0, density => 0, nullcnt => 45262481, srec => srec, avgclen => 0);    COMMIT;    DBMS_STATS.SET_INDEX_STATS ( ownname => user, indname =>'PK_ID', numrows => 45607914, numlblks => 101513,

               numdist => 45607914, avglblk => 1, avgdblk => 1, clstfct => 33678879, indlevel => 2); END;
/

variable N1 NVARCHAR2(32)
variable N2 NUMBER
begin
:N1 := 'D';
:N2 := 50;
end;
/

select /*+ gather_plan_statistics */ * from ( SELECT A.ID COL0,A.ID COL1 FROM T1 A
WHERE A.V=:N1 AND A.ID > 1
ORDER BY A.ID ) where rownum <= :N2 ;

SQL_ID 871kkxamgy1wy, child number 0



select /*+ gather_plan_statistics */ * from ( SELECT A.ID COL0,A.ID COL1 FROM T1 A WHERE A.V=:N1 AND A.ID > 1 ORDER BY A.ID ) where rownum <= :N2

Plan hash value: 2577482738


| Id  | Operation                     | Name  | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |      1 |        |       |  3736 (100)|      0 |00:00:00.01 |       1 |

|* 1 | COUNT STOPKEY | | 1 | | | | 0 |00:00:00.01 | 1 |
| 2 | VIEW | | 1 | 50 | 1300 | 3736 (1)| 0 |00:00:00.01 | 1 |
|* 3 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 452K| 20M| 3736 (1)| 0 |00:00:00.01 | 1 |
|* 4 | INDEX RANGE SCAN | PK_ID | 1 | 5000 | | 14 (0)| 0 |00:00:00.01 | 1 |
----------------------------------------------------------------------------------------------------------------------

Peeked Binds (identified by position):


   2 - :2 (NUMBER): 50 Predicate Information (identified by operation id):


   1 - filter(ROWNUM<=:N2)
   3 - filter(SYS_OP_C2C("A"."V")=:N1)
   4 - access("A"."ID">1)

So for some reason the optimizer thinks it will get 5000 rows from the index range scan using ID > 1. When using following query, the optimizer does know that ID > 1 gives about 45 million rows:

SELECT /*+ INDEX (A PK_ID) */ A.ID COL0 FROM T1 A WHERE A.ID > 1; SQL_ID 1bndnsjhpsvbs, child number 0



SELECT /*+ INDEX (A PK_ID) */ A.ID COL0 FROM T1 A WHERE A.ID > 1 Plan hash value: 988767100

| Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)|
|   0 | SELECT STATEMENT |       |        |       |   101K(100)|

|* 1 | INDEX RANGE SCAN| PK_ID | 45M| 258M| 101K (1)|
----------------------------------------------------------------

Predicate Information (identified by operation id):


   1 - access("A"."ID">1)

So why is the former query giving an estimate of only 5000 rows?

Regards,
Dirk

Note: bind variable N1 is on purpose defined as NVARCHAR2 as that is also what our real case is doing.

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Jan 19 2021 - 19:09:56 CET

Original text of this message