RE: Query Transformation

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Wed, 17 Feb 2021 11:41:58 -0500
Message-ID: <3ef001d7054b$cf8cb470$6ea61d50$_at_rsiz.com>



From the CBO perspective, the number of blocks it estimates it will have to read, rather than the ultimate number of rows returned, is what matters most to cost decisions. This is because rows are stored in blocks, and reading or not reading a block is a binary operation regardless of the number of rows in the block.  

Short of reading the 10053, it might be useful, in pursuing the logic of the 10053, to work out the number of distinct table blocks from each row source the CBO has ultimately chosen, and add that to the number of index branch and leaf blocks that must be read to get the keys to probe the tables. (The CBO uses the clustering factor to estimate that; I'm suggesting you use changes in session statistics, or dump the index blocks and pretend you're doing it by hand, remembering that you don't need to revisit the branch or leaf for additional rows for which you already have the index leaf block in hand from a previous row retrieval via the index [which is why the CBO cares about the clustering factor].)  
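The hand count described above can be sketched in a few lines. This is a hypothetical illustration, not Oracle's actual code: walk the rows in index key order and count a leaf or table block only when it differs from the one already "in hand", which is exactly the no-revisit rule the clustering factor models.

```python
# Hypothetical sketch of the hand count: given the (leaf block, table block)
# pair for each row an index range scan returns, in key order, count a block
# only when it differs from the previous one -- the same "no revisit while
# the block is in hand" rule the clustering factor models.

def estimated_block_visits(row_locations):
    """row_locations: iterable of (leaf_block, table_block) in index key order."""
    leaf_visits = table_visits = 0
    prev_leaf = prev_table = None
    for leaf, table in row_locations:
        if leaf != prev_leaf:      # new leaf block: one more leaf read
            leaf_visits += 1
            prev_leaf = leaf
        if table != prev_table:    # clustering-factor-style table block count
            table_visits += 1
            prev_table = table
    return leaf_visits, table_visits

# Well-clustered data: consecutive keys land in the same table block.
print(estimated_block_visits([(1, 10), (1, 10), (1, 11), (2, 11)]))  # (2, 2)
# Scattered data: consecutive keys keep switching table blocks.
print(estimated_block_visits([(1, 10), (1, 20), (1, 10), (2, 30)]))  # (2, 4)
```

With scattered data the table-block count approaches the row count, which is the situation where the block count and row count diverge as Mark describes.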

Good luck. I suspect the CBO answer for this query may change if you ramp up the number of rows significantly and the number of blocks diverges from the number of rows.  

mwf  

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Patrick Jolliffe
Sent: Tuesday, February 16, 2021 9:15 AM
To: oracle-l
Cc: Jonathan Lewis; Noveljic Nenad
Subject: Re: Query Transformation  

Been looking at this on and off for the last few weeks. Below is my simplest testcase; it reproduces on my plain 19c Docker image.

Getting a bit bogged down in working out the exact species of OR expansion the optimizer is performing, and why.

I'm going to keep looking, but thought I'd share my (small) progress.

Patrick    

drop table l purge;

create table l nologging
as
select rownum id1,
       rownum id2
from dual
connect by level <= 10;

alter table l add constraint l_id1 primary key(id1);

exec dbms_stats.gather_table_stats(null, 'l');

drop table t1 purge;

create table t1 nologging
as
select ROWNUM id1,
       MOD(ROWNUM, 5) t1f
from dual connect by rownum <= 10;

create index t1_t1f_id1 ON t1 (t1f, id1);

exec dbms_stats.gather_table_stats(null, 't1');

drop table t2 purge;

create table t2 nologging
as
select ROWNUM id2,
       MOD(ROWNUM, 8) t2f1,
       MOD(ROWNUM, 8) t2f2
from dual
connect by level <= 10000;

alter table t2 add constraint t2_id2 primary key (id2);

create index t2_t2f1_t2f2_id2 on t2(t2f1, t2f2, id2);

exec dbms_stats.gather_table_stats(null, 't2')

explain plan for
select /*+ OPT_PARAM('_optimizer_cbqt_or_expansion' 'off') */ null
  from t1
  join l on l.id1 = t1.id1
  join t2 on t2.id2 = l.id2
  where ( ( t1.t1f = 0 AND t2.t2f1 BETWEEN 9 AND 10) OR
          ( t1.t1f = 1 AND t2.t2f1 = 14 AND t2.t2f2 = 13 ) );

select * from dbms_xplan.display();


----------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |                  |     3 |    66 |    10   (0)| 00:00:01 |
|   1 |  CONCATENATION                  |                  |       |       |            |          |
|   2 |   NESTED LOOPS                  |                  |     1 |    22 |     4   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                 |                  |     1 |    22 |     4   (0)| 00:00:01 |
|   4 |     NESTED LOOPS                |                  |     1 |    16 |     3   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN           | T2_T2F1_T2F2_ID2 |     1 |    10 |     2   (0)| 00:00:01 |
|   6 |      INLIST ITERATOR            |                  |       |       |            |          |
|*  7 |       INDEX RANGE SCAN          | T1_T1F_ID1       |     1 |     6 |     1   (0)| 00:00:01 |
|*  8 |     INDEX UNIQUE SCAN           | L_ID1            |     1 |       |     0   (0)| 00:00:01 |
|*  9 |    TABLE ACCESS BY INDEX ROWID  | L                |     1 |     6 |     1   (0)| 00:00:01 |
|  10 |   NESTED LOOPS                  |                  |     1 |    22 |     3   (0)| 00:00:01 |
|  11 |    NESTED LOOPS                 |                  |     1 |    12 |     2   (0)| 00:00:01 |
|* 12 |     INDEX RANGE SCAN            | T1_T1F_ID1       |     1 |     6 |     1   (0)| 00:00:01 |
|  13 |     TABLE ACCESS BY INDEX ROWID | L                |     1 |     6 |     1   (0)| 00:00:01 |
|* 14 |      INDEX UNIQUE SCAN          | L_ID1            |     1 |       |     0   (0)| 00:00:01 |
|* 15 |    INDEX RANGE SCAN             | T2_T2F1_T2F2_ID2 |     1 |    10 |     1   (0)| 00:00:01 |
|  16 |   NESTED LOOPS                  |                  |     1 |    22 |     3   (0)| 00:00:01 |
|  17 |    NESTED LOOPS                 |                  |     1 |    22 |     3   (0)| 00:00:01 |
|  18 |     NESTED LOOPS                |                  |     1 |    12 |     2   (0)| 00:00:01 |
|* 19 |      INDEX RANGE SCAN           | T1_T1F_ID1       |     1 |     6 |     1   (0)| 00:00:01 |
|  20 |      TABLE ACCESS BY INDEX ROWID| L                |     1 |     6 |     1   (0)| 00:00:01 |
|* 21 |       INDEX UNIQUE SCAN         | L_ID1            |     1 |       |     0   (0)| 00:00:01 |
|* 22 |     INDEX UNIQUE SCAN           | T2_ID2           |     1 |       |     0   (0)| 00:00:01 |
|* 23 |    TABLE ACCESS BY INDEX ROWID  | T2               |     1 |    10 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("T2"."T2F1"=14 AND "T2"."T2F2"=13)
   7 - access("T1"."T1F"=0 OR "T1"."T1F"=1)
       filter("T1"."T1F"=0 AND "T2"."T2F1">=9 AND "T2"."T2F1"<=10 OR
              "T1"."T1F"=1 AND "T2"."T2F1"=14 AND "T2"."T2F2"=13)
   8 - access("L"."ID1"="T1"."ID1")
   9 - filter("T2"."ID2"="L"."ID2")
  12 - access("T1"."T1F"=1)
       filter("T1"."T1F"=0 OR "T1"."T1F"=1)
  14 - access("L"."ID1"="T1"."ID1")
  15 - access("T2"."T2F1"=14 AND "T2"."T2F2"=13 AND "T2"."ID2"="L"."ID2")
       filter("T2"."T2F1">=9 AND "T2"."T2F1"<=10 AND
              (LNNVL("T2"."T2F1"=14) OR LNNVL("T2"."T2F2"=13)))
  19 - access("T1"."T1F"=0)
       filter("T1"."T1F"=0 OR "T1"."T1F"=1)
  21 - access("L"."ID1"="T1"."ID1")
  22 - access("T2"."ID2"="L"."ID2")
  23 - filter("T2"."T2F1">=9 AND "T2"."T2F1">=9 AND "T2"."T2F1"<=10 AND "T2"."T2F1"<=10 AND
              (LNNVL("T1"."T1F"=1) OR LNNVL("T2"."T2F1"=14) OR LNNVL("T2"."T2F2"=13)) AND
              (LNNVL("T2"."T2F1"=14) OR LNNVL("T2"."T2F2"=13)))
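The LNNVL calls that pepper the later concatenation branches implement the branch-deduplication rule: LNNVL(cond) is TRUE when cond is FALSE or UNKNOWN, and FALSE when cond is TRUE, so a row that satisfied an earlier branch's predicate is rejected by the later branches. A hypothetical sketch of that three-valued logic (not Oracle code, and with None standing in for SQL's UNKNOWN):

```python
# Hypothetical sketch of LNNVL's three-valued logic: LNNVL(cond) is TRUE when
# cond evaluates to FALSE or UNKNOWN (NULL), and FALSE when cond is TRUE.
# Concatenation uses it to stop later branches from returning rows an earlier
# branch already produced.

UNKNOWN = None  # stand-in for SQL's UNKNOWN truth value

def lnnvl(cond):
    """cond: True, False, or UNKNOWN (None)."""
    return cond is False or cond is UNKNOWN

# A row with t2f1 = 14, t2f2 = 13 matched the first branch, so the later
# branch's filter LNNVL(t2f1=14) OR LNNVL(t2f2=13) rejects it:
t2f1, t2f2 = 14, 13
print(lnnvl(t2f1 == 14) or lnnvl(t2f2 == 13))  # False -> row filtered out

# A row with t2f1 = 9 never matched the first branch, so it passes:
t2f1, t2f2 = 9, 5
print(lnnvl(t2f1 == 14) or lnnvl(t2f2 == 13))  # True -> row kept
```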

 

On Sat, 30 Jan 2021 at 18:52, Patrick Jolliffe <jolliffe_at_gmail.com> wrote:

I know I'm going to have to delve back into the 10053 trace at some point, but keep postponing that by working on test-case.

Below is progress so far if anyone is interested.

Continuing tomorrow

Patrick    

drop table l purge;

create table l nologging
as
with generator as (
        select --+ materialize
               rownum id
        from dual
        connect by
               level <= 1e4
)
select ROWNUM id,
       to_char(MOD(ROWNUM, 10000), '99999999') id2
from
       generator g1,
       generator g2
where rownum <= 400000;

alter table l add constraint l_pk primary key(id);

exec dbms_stats.gather_table_stats(null, 'l');

drop table t1 purge;

create table t1 nologging
as
with generator as (
        select --+ materialize
               rownum id
        from dual
        connect by
               level <= 1e4
)
select
       to_char(MOD(ROWNUM, 10000000), '99999999') c1,
       MOD(ROWNUM, 1000000) l_id
from
       generator v1,
       generator v2
where
       rownum <= 16000000;

CREATE INDEX t1_I ON t1 (c1, l_id);

exec dbms_stats.gather_table_stats(null, 't1');

drop table t2 purge;

create table t2 nologging
as
with generator as (
        select --+ materialize
               rownum id
        from dual
        connect by
               level <= 1e4
)
select CAST(ROWNUM AS CHAR(16)) id,
       to_char(MOD(ROWNUM, 1000), '9999') m1000,
       to_char(MOD(ROWNUM, 2000), '9999') m2000
from
       generator v1,
       generator v2
where rownum <= 1000000;

alter table t2 add constraint t2_pk primary key (id);

create index t2_i on t2(m1000, m2000, id);

exec dbms_stats.gather_table_stats(null, 't2')

var v1 char(30)
var v2 char(30)

exec :v1 := 'A';
exec :v2 := 'A';

explain plan for
SELECT /*+ OPT_PARAM('_optimizer_cbqt_or_expansion' 'off') */ null
FROM l, t1, t2
WHERE ( ( t1.c1 = :v1 AND t2.m1000 BETWEEN 'X' AND 'Y') OR
        ( t1.c1 = :v2 AND t2.m1000 = 'Z' AND t2.m2000 = 'A' ) ) AND
      ( l.id = t1.l_id AND l.id2 = t2.id );

select * from dbms_xplan.display();

On Thu, 28 Jan 2021 at 15:00, Patrick Jolliffe <jolliffe_at_gmail.com> wrote:

You are correct in that it has nothing to do with NCHAR/NVARCHAR2; replacing those in the testcase still results in the 'crazy' plan.  

On Thu, 28 Jan 2021 at 14:50, Patrick Jolliffe <jolliffe_at_gmail.com> wrote:

My bad, you need to stick the following lines at the beginning:  

var nc1 nchar(30)

var nc2 nchar(30)  

exec :nc1 := 'A';

exec :nc2 := 'A';    

On Thu, 28 Jan 2021 at 14:41, Patrick Jolliffe <jolliffe_at_gmail.com> wrote:

Thanks Jonathan,

(Hope I have) managed to reproduce it from scratch on a pristine 19.3 environment (the 'standard' Oracle docker image in this case).

Not hugely pleased about the SET_TABLE_STATS step, but we do what we can.

SQL for testcase and my output attached.

Going to keep on digging, just glad to have it confirmed that the plan is indeed crazy and it's not (only) my ability to understand it lacking.

Best Regards

Patrick  

On Thu, 28 Jan 2021 at 14:09, Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:  

I should have said that in my model both the varchar2() and nvarchar2() plans produced exactly the same 4-branch concatenation.

The indications were then that the generation of the branch code is purely formulaic and "unthinking" because in my case two of the branches had an access predicate of

    access("T1"."NV1"=U'0050')  

followed in one case by the silly filter predicate of

    filter(("T1"."NV1"<=U'0200' AND "T1"."NV1">=U'0100')  

and in the other case by the even sillier filter predicate:

    filter(("T1"."NV1"<=U'0200' AND "T1"."NV1">=U'0100') AND LNNVL("T1"."NV1"=U'0050')))  

Really the optimizer should have spotted the contradiction and filtered out these two branches.  
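The contradiction in question is mechanical to detect: an equality access predicate col = v combined with a filter range lo <= col <= hi is unsatisfiable whenever v lies outside [lo, hi]. A hypothetical sketch of such a check (Python string comparison mirrors the character-ordering comparison the predicates use; this is an illustration, not the optimizer's actual logic):

```python
# Hypothetical sketch of the contradiction check the optimizer could apply:
# a branch whose access predicate pins col to a single value that falls
# outside the branch's filter range can never return a row, so the whole
# branch could be pruned at optimization time.

def branch_is_contradictory(eq_value, range_lo, range_hi):
    """True if col = eq_value can never satisfy range_lo <= col <= range_hi."""
    return not (range_lo <= eq_value <= range_hi)

# access("NV1"='0050') with filter('0100' <= "NV1" <= '0200') -> dead branch
print(branch_is_contradictory('0050', '0100', '0200'))  # True
# access("NV1"='0150') would be consistent with the same filter
print(branch_is_contradictory('0150', '0100', '0200'))  # False
```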

This was all running 19.3.  

Regards

Jonathan Lewis      

On Wed, 27 Jan 2021 at 12:32, Patrick Jolliffe <jolliffe_at_gmail.com> wrote:

I *think* it's related to the national character set (NCHAR/NVARCHAR2). As shown below (I hope), the current testcase does not reproduce if I switch to CHAR/VARCHAR2.

Investigation continues; trying to build a complete reproducible testcase from scratch.    

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Feb 17 2021 - 17:41:58 CET
