Re: Why the "where NOT EXISTS" does not work?

From: Thomas Kyte <tkyte_at_us.oracle.com>
Date: 26 Oct 2001 14:15:10 -0700
Message-ID: <9rcjou0233f@drn.newsguy.com>

In article <ttjdh7il3c9tc9_at_corp.supernews.com>, "Sybrand says...
>
>
>"Roy Sette" <kl7_at_rahul.net> wrote in message
>news:9rc8md$o4u$1_at_samba.rahul.net...
>> Our distinguished colleague, "C Chang" scribed:
>>
>> >Which Proccess is more efficient?
>>
>> >select a from A_Table
>> >minus
>> >select a from B_table
>>
>> >or
>>
>> >select a from test1
>> >where not exists
>> >(select 1 from test2
>> > where test1.a = test2.b)
>>
>>
>> The second construction is a correlated subquery, which I've
>> been told is inefficient, owing to exhaustive comparisons.
>> Does the 'minus' set operator do any optimization? - not sure.
>>
>>
>> --
>> Roy
>> kl7_at_rahul.net (USA)
>
>
>The MINUS involves a sort merge operation
>
>The general scenario for 1 is
>perform a full table scan on table_a using db_file_multiblock_read_count (so
>read 64k at a time in a well-tuned database)
>do the same for table b
>sort merge them removing duplicates
>
>The second scenario performs lookups of individual keys, one block at a
>time.
>The I/O for 1 will be way lower than the I/O for 2.
>
>Hth,
>Sybrand Bakker
>Senior Oracle DBA
>
>
>

As always however -- it depends. The only absolute I know of is "there are no absolutes" (caveated with "maybe")

It'll depend on

size of test1 (after other predicates if any have been applied) size of test2
nullality of the column being compared
and so on....

for example:

ops$tkyte_at_ORA717DEV.US.ORACLE.COM> create table test1 2 as
3 select * from all_objects where rownum <= 100;

Table created.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM>
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> create table test2 2 as
3 select * from all_objects;

Table created.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM>
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> create index test1_idx on test1(timestamp);

Index created.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM> create index test2_idx on test2(timestamp);

Index created.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM> 
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> 
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> analyze table test1

  2 compute statistics
  3 for table
  4 for all indexes
  5 for all indexed columns;

Table analyzed.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM>
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> analyze table test2   2 compute statistics
  3 for table
  4 for all indexes
  5 for all indexed columns;

Table analyzed.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM> 
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> 
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> set autotrace traceonly
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> 
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> select timestamp from test1

  2 minus
  3 select timestamp from test2
  4 /

no rows selected

Execution Plan

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=80 Card=17082 Bytes=324558) 1 0 MINUS

   2    1     SORT (UNIQUE) (Cost=3 Card=100 Bytes=1900)
   3    2       TABLE ACCESS (FULL) OF 'TEST1' (Cost=1 Card=100 Bytes=1900)
   4    1     SORT (UNIQUE) (Cost=77 Card=16982 Bytes=322658)
   5    4       TABLE ACCESS (FULL) OF 'TEST2' (Cost=7 Card=16982 Bytes=322658)

Statistics

          0  recursive calls
         27  db block gets
        243  consistent gets
          0  physical reads
          0  redo size
        217  bytes sent via SQL*Net to client
        319  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
          0  rows processed

ops$tkyte_at_ORA717DEV.US.ORACLE.COM>
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> select timestamp from test1 2 where not exists ( select 1 from test2 where test2.timestamp = test1.timestamp )
3 /

no rows selected

Execution Plan

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=5 Bytes=95) 1 0 FILTER

   2    1     TABLE ACCESS (FULL) OF 'TEST1' (Cost=1 Card=5 Bytes=95)
3    1     INDEX (RANGE SCAN) OF 'TEST2_IDX' (NON-UNIQUE) (Cost=1 Card=50

Bytes=950)

Statistics

          0  recursive calls
         12  db block gets
        112  consistent gets
          0  physical reads
          0  redo size
        217  bytes sent via SQL*Net to client
        319  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          0  rows processed

Here because test1 is small and timestamp is NULLABLE (hence the minus cannot use the indexes) the where not exists is "better".

In all -- it depends, not exists can toast minus, minus can kill not exists. Just like "where x in versus where exists". Both have their uses, both can beat eachother under different circumstances.

It is interesting to NOTE that where not exists and minus are NOT substitutable for eachother in general:

ops$tkyte_at_ORA717DEV.US.ORACLE.COM> create table t1 ( x int );

Table created.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM> create table t2 ( x int );

Table created.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM>
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> insert into t1 values ( 1 );

1 row created.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM> insert into t1 values ( 1 );

1 row created.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM> insert into t1 values ( 2 );

1 row created.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM> insert into t2 values ( 2 );

1 row created.

ops$tkyte_at_ORA717DEV.US.ORACLE.COM>
ops$tkyte_at_ORA717DEV.US.ORACLE.COM> select * from t1 minus select * from t2;

ops$tkyte_at_ORA717DEV.US.ORACLE.COM> select * from t1 where not exists ( select null from t2 where t1.x = t2.x );

         1
         1

they give different answers! watch out for that.

--
Thomas Kyte (tkyte@us.oracle.com)             http://asktom.oracle.com/ 
Expert one on one Oracle, programming techniques and solutions for Oracle.
http://www.amazon.com/exec/obidos/ASIN/1861004826/  
Opinions are mine and do not necessarily reflect those of Oracle Corp

Received on Fri Oct 26 2001 - 16:15:10 CDT