Re: not in VS not exists

From: R.W. Fairbairn <richard.fairbairn_at_tesco.net>
Date: Sun, 14 Nov 1999 18:03:25 -0000
Message-ID: <80mtjp$38q$1@epos.tesco.net>

Many thanks for this comprehensive response. It is very much appreciated.

Thomas Kyte wrote in message <6ccuOHl1o2azn3QBxxLasW4Ibdiz_at_4ax.com>...
>A copy of this was sent to "R.W. Fairbairn" <richard.fairbairn_at_tesco.net>
>(if that email address didn't require changing)
>On Sun, 14 Nov 1999 10:27:00 -0000, you wrote:
>
>>What is the difference between these two?
>>
>
>
>Logically, they operate differently. consider:
>
>scott_at_8.0> create table t1 ( x int );
>Table created.
>
>scott_at_8.0> create table t2 ( y int );
>Table created.
>
>scott_at_8.0> insert into t1 values ( 1 );
>scott_at_8.0> insert into t2 values ( 2 );
>scott_at_8.0> insert into t2 values ( null );
>
>scott_at_8.0> select * from t1 where x not in ( select y from t2 );
>no rows selected
>
>scott_at_8.0> select * from t1 where not exists ( select 1 from t2 where
y=x );
>
> X
>----------
> 1
>
>
>The first one returns zero rows always (due to the inclusion of a NULL in
the
>subquery). Tri-valued logic returns "unknown" when it asks "is x not in
NULL
>)". therefore, no rows are returned.
>
>In the second query -- it is asking a much more specific question -- Return
a
>row such that there exists NO exact match in the other table.
>
>
>So, logically they are very different. Physically they are different as
well
>(how they get processed). Typically a NOT IN is very expensive to process
as
>the subquery tends to get evaluated once per row from the outer query (with
RBO
>this is the case, with the CBO -- it might not be but usually is). The
entire
>set of values to 'not be in' is generated. You can see this with a query
like:
>
>tkyte_at_8.0> create table emp as select * from scott.emp;
>Table created.
>
>tkyte_at_8.0> create index emp_sal_idx on emp(sal);
>Index created.
>
>tkyte_at_8.0> set autotrace on
>
>tkyte_at_8.0> select * from emp where sal not in ( select sal from emp );
>no rows selected
>
>
>Execution Plan
>----------------------------------------------------------
> 0 SELECT STATEMENT Optimizer=CHOOSE
> 1 0 FILTER
> 2 1 TABLE ACCESS (FULL) OF 'EMP'
> 3 1 TABLE ACCESS (FULL) OF 'EMP'
>
>
>Statistics
>----------------------------------------------------------
> 0 recursive calls
> 45 db block gets
> 15 consistent gets
>
>tkyte_at_8.0> select * from emp where not exists ( select 1 from emp b where
b.sal
>= emp.sal );
>
>no rows selected
>
>
>Execution Plan
>----------------------------------------------------------
> 0 SELECT STATEMENT Optimizer=CHOOSE
> 1 0 FILTER
> 2 1 TABLE ACCESS (FULL) OF 'EMP'
> 3 1 INDEX (RANGE SCAN) OF 'EMP_SAL_IDX' (NON-UNIQUE)
>
>
>Statistics
>----------------------------------------------------------
> 0 recursive calls
> 3 db block gets
> 3 consistent gets
>
>
>The not exists -- using the index -- did less work here (3 db block gets vs
45).
>On a large subquery, or a subquery that is expensive to compute, this can
be a
>very large performance impact.
>
>If the logic permits it, i prefer to use NOT EXISTS when possible. Another
>alternative is an outer join. Using t1 and t2 again from above with
different
>data we see:
>
>
>tkyte_at_8.0> insert into t1 values (1);
>tkyte_at_8.0> insert into t1 values (2);
>tkyte_at_8.0> insert into t2 values (2);
>tkyte_at_8.0> insert into t2 values (null);
>
>
>tkyte_at_8.0> select t1.* from t1, t2 where t1.x = t2.y(+) and t2.y is null;
>
> X
>----------
> 1
>
>tkyte_at_8.0> select * from t1 where x not in ( select y from t2 );
>
>no rows selected
>
>tkyte_at_8.0> select * from t1 where not exists ( select 1 from t2 where
y=x );
>
> X
>----------
> 1
>
>
>An outer join works much like a NOT EXISTS logically. The NOT EXISTS may
be
>more performant however since it stops evaluating the subquery as soon as
it
>finds a row that does match whereas the join will evaulate all rows and
then
>throw out matches.
>
>--
>See http://osi.oracle.com/~tkyte/ for my columns 'Digging-in to
Oracle8i'...
>Current article is "Part I of V, Autonomous Transactions" updated June
21'st
>
>Thomas Kyte tkyte_at_us.oracle.com
>Oracle Service Industries Reston, VA USA
>
>Opinions are mine and do not necessarily reflect those of Oracle
Corporation Received on Sun Nov 14 1999 - 12:03:25 CST