Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> RE: IN vs. EXISTS

RE: IN vs. EXISTS

From: Larry Elkins <elkinsl_at_flash.net>
Date: Sun, 20 Oct 2002 06:03:36 -0800
Message-ID: <F001.004EE53D.20021020060336@fatcity.com>


FWIW, in 9i the line blurs between IN and EXISTS (and NOT IN / NOT EXISTS), with the CBO deciding on a correlated or non-correlated approach regardless of whether you used IN or EXISTS. Following are 2 pairs of examples.

The first two queries are the "same", one using EXISTS, one using IN. With no restrictive criteria, the CBO chooses a hash approach for *both* the queries, exactly what I want.

The third and fourth queries are also the "same", one using EXISTS, one using IN. And there is very restrictive criteria on CODE_MASTER. Note that *both* queries utilize the same approach (nested loop semi). Once again, the plan I want, and it didn't matter if I used IN or EXISTS.

select *
from code_master cm
where code in (select code

               from code_detail)

SELECT STATEMENT Optimizer=CHOOSE (Cost=770 Card=99900 Bytes=1498500)   HASH JOIN (SEMI) (Cost=770 Card=99900 Bytes=1498500)     TABLE ACCESS (FULL) OF CODE_MASTER (Cost=77 Card=100000 Bytes=1100000)     INDEX (FAST FULL SCAN) OF CD_CODE_IDX (NON-UNIQUE) (Cost=208 Card=299600 Bytes=1198400)

select *
from code_master cm
where EXISTS (select null

              from   code_detail cd
              where  cd.code = cm.code)

SELECT STATEMENT Optimizer=CHOOSE (Cost=770 Card=99900 Bytes=1498500)   HASH JOIN (SEMI) (Cost=770 Card=99900 Bytes=1498500)     TABLE ACCESS (FULL) OF CODE_MASTER (Cost=77 Card=100000 Bytes=1100000)     INDEX (FAST FULL SCAN) OF CD_CODE_IDX (NON-UNIQUE) (Cost=208 Card=299600 Bytes=1198400)

select *
from code_master cm
where code in (select code

               from code_detail)
and foo_date = trunc(sysdate)-365

SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=15)   NESTED LOOPS (SEMI) (Cost=6 Card=1 Bytes=15)     TABLE ACCESS (BY INDEX ROWID) OF CODE_MASTER (Cost=4 Card=1 Bytes=11)       INDEX (RANGE SCAN) OF CM_FD_IDX (NON-UNIQUE) (Cost=3 Card=1)     INDEX (RANGE SCAN) OF CD_CODE_IDX (NON-UNIQUE) (Cost=2 Card=299600 Bytes=1198400)

select *
from code_master cm
where EXISTS (select null

              from   code_detail cd
              where cd.code = cm.code)

and foo_date = trunc(sysdate)-365

SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=15)   NESTED LOOPS (SEMI) (Cost=6 Card=1 Bytes=15)     TABLE ACCESS (BY INDEX ROWID) OF CODE_MASTER (Cost=4 Card=1 Bytes=11)       INDEX (RANGE SCAN) OF CM_FD_IDX (NON-UNIQUE) (Cost=3 Card=1)     INDEX (RANGE SCAN) OF CD_CODE_IDX (NON-UNIQUE) (Cost=2 Card=299600 Bytes=1198400)

And a quick word on NOT IN. The drawbacks people talk about were prior to Oracle 7.1. Starting with 7.1, both NOT IN and NOT EXISTS were handled in a correlated fashion, and indexes (if available) were used for the NOT IN. There is a minor difference in the ways nulls are handled. Starting with 7.3 when we got HASH joins, we also got HASH Anti Join capabilities. And if one were to set ALWAYS_ANTI_JOIN = HASH, you would still get a correlated approach with NOT EXISTS, and a HASH approach with NOT IN (assuming the criteria for being able to use a hash aj were met). Refer to Metalink Note: 28934.1 for more details.

And they each have their place. I've fixed many problem queries that used NOT EXISTS by using NOT IN instead and having the ALWAYS_ANTI_JOIN = HASH (or hash_aj hint if not set). By the same token, I've fixed many NOT IN's by going to a NOT EXISTS when a correlated approach was more appropriate. I have examples of each. And just like above with EXISTS and IN, the line blurs in 9i, with the CBO deciding to correlate or hash/merge regardless of whether you use NOT IN or NOT EXISTS. Also note that the always_semi_join and always_anti_join parameters became "undocumented" parameters in 9i -- now they are _always_semi_join and _always_anti_join.

Regards,

Larry G. Elkins
elkinsl_at_flash.net
214.954.1781

> -----Original Message-----
> From: root_at_fatcity.com [mailto:root_at_fatcity.com]On Behalf Of
> Rick_Cale_at_teamhealth.com
> Sent: Saturday, October 19, 2002 10:19 AM
> To: Multiple recipients of list ORACLE-L
> Subject: RE: IN vs. EXISTS
>
>
>
> Dennis,
> Thanks for the reply and congratulations but a little late. I have benn
> well above that "notch" for many years. Testing a single test case is not
> always conclusive. Certainly combing through countless documentation can
> prove to be somewhat futile. It took me quite a while to find what I did.
> The use of IN vs EXISTS should be very conclusive. Since we have so many
> "experts" on this list I was hoping to rely on that expertise/experience.
> Sometimes it's Ok to just get opinions which are hopefully based on fact.
> In this case I have gotten about a 50-50 split on best approach.
>
> Thanks for your comments.
> Rick

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Larry Elkins
  INET: elkinsl_at_flash.net

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
Received on Sun Oct 20 2002 - 09:03:36 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US