Re: More Variations on NOT IN problem

From: Marc de Brouwer <mbrouwer_at_nl.oracle.com>
Date: 1995/08/01
Message-ID: <3vm48n$j9e_at_nlsu110.nl.oracle.com>#1/1


Pedro Bandeira (pbandei) wrote:

: <esanderl_at_explorer.csc.com> wrote:

: >I have more variations on the use of NOT IN and NOT EXISTS;
: >I ran both through the optimizer and got the same result;
: >the first select is extrememly slow but returns the correct answer;
: >the second runs fast but doesn't return any rows:
: >
: >(1):
: >explain plan set statement_id = 'SELECT1'
: >for
: >select distinct b.crr_itn_plnd_glc_cd from carrier_itinerary b
: > where not exists
: > (select distinct a.op_rtg_glc_cd from oplan_force_rqmt_loc a
: > where a.op_pln_id = '120AX' and
: > a.op_rtg_glc_cd = b.crr_itn_plnd_glc_cd
: > )
: > order by 1;
: >
: >(2):
: >explain plan set statement_id = 'SELECT2'
: >for
: >select distinct b.crr_itn_plnd_glc_cd from carrier_itinerary b
: > where b.crr_itn_plnd_glc_cd NOT IN
: > (select distinct a.op_rtg_glc_cd from oplan_force_rqmt_loc a
: > where a.op_pln_id = '120AX'
: > )
: > order by 1;
: >
: >Both return the following from the optimizer:
: >
: >ID OPERATION OPTIONS OBJECT_NAME Parent_ID POSITION
: >-- ------------------ -------- --------------------- ---- -----
: >0 SELECT STATEMENT 11
: >1 SORT UNIQUE 0 1
: >2 FILTER 1 1
: >3 TABLE ACCESS FULL CARRIER_ITINERARY 2 1
: >4 TABLE ACCESS BY ROWID OPLAN_FORCE_RQMT_LOC 2 2
: >5 INDEX RANGE SCAN pk_opfrq_loc 4 1
: >
: >6 rows selected.
: >
: >SQL>
: >
: >
: >Any Ideas????

I'm pretty sure the reason why the second version returns no rows is that the column from the subquery table (oplan_force_rqmt_loc) contains nulls.
A NOT IN with a subquery returning nulls always fails (ie: is a logical NULL) often causing the main query to return NO RECORDS. I've had some lengthy discussions about the subject in this newsgroup on the behaviour of [NOT] EXISTS vs. [NOT] IN, but in the end the way Oracle interprets them is definitely logically sound and consistent.
Just imagine a "x NOT IN (Y)" as a long series of "x <> y1 AND x <> y2 AND x <> y3 AND ... AND x <> yn". Now substitute NULL for one of the y's and work out the result. See what I mean? The entire result becomes NULL.

Hope this sheds some light on the subject. Also, the DISTINCT clause in the subquery is of no use in both examples. The plan however shows that it's not really hurting either. The distinct in the main query is causing the sort operation mentioned in the plan. If you don't really need it, I would suggest removing it for better performance.

There is another option to your select problem, which may be faster or slower, depending on indexing and size of your tables. Here it goes:

select b.crr_itn_plnd_glc_cd
from carrier_itinerary b
minus
select a.op_rtg_glc_cd
from oplan_force_rqmt_loc a
where a.op_pln_id = '120AX'

Just couldn't resist: I love set solutions. Benefit: this one does no loops or joins. No distinct needed either. Drawback: it sorts the first table completely and part of the second one.

Hope I helped.
Greets,
Marc.

Received on Tue Aug 01 1995 - 00:00:00 CEST

Original text of this message