Re: Why does NOT IT take so long

From: Pedro Bandeira <pbandei>
Date: 1995/07/29
Message-ID: <3vdqsq$4q_at_midgard.calvacom.fr>#1/1


"C.J. Jardine" <cj10_at_cam.ac.uk> wrote:
>Rob Verlander <robv_at_accessone.com> wrote:
>>tconder_at_csc.com (Tom Conder) wrote:
>>>When writing Oracle SQL, why does the NOT IN statement take so long?
>>
>>A NOT IN clause will always result in a full table scan and any indexes
>>that you may have will be IGNORED.
>
>I agree that Oracle always seems to ignore indexes when evaluating
>NOT IN clauses. I am not convinced that this is necessary.
>
>Consider the following two queries:
>
>1. SELECT x FROM t1 WHERE x NOT IN (SELECT y FROM t2)
>
>2. SELECT x FROM t1 WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE y = x )
>
>If t2 is indexed by y, query 2 will use this index, whereas query 1
>will not. It will make repeated full scans over t2.We all learn,
>after a bit, never to write query 1.
>

Right. The NOT EXISTS version should usually be preferred, and huge performance gains can be obtained.

>In fact the two queries do not mean the same. The difference is
>to do with NULLs. Query 1 is in fact the same as:
>
>3. SELECT x FROM t1 WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE y = x )
> AND NOT EXISTS (SELECT 1 FROM t2 WHERE y IS NULL)
>
>I suspect the reason why query 1 never uses the index is that NULLs
>are not represented in the index.
>

Exactly. The operator NOT IN is data dependant because

  1. NOT IN is interpreted as != ALL
  2. just as a value can never be equal to null, so a value can never not be equal to null => the presence of null columns in a NOT IN (subquery ) prevents any rows being returned.

>My gripe is about the case when there is a NOT NULL constraint on
>column y. In this case, queries 1 and 2 are equivalent. A particularly
>common example is when y is a primary key.
>
>Even if there isn't a NOT NULL constraint, re-writing 1 as 3 makes it
>go faster (try it). The first sub-query can be evaluated using the index,
>and the second is not correlated. It is evaluated by a single full scan.
>
>My conclusion is that the slowness of NOT IN is not necessary. It is
>caused by incompetent optimisation.
>
>Would anyone from Oracle care to comment?

The Oracle optimizer of a (near) future version will handle this automatically.

>
>Charles Jardine, Univ of Cambridge UK
>

Regards

     Pedro Received on Sat Jul 29 1995 - 00:00:00 CEST

Original text of this message