Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.misc -> Re: NOT IN/Nulls/ NOT EXISTS

Re: NOT IN/Nulls/ NOT EXISTS

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: Sat, 07 Jul 2007 19:53:50 -0700
Message-ID: <1183863230.670483.50930@q75g2000hsh.googlegroups.com>


On Jul 6, 4:07 am, colmkav <colmj..._at_yahoo.co.uk> wrote:

> which would you normally expect to work quicker and why? ie how does
> the using of NOT EXISTS compare with the using of Nulls
>
> 1)
> select count(*) from (
> SELECT distinct sourceid, strategy, strategyname, area, product
>       FROM t_bookmap
>       WHERE SourceId = 1 AND not exists(select strategy from
> ta_strategy where SourceId = 1 AND TA_STRATEGY.strategy =
> T_BOOKMAP.strategy)
> )
>
> 2)
> select count (*) from (
> SELECT distinct T.sourceid, T.strategy, T.strategyname, T.area,
> T.product
>       FROM t_bookmap T, (select strategy from ta_strategy where
> sourceid = 1) TA
>       WHERE T.SourceId = 1 AND TA.strategy(+) = T.strategy AND
> TA.strategy is NULL
> )

I believe that I answered this question in a different response to you, but did not provide a test case for verification: http://groups.google.com/group/comp.databases.oracle.misc/browse_thread/thread/0bf1746f44d29ac0

Scott offered a good suggestion - test the performance, ideally with a representative data set.

The answer will depend in part on the Oracle version. You will typically find that Oracle is able to transform the SQL statement into a more efficient form. The tests below were performed on Oracle 10.2.0.3

Let's set up two tables with known data, one with 100,000 rows and one with 50,000 rows:
CREATE TABLE T5 (

  MY_DATE DATE NOT NULL,
  MY_NUMBER NUMBER(12,10) NOT NULL,
  MY_ROW NUMBER(12) NOT NULL);

CREATE TABLE T6 (

  MY_DATE DATE NOT NULL,
  MY_NUMBER NUMBER(12,10) NOT NULL,
  MY_ROW NUMBER(12) NOT NULL);

INSERT INTO
  T5
SELECT
  TRUNC(SYSDATE)+ROWNUM,
  COS(ROWNUM/180*3.141592),
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=100000;

COMMIT; INSERT INTO
  T6
SELECT
  TRUNC(SYSDATE)+ROWNUM*2,
  COS(ROWNUM/180*3.141592),
  ROWNUM
FROM
  DUAL
CONNECT BY
  LEVEL<=50000;

COMMIT; Now, let's look at the first of the options, using NOT IN syntax that I cautioned you against using in the other thread. For the first run, let's now allow Oracle to transform the SQL statement into another form:

SELECT /*+ NO_QUERY_TRANSFORMATION */

  T5.MY_DATE,
  T5.MY_NUMBER,
  T5.MY_ROW

FROM
  T5
WHERE
  T5.MY_DATE NOT IN (
    SELECT DISTINCT
      T6.MY_DATE
    FROM
      T6);

The DBMS Xplan looks like this:



| Id | Operation | Name | Starts | E-Rows | A-Rows | A- Time | Buffers |
|*  1 |  FILTER            |      |      1 |        |  50000 |
00:05:45.70 |      14M|
|   2 |   TABLE ACCESS FULL| T5   |      1 |    104K|    100K|
00:00:00.30 |     872 |
|*  3 |   TABLE ACCESS FULL| T6   |    100K|    543 |  50000 |
00:05:44.64 |      14M|

-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   1 - filter( IS NULL)
   3 - filter("T6"."MY_DATE"=:B1)

The above query required 5 minutes and 45 seconds to execute, which is how the SQL statement would likely execute on Oracle 8i. The Starts value of 100K indicates that the table T6 was full tablescanned 100,000 times - once per row in the table T5. 5 minutes and 45 seconds is a long execution time for this SQL statement.

Let's try again, allowing Oracle to transform the SQL statement: SELECT

  T5.MY_DATE,
  T5.MY_NUMBER,
  T5.MY_ROW

FROM
  T5
WHERE
  T5.MY_DATE NOT IN (
    SELECT DISTINCT
      T6.MY_DATE
    FROM
      T6);

The DBMS Xplan looks like this:



| Id | Operation | Name | Starts | E-Rows | A-Rows | A- Time | Buffers | OMem | 1Mem | Used-Mem |

|* 1 | HASH JOIN RIGHT ANTI| | 1 | 49944 | 50000 | 00:00:01.43 | 1075 | 1821K| 1267K| 2202K (0)|
|   2 |   TABLE ACCESS FULL  | T6   |      1 |  54261 |  50000 |
00:00:00.10 |     203 |       |       |          |
|   3 |   TABLE ACCESS FULL  | T5   |      1 |    104K|    100K|
00:00:00.20 |     872 |       |       |          |

------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   1 - access("T5"."MY_DATE"="T6"."MY_DATE")

Note the change in the plan. Oracle transformed the SQL statement into an anti-join which completed in 1.43 seconds. Remember that plan, we might see it again.

Let's see what happens with an outer join as I recommended in the other thread. First, we will not allow Oracle to transform the SQL statement:
SELECT /*+ NO_QUERY_TRANSFORMATION */

  T5.MY_DATE,
  T5.MY_NUMBER,
  T5.MY_ROW

FROM
  T5,
  T6
WHERE
  T5.MY_DATE=T6.MY_DATE(+)
  AND T6.MY_DATE IS NULL; The DBMS Xplan looks like this:

| Id | Operation | Name | Starts | E-Rows | A-Rows | A- Time | Buffers | OMem | 1Mem | Used-Mem |
|*  1 |  FILTER                |      |      1 |        |  50000 |
00:00:01.83 |    1075 |       |       |          |
|*  2 |   HASH JOIN RIGHT OUTER|      |      1 |    100K|    100K|
00:00:01.53 |    1075 |  1821K|  1267K| 2242K (0)|
|   3 |    TABLE ACCESS FULL   | T6   |      1 |  50000 |  50000 |
00:00:00.10 |     203 |       |       |          |
|   4 |    TABLE ACCESS FULL   | T5   |      1 |    100K|    100K|
00:00:00.20 |     872 |       |       |          |

--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   1 - filter("T6"."MY_DATE" IS NULL)
   2 - access("T5"."MY_DATE"="T6"."MY_DATE")

This query required 1.83 seconds, which is a fraction of a second longer than the previous version that Oracle transformed, much better than the 5 minutes and 45 seconds of the first run.

This time, let's let Oracle transform the outer join: SELECT

  T5.MY_DATE,
  T5.MY_NUMBER,
  T5.MY_ROW

FROM
  T5,
  T6
WHERE
  T5.MY_DATE=T6.MY_DATE(+)
  AND T6.MY_DATE IS NULL; The DBMS Xplan looks like this:

| Id | Operation | Name | Starts | E-Rows | A-Rows | A- Time | Buffers | OMem | 1Mem | Used-Mem |

|* 1 | HASH JOIN RIGHT ANTI| | 1 | 50000 | 50000 | 00:00:01.43 | 1075 | 1821K| 1267K| 2194K (0)|
|   2 |   TABLE ACCESS FULL  | T6   |      1 |  50000 |  50000 |
00:00:00.10 |     203 |       |       |          |
|   3 |   TABLE ACCESS FULL  | T5   |      1 |    100K|    100K|
00:00:00.20 |     872 |       |       |          |

------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   1 - access("T5"."MY_DATE"="T6"."MY_DATE")

The plan and times are identical to the first transformation of the NOT IN syntax.

Let's try the NOT EXISTS syntax, but not let Oracle transform the SQL statement:
SELECT /*+ NO_QUERY_TRANSFORMATION */

  T5.MY_DATE,
  T5.MY_NUMBER,
  T5.MY_ROW

FROM
  T5
WHERE
  NOT EXISTS (
    SELECT
      T6.MY_DATE
    FROM
      T6
    WHERE
      T6.MY_DATE=T5.MY_DATE); The DBMS Xplan looks like this:

| Id | Operation | Name | Starts | E-Rows | A-Rows | A- Time | Buffers |
|*  1 |  FILTER            |      |      1 |        |  50000 |
00:05:43.30 |      14M|
|   2 |   TABLE ACCESS FULL| T5   |      1 |    100K|    100K|
00:00:00.30 |     872 |
|*  3 |   TABLE ACCESS FULL| T6   |    100K|      1 |  50000 |
00:05:42.21 |      14M|

-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   1 - filter( IS NULL)
   3 - filter("T6"."MY_DATE"=:B1)

5 minutes and 43 seconds to execute, which is very close the nontransformed  NOT IN syntax - the plan looks quite similar also.

Let's try the NOT EXISTS syntax, but this time let Oracle transform the SQL statement:
SELECT

  T5.MY_DATE,
  T5.MY_NUMBER,
  T5.MY_ROW

FROM
  T5
WHERE
  NOT EXISTS (
    SELECT
      T6.MY_DATE
    FROM
      T6
    WHERE
      T6.MY_DATE=T5.MY_DATE); The DBMS Xplan looks like this:

| Id | Operation | Name | Starts | E-Rows | A-Rows | A- Time | Buffers | OMem | 1Mem | Used-Mem |

|* 1 | HASH JOIN RIGHT ANTI| | 1 | 50000 | 50000 | 00:00:01.43 | 1075 | 1821K| 1267K| 2194K (0)|
|   2 |   TABLE ACCESS FULL  | T6   |      1 |  50000 |  50000 |
00:00:00.10 |     203 |       |       |          |
|   3 |   TABLE ACCESS FULL  | T5   |      1 |    100K|    100K|
00:00:00.20 |     872 |       |       |          |

------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   1 - access("T6"."MY_DATE"="T5"."MY_DATE")

The plan and times are identical to the first transformation of the NOT IN syntax, and of course the outer join syntax.

In summary, unassisted by Oracle's transformations, the outer join syntax executes in the least amount of time. Allowing Oracle transformations, all methods execute using the same plan.

Now, what if we allow NULL values?
ALTER TABLE T5 MODIFY (MY_DATE NULL);
ALTER TABLE T6 MODIFY (MY_DATE NULL); INSERT INTO
  T5
SELECT
  NULL,
  COS((ROWNUM-1)/180*3.141592),
  ROWNUM-1
FROM
  DUAL
CONNECT BY
  LEVEL<=1;

INSERT INTO
  T6
SELECT
  NULL,
  COS((ROWNUM-1)/180*3.141592),
  ROWNUM-1
FROM
  DUAL
CONNECT BY
  LEVEL<=1;

I will leave this experiment to you, but you will likely find that the transformations available to Oracle have changed. To get you started, here is the DBMS Xplan for the NOT IN syntax, even when allowing Oracle to transform the SQL statement:



| Id | Operation | Name | Starts | E-Rows | A-Rows | A- Time | Buffers |
|*  1 |  FILTER            |      |      1 |        |  50000 |
00:06:33.41 |      14M|
|   2 |   TABLE ACCESS FULL| T5   |      1 |    100K|    100K|
00:00:00.30 |     872 |
|*  3 |   TABLE ACCESS FULL| T6   |    100K|      1 |  50000 |
00:06:32.36 |      14M|

-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   1 - filter( IS NULL)
   3 - filter(LNNVL("T6"."MY_DATE"<>:B1))

6 minutes and 33 seconds, a bit longer than the original 5 minutes and 45 seconds

Here is the plan for the outer join syntax with the table modification that allows NULLS:



| Id | Operation | Name | Starts | E-Rows | A-Rows | A- Time | Buffers | OMem | 1Mem | Used-Mem |

|* 1 | HASH JOIN RIGHT ANTI| | 1 | 50000 | 50000 | 00:00:01.38 | 1075 | 1821K| 1267K| 2198K (0)|
|   2 |   TABLE ACCESS FULL  | T6   |      1 |  50000 |  50000 |
00:00:00.10 |     203 |       |       |          |
|   3 |   TABLE ACCESS FULL  | T5   |      1 |    100K|    100K|
00:00:00.20 |     872 |       |       |          |

------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   1 - access("T5"."MY_DATE"="T6"."MY_DATE")

1.38 seconds.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc. Received on Sat Jul 07 2007 - 21:53:50 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US