Re: Why does NOT IT take so long
Date: 1995/07/28
Message-ID: <3vaq0m$h9k_at_inet-nntp-gw-1.us.oracle.com>
dharris_at_uk.oracle.com (Dave Harris) wrote:
>snowden_at_haven.ios.com (Roger Snowden) wrote:
>>Jim Smith (Jim_at_jimsmith.demon.co.uk) wrote:
>>: In article <3uu80q$2nj_at_explorer.csc.com> tconder_at_csc.com "Tom Conder" writes:
>>: > When writing Oracle SQL, why does the NOT IN statement take so long?
>>: ... where this_id not in
>>: (select other_id from other table...)
>>I wonder if you could clarify this for me. I have a similar situation
>>where the sub-query takes less than two seconds in sqlplus to get 250
>>rows. The main table has about 100,000 rows. Does the subquery actually
>>execute for each row of the main table? For example:
>>select count(*) from dbown.donor where
>>dbown.donor.donorid not in (
>>select * from dbown.flagview)
>>dbown.flagview only has 250 rows. Does this not get chached and compared
>>to dbown.donor?
>>Roger
>the example:
>select count(*) from dbown.donor where
>dbown.donor.donorid not in (
>select * from dbown.flagview)
>this sql statement will cause 2 full table scans.
>the sql will be executed as follows:
>for every dbown.donor row perform a full table scan on
>dbown.flagview
>
>try using:
>select count(*)
>from dbown.donor d
>where not exists
> (select 1
> from dbown.flagview f
> where d.donorid = f.donorid)
>if the donorid is indexed on the dbown.flagview table then a full
>table scan can be avoided.
>Dave Harris
Perhaps the easiest ways to diagnose these types of issues is to use the TKPROF utility. For example, I created two tables, T1 and T2. Filled T1 with 27 rows and T2 with 1800 rows. Then I ran the following queries in sqlplus after issueing "alter session set sql_trace=true". Once you run tkprof on the trace file, you get a printout that shows the trace output at the bottom of this note.
All three queries I ran are "identical" in that they ask the same question. They are very different performance wise. As a rule of thumb for "not in" type of questions:
- if the table in the subquery is SMALL and the table on the outside is LARGE, not in may be very efficient since the entire table will get cached and subsequent scans will be of little cost
- if the table in the subquery is LARGE and the table on the outside is SMALL, not exists will be the best bet ASSUMING the columns you are "not in"ing on are indexed in the subquery table
- if the table in the subquery is LARGE and the table on the outside is LARGE, an outer join with a check for NULL on the 'inner' table may be best, ASSUMING again the columns you are "not in"ing on are indexed.
- if both tables are SMALL, who cares, it runs fast any which way you do it.
select *
from
t1, t2 where t1.x = t2.x (+) and t2.x is null
call count cpu elapsed disk query current rows -------- ------- -------- --------- -------- -------- ------- ----------
Parse 1 0.03 0.03 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 1 0.01 0.02 0 82 3 10 -------- ------- -------- --------- -------- -------- ------- ---------- total 3 0.04 0.05 0 82 3 10
Misses in library cache during parse: 1
Optimizer hint: CHOOSE
Parsing user id: 144 (TKYTE)
Rows Execution Plan
------- --------------------------------------------------- 0 SELECT STATEMENT OPTIMIZER HINT: CHOOSE 27 FILTER 17 NESTED LOOPS (OUTER) 27 TABLE ACCESS (FULL) OF 'T1' 27 INDEX (UNIQUE SCAN) OF 'SYS_C003174' (UNIQUE) ****************************************************************************** ******************************************************************************
select *
from
t1 where x not in ( select x from t2 )
call count cpu elapsed disk query current rows -------- ------- -------- --------- -------- -------- ------- ----------
Parse 1 0.01 0.01 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 1 0.78 0.78 0 214 84 10 -------- ------- -------- --------- -------- -------- ------- ---------- total 3 0.79 0.79 0 214 84 10
Misses in library cache during parse: 0
Optimizer hint: CHOOSE
Parsing user id: 144 (TKYTE)
Rows Execution Plan
------- --------------------------------------------------- 0 SELECT STATEMENT OPTIMIZER HINT: CHOOSE 27 FILTER 27 TABLE ACCESS (FULL) OF 'T1' 23911 TABLE ACCESS (FULL) OF 'T2' ****************************************************************************** ******************************************************************************
select *
from
t1 where not exists ( select null from t2 where t2.x = t1.x )
call count cpu elapsed disk query current rows -------- ------- -------- --------- -------- -------- ------- ----------
Parse 1 0.01 0.07 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 1 0.02 0.09 0 82 3 10 -------- ------- -------- --------- -------- -------- ------- ---------- total 3 0.03 0.16 0 82 3 10
Misses in library cache during parse: 0
Optimizer hint: CHOOSE
Parsing user id: 144 (TKYTE)
Rows Execution Plan
------- --------------------------------------------------- 0 SELECT STATEMENT OPTIMIZER HINT: CHOOSE 27 FILTER 27 TABLE ACCESS (FULL) OF 'T1' 27 INDEX (UNIQUE SCAN) OF 'SYS_C003174' (UNIQUE) ******************************************************************************
Thomas Kyte
tkyte_at_us.oracle.com
Oracle Government
Received on Fri Jul 28 1995 - 00:00:00 CEST