Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Re: Re: anyone have success with semi-joins?

Re: Re: anyone have success with semi-joins?

From: <ryan_oracle_at_cox.net>
Date: Mon, 03 Nov 2003 10:19:24 -0800
Message-ID: <F001.005D575F.20031103101924@fatcity.com>


if thats the case, how is a semi_nl useful? its doing a nested loop also.

There is alot of academic talk on semi-joins on the web, but I cant find an algorithm anywhere. The basics of it appear to be generic.
>
> From: "Tanel Poder" <tanel.poder.003_at_mail.ee>
> Date: 2003/11/03 Mon PM 12:49:25 EST
> To: Multiple recipients of list ORACLE-L <ORACLE-L_at_fatcity.com>
> Subject: Re: anyone have success with semi-joins?
>
> Hi!
>
> I don't know the exact algorithm either, but check the execution plans:
>
> SQL> set autot trace exp
> SQL> select name from obj$
> 2 where exists (select 1 from tab$ where obj$.obj# = tab$.obj#);
>
> Execution Plan
> ----------------------------------------------------------
> 0 SELECT STATEMENT Optimizer=CHOOSE
> 1 0 FILTER
> 2 1 TABLE ACCESS (FULL) OF 'OBJ$'
> 3 1 TABLE ACCESS (CLUSTER) OF 'TAB$'
> 4 3 INDEX (UNIQUE SCAN) OF 'I_OBJ#' (NON-UNIQUE)
>
>
>
> SQL> select /*+ HASH_SJ */ name from obj$
> 2 where exists (select 1 from tab$ where obj$.obj# = tab$.obj#);
>
> Execution Plan
> ----------------------------------------------------------
> 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=50 Card=8168 Bytes=3
> 51224)
>
> 1 0 HASH JOIN (SEMI) (Cost=50 Card=8168 Bytes=351224)
> 2 1 TABLE ACCESS (FULL) OF 'OBJ$' (Cost=25 Card=8168 Bytes=2
> 45040)
>
> 3 1 VIEW OF 'VW_SQ_1' (Cost=25 Card=8168 Bytes=106184)
> 4 3 TABLE ACCESS (FULL) OF 'TAB$' (Cost=25 Card=8168 Bytes
> =106184)
>
> You see in first execution plan we use FILTER for getting our results. FILTER has several meanings, but in this case it could be bounded nested loop (e.g. stopping for particular driving key when exists condition is satisfied). That means, for every row in obj$ we have to check separately whether there is any matching rows, this means lots of logical IOs.
>
> In second execution plan however, a hash join is done between obj$ and distinct obj# key values from tab$ (that's the reason why we're calling it a semi-join). Semi-join allow us to replace big amount of small nested loop IOs with few big reads and a hash join instead.
>
> Note that I don't know whether Oracle actually behaves this way, I'm just telling you how I think it is done, based only on brief analysis...
>
> Cheers,
> Tanel.
>
> ----- Original Message -----
> To: "Multiple recipients of list ORACLE-L" <ORACLE-L_at_fatcity.com>
> Sent: Monday, November 03, 2003 5:04 PM
>
>
> > I picked up Mark Gurry's SQL pocket reference on Friday and it states that you can radically improve performance of 'where exists' statements by using semi-joins since it only returns the sub-query ones.
> >
> > Now logically you have to return the sub-query repeatedly, this would only come into play if there are duplicates right?
> >
> > anyone know the algorithm for the semi-join? Not just the result? I cant find it anywhere.
> >
> > anyone ever hint this or let oracle decide?
> >
> > --
> > Please see the official ORACLE-L FAQ: http://www.orafaq.net
> > --
> > Author: <ryan_oracle_at_cox.net
> > INET: ryan_oracle_at_cox.net
> >
> > Fat City Network Services -- 858-538-5051 http://www.fatcity.com
> > San Diego, California -- Mailing list and web hosting services
> > ---------------------------------------------------------------------
> > To REMOVE yourself from this mailing list, send an E-Mail message
> > to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
> > the message BODY, include a line containing: UNSUB ORACLE-L
> > (or the name of mailing list you want to be removed from). You may
> > also send the HELP command for other information (like subscribing).
> >
>









Hi!
 
I don't know the exact algorithm either, but check the execution plans:
 
SQL> set autot trace exp
SQL> select name from obj$
  2  where exists (select 1 from tab$ where obj$.obj# = tab$.obj#);
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   FILTER
   2    1     TABLE ACCESS (FULL) OF 'OBJ$'
   3    1     TABLE ACCESS (CLUSTER) OF 'TAB$'
   4    3       INDEX (UNIQUE SCAN) OF 'I_OBJ#' (NON-UNIQUE)
 
 
 
SQL> select /*+ HASH_SJ */ name from obj$
  2  where exists (select 1 from tab$ where obj$.obj# = tab$.obj#);
 
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=50 Card=8168 Bytes=3
          51224)
 
   1    0   HASH JOIN (SEMI) (Cost=50 Card=8168 Bytes=351224)
   2    1     TABLE ACCESS (FULL) OF 'OBJ$' (Cost=25 Card=8168 Bytes=2
          45040)
 
   3    1     VIEW OF 'VW_SQ_1' (Cost=25 Card=8168 Bytes=106184)
   4    3       TABLE ACCESS (FULL) OF 'TAB$' (Cost=25 Card=8168 Bytes
          =106184)
 
You see in first execution plan we use FILTER for getting our results. FILTER has several meanings, but in this case it could be bounded nested loop (e.g. stopping for particular driving key when exists condition is satisfied). That means, for every row in obj$ we have to check separately whether there is any matching rows, this means lots of logical IOs.
 
In second execution plan however, a hash join is done between obj$ and distinct obj# key values from tab$ (that's the reason why we're calling it a semi-join). Semi-join allow us to replace big amount of small nested loop IOs with few big reads and a hash join instead.
 
Note that I don't know whether Oracle actually behaves this way, I'm just telling you how I think it is done, based only on brief analysis...
 
Cheers,
Tanel.
 
----- Original Message -----
From: <ryan_oracle@cox.net>
To: "Multiple recipients of list ORACLE-L" <ORACLE-L@fatcity.com>
Sent: Monday, November 03, 2003 5:04 PM
Subject: anyone have success with semi-joins?

> I picked up Mark Gurry's SQL pocket reference on Friday and it states that you can radically improve performance of 'where exists' statements by using semi-joins since it only returns the sub-query ones.
>
> Now logically you have to return the sub-query repeatedly, this would only come into play if there are duplicates right?
>
> anyone know the algorithm for the semi-join? Not just the result? I cant find it anywhere.
>
> anyone ever hint this or let oracle decide?
>
> --
> Please see the official ORACLE-L FAQ:
http://www.orafaq.net
> --
> Author: <
ryan_oracle@cox.net
>   INET: ryan_oracle@cox.net
>
> Fat City Network Services    -- 858-538-5051
http://www.fatcity.com
> San Diego, California        -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to:
ListGuru@fatcity.com (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from).  You may
> also send the HELP command for other information (like subscribing).
>
-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: <ryan_oracle_at_cox.net
  INET: ryan_oracle_at_cox.net

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
Received on Mon Nov 03 2003 - 12:19:24 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US