Re: Outer Join - doing it first

From: Alireza Assadzadeh <aassadza_at_chat.carleton.ca>
Date: 1996/09/18
Message-ID: <3240C0C3.1094_at_chat.carleton.ca>


Ian Goddard wrote:
>
> If I understand you correctly what you want is
> SELECT id, name FROM a WHERE id NOT IN (
> SELECT id FROM b)
>
> The sub-SELECT gives all b.id and the main SELECT gives
> all the a.id whose values are not in the set from b.
>
> Alireza Assadzadeh wrote:
> >
> > Hello,
> >
> > I am trying to do an outer join between two tables with an
> > extra where clause condition on the subserviant (outer join) table.
> > This results in where clause to be applied to the subserviant table
> > BEFORE the joining. I want the outer join to be done first and then
> > the where cluase to be applied to the result. Is there a better way
> > to do this other than creating a view.
> >
> > Please consider the following example:
> > -----------------------
> > Table a :
> > id integer,
> > name char
> > -----------------------
> > Table b:
> > id integer.
> > account char
> > -----------------------
> >
> > What is needed:
> >
> > -----------------------
> > select
> > a.id as a_id,
> > a.name as a_name,
> > b.id as b_id,
> > b.account as b_account
> > from
> > a, outer b
> > where a.id = b.id
> > and b.id is NULL
> > -----------------------
> >
> > This should give all the rows in table a that don't exist in table b.
> > But it doesn't work because b.id is NULL condition is applied to b first
> > and then the outer join is done.
> >
>
> You save a selection term, b.id IS NULL and a join term a.id = b.id. The
> optimiser will take the selection term first. ...

I think you got my point right here. I was hopping to make the optimiser to take the join term, a.id = b.id, FIRST. Then the optimizer can take the selection term, b.id is null. I managed to force this order by using the VIEW (please see bellow).

> ... This will return you a set
> of results with b.id as NULL; that's what you asked for. The join clause
> will then return nothing. Any attempt to compare a value - even NULL - to
> NULL will return FALSE. There was a long running discussion about this
> not long ago. As a matter of interest, if you did do the join first this
> same rule about NULLs would ensure that the returned set contains no
> instances where the id is NULL. The selection term would then eliminate
> all of that set and return you nothing.
>
> At the bottom of your problem is a failure to understand the nature of
> the NULL returned from an outer join versus the NULL in the selection term.
> If you had:
> select
> a.id as a_id,
> a.name as a_name,
> b.id as b_id,
> b.account as b_account
> from
> a, outer b
> where a.id = b.id
>
> this might return some rows with b_id and b_account set to NULL. This
> comes about from rows in a with no counterpart in b as you say. You
> should think of these NULLs as padding inserted AFTER the reading of the
> database. The NULL in the WHERE section refers to b.id, the data in
> the database BEFORE you read it. I hope that by my comparing the alias
> (b_id) with the table.column form, b.id, you will understand this more
> clearly.
>
> Finally, why include b.id and b.account if the query is to return only
> those rows where they don't exist? They will always be NULL!

I included them for the purpose of confirmation, so that when the query is run it could be observed that IN FACT the information selected from table a does not have NULLS but ALL the information received from table b has NULLS

>
> Regards
>
> Ian

Hello,

A little while ago I was doing some work on tunning SQL using Oracle(7.1.6).

The following three three queries give the same record results in oracle:

QUERY I



select

   a.id as a_id,
   a.name as a_name
from a
where
a.id not in (select b.id from b)

QUERY II



select

   a.id as a_id,
   a.name as a_name
from a
where
not exists (select 'X' from b where b.id = a.id)

QUERY III(a)



select

   a.id as a_id,
   a.name as a_name
from a,b
where
a.id=b.id(+)
and b.id is NULL

Generally, for all the table types that I tried, Query I was 'much faster' than Query II which was 'much faster' than Query III(a). Basically, because oracle RDBMS kernel handles these differently. I won't get into details.

I tried doing the same thing in Informix. I had to change Query III(a) to Query III(b) as follows:

QUERY III(b)



select

   a.id as a_id,
   a.name as a_name
from a,outer b
where
a.id=b.id
and b.id is NULL

As you mentioned Query III(b) does not result in what I expected becuase of the order the conditions in the where clause and the joining are applied.

Therefore, Query III(a) in Oracle and Query III(b) in Informix, although they look the same, are handled differenlty and are NOT EQUAL.

Furthermore, I can use the view in the manner I suggested to cause the same effect.
i.e.
create view work_around as
select

   a.id   as a_id,
   a.name as a_name,
   b.id   as b_id

from a,outer b
where
a.id=b.id;
-- Then do the second query for the final result select a_id, a_name from work_around where b_id is NULL

I hope this makes a little more sense now.

Thanks for all your help.

Alireza Assadzadeh Received on Wed Sep 18 1996 - 00:00:00 CEST

Original text of this message