Re: Oracle Left Outer Join problem - first occurance of a match

From: ddf <oratune_at_msn.com>
Date: Wed, 29 Oct 2008 08:52:47 -0700 (PDT)
Message-ID: <6e6f311e-7230-41de-b704-1cd6eecfabfb@17g2000hsk.googlegroups.com>


On Oct 29, 10:47 am, ddf <orat..._at_msn.com> wrote:
> On Oct 28, 6:52 pm, trp..._at_gmail.com wrote:
>
>
>
>
>
> > On Oct 28, 2:43 pm, ddf <orat..._at_msn.com> wrote:
>
> > > On Oct 28, 12:24 pm, trp..._at_gmail.com wrote:
>
> > > > I am trying to left outer join to a table with mismatched results...
> > > > let me explain
>
> > > > Table: first_names
> > > > id     first
> > > > 1      Tom
> > > > 2      Mike
> > > > 3      Jack
>
> > > > Table: last_names
> > > > id      last
> > > > 1      Jones
> > > > 2      Smith
> > > > 2      Smiths
>
> > > > When I query the tables here is what I get
>
> > > > SELECT a.id, a.first, b.last FROM first_names a, last_names b
> > > > WHERE a.id=b.id(+);
>
> > > > Here is what I get
>
> > > > id      first      last
> > > > 1      Tom     Jones
> > > > 2      Mike    Smith
> > > > 2      Mike    Smiths
> > > > 3      Jack
>
> > > > What I really want to get is 3 results:
> > > > id      first      last
> > > > 1      Tom     Jones
> > > > 2      Mike    Smith
> > > > 3      Jack
>
> > > > I just want to join to the first occurance of the id in the second
> > > > table, no matter what the result is.
>
> > > > I have simplified this query a good deal to show the problem I am
> > > > having, so the data set should not be taken literally as I already see
> > > > there will be questions as to why there would be duplicate id's with
> > > > different last names, but without going into great detail on the
> > > > actual query this is necessary for my circumstances.
>
> > > > So is there a way to do a join and only pick up the first occurance of
> > > > a match on the second table?
>
> > > > Thanks!!
>
> > > Possibly this will work:
>
> > > SQL> create table first_names(
> > >   2          id      number,
> > >   3          first   varchar2(20)
> > >   4  );
>
> > > Table created.
>
> > > SQL>
> > > SQL> create table last_names(
> > >   2          id      number,
> > >   3          last    varchar2(30)
> > >   4  );
>
> > > Table created.
>
> > > SQL>
> > > SQL> insert all
> > >   2  into first_names
> > >   3  values (1, 'Tom')
> > >   4  into first_names
> > >   5  values (2, 'Mike')
> > >   6  into first_names
> > >   7  values (3, 'Jack')
> > >   8  into last_names
> > >   9  values (1, 'Jones')
> > >  10  into last_names
> > >  11  values (2, 'Smith')
> > >  12  into last_names
> > >  13  values (2, 'Smiths')
> > >  14  select * From dual;
>
> > > 6 rows created.
>
> > > SQL>
> > > SQL> commit;
>
> > > Commit complete.
>
> > > SQL>
> > > SQL> select f.id, f.first, l.last
> > >   2  from
> > >   3  first_names f left outer join
> > >   4  (select id, last
> > >   5          from
> > >   6          (select id, last, dense_rank() over (order by id, last)
> > > rk
> > >   7                  from last_names)
> > >   8          where rk = id) l
> > >   9  on l.id = f.id
> > >  10  /
>
> > >         ID FIRST                LAST
> > > ---------- -------------------- ------------------------------
> > >          1 Tom                  Jones
> > >          2 Mike                 Smith
> > >          3 Jack
>
> > > SQL>
>
> > > David Fitzjarrell- Hide quoted text -
>
> > > - Show quoted text -
>
> > Thanks! This is close, however the id's actually are non linear and
> > look something like 1001922 so rank doesn't work so well
>
> > So I guess to make my example better reflect what I am doing, the
> > tables should look more like this
>
> > SQL> insert all
>
> > >   2  into first_names
> > >   3  values (1001922, 'Tom')
> > >   4  into first_names
> > >   5  values (1001957, 'Mike')
> > >   6  into first_names
> > >   7  values (1001988, 'Jack')
> > >   8  into last_names
> > >   9  values (1001922, 'Jones')
> > >  10  into last_names
> > >  11  values (1001957, 'Smith')
> > >  12  into last_names
> > >  13  values (1001957, 'Smiths')
> > >  14  select * From dual;
>
> > Table: first_names
> > id                first
> > 1001922      Tom
> > 1001957      Mike
> > 1001988      Jack
>
> > Table: last_names
> > id                last
> > 1001922      Jones
> > 1001957      Smith
> > 1001957      Smiths- Hide quoted text -
>
> > - Show quoted text -
>
> Using RANK() in place of DENSE_RANK() and using only the ID I get this
> with your 'new' data:
>
> SQL> select f.id, f.first, l.last
>   2  from
>   3  first_names f left outer join
>   4  (select id, last
>   5          from
>   6          (select id, last, rank() over (order by id) rk
>   7                  from last_names)
>   8          where rk = rownum) l
>   9  on l.id = f.id
>  10  /
>
>         ID FIRST                LAST
> ---------- -------------------- ------------------------------
>    1001922 Tom                  Jones
>    1001957 Mike                 Smiths
>    1001988 Jack
>
> SQL>
>
> David Fitzjarrell- Hide quoted text -
>
> - Show quoted text -

Of couse that probably won't work once you get past the first set of duplicates; providing such a limited set of test data can create 'solutions' which work for that set alone, and won't solve the issue in an extended set of data:

SQL> select f.id, f.first, l.last
  2 from
  3 first_names f left outer join
  4 (select id, last

  5          from
  6          (select id, last, rank() over (order by id) rk
  7                  from last_names)
  8          where rk = rownum) l

  9 on l.id = f.id
 10 /
        ID FIRST                LAST
---------- -------------------- ------------------------------
   1001922 Tom                  Jones

   1001957 Mike
   1001988 Jack

SQL> You might be better served with a PL/SQL implementation, either through an anonymous block or via a stored procedure.

David Fitzjarrell Received on Wed Oct 29 2008 - 10:52:47 CDT

Original text of this message