Re: Which column is taken in JOIN with multiple columns and different content ?

Home -> Community -> Usenet -> c.d.o.misc -> Re: Which column is taken in JOIN with multiple columns and different content ?

From: Kevin Kirkpatrick <kvnkrkptrck_at_gmail.com>
Date: 2 Mar 2007 09:42:59 -0800
Message-ID: <1172857379.838228.326230@64g2000cwx.googlegroups.com>

Disclaimer: if this is in a context outside of ad hoc "data scanning" or "table dumping", you should avoid the * operator altogether...

That said...

On Mar 2, 10:07 am, mer..._at_gmx.net (Tobias Merler) wrote:
> Assume I join two tables.
> Both of these tables contain a column with name "col1". This column ist not he primary key.
> However this column contains different values which value is taken ?
>
> Lets see the following example:
>
> Select *
> FROM tablea a, tableb b
> Where a.id = b.id
>
> tablea:
>
> id col1 col2
>
> 1 aaa ddd
> 2 bbb eee
> 3 ccc fff
>
> tableb:
>
> id col1 col3
>
> 1 zzz uuu
> 2 xxx vvv
> 3 yyy www
>
> The result table will look like:
>
> id col1 col2 col3
>
> 1 ??? ddd uuu
> 2 ??? eee vvv
> 3 ??? fff www
>
> Which values will be for the questions marks ?
>
> Ok, I know the data model is not perfect and the two tables could be merged.
> But keep in mind that the sample is only a simplified version of a more complex
> SQL statement. So lets concentrate on my question.
>
> Thank you
>
> Toby

Actually, your query will return

id col1 col2 id col1 col3

1 ??? ddd 1 ??? uuu
2 ??? eee 2 ??? vvv
3 ??? fff 3 ??? www

I'm not entirely sure if Oracle has any high-level rules (e.g. the order tables are listed) that disambiguate whether the columns from tablea will go first or tableb... but based on the results, you can look at the surrounding columns to infer which table the col1 "???" values will match. However, this is tedious and error-prone; plus, Oracle won't let you refer to id or col1 of this result without producing an error.

Best practice is to eliminate the ambiguity altogether... A few approaches:

Specify all of your columns from one table, except for the join columns, aliasing all duplicate column names: SELECT a.col1 acol1, a.col2,..., b.* FROM TABLEA a, TABLEB b WHERE A.ID=B.ID;

The downside here is that you want to say, "give me all the columns in the join of A and B", not, "give me a.col1, a.col2, a.col3.... from A and all columns from B from the join of A and B". Also, something about omitting a.id and including b.id for the id has a less-than- elegant, arbitrary feel...

2) Alias columns in sub-query
SELECT * FROM
(SELECT id, col1 acol1, col2 ... FROM TABLEA ) a, (SELECT * FROM TABLEB ) b,
WHERE A.ID=B.ID;
Downsides aplenty: we're still naming all columns, just at a lower level. Plus, id will be in the result set twice, once a.id, once as b.id, and so couldn't be referenced by a calling query.

3) Continuing with the #2 approach, if you alias all non-joining columns between tablea and tableb to different names and, if necessary, all joining columns to the same name, then you can write this as follows:
SELECT * FROM
(SELECT id, col1 acol1, col2 ... FROM TABLEA ) a join
(SELECT * FROM TABLEB) b
using (id);

Downsides: #3 fixes the double-id problem, but is still verbose (again, imagine if tablea and tableb had 100 columns!).

4) Unfortunately, Oracle does not (to my knowledge) support a RENAME operator over result sets, where something like this might be possible:
SELECT * FROM
((SELECT * FROM TABLEA ) RENAME col1 AS acol1) a join
((SELECT * FROM TABLEB ) RENAME col1 AS bcol1) b using (id);

In this case, only the column names shared or joined by tablea and tableb need to be mentioned explictly in the query... (in other words, this query would stay about this short regardless of how many columns were in tablea and tableb).

Anyway, lacking a RENAME operator, I'd probably lean towards #1 in this example for "quick and dirty" ad hoc querying. Received on Fri Mar 02 2007 - 11:42:59 CST