Re: SQL Puzzler - Join T and DISTINCT(T)

From: Matthew Merzbacher <matthew_at_pram.cs.ucla.edu>
Date: 4 Mar 92 17:38:23 GMT
Message-ID: <matthew.699730703_at_pram.cs.ucla.edu>


matthew_at_pram.cs.ucla.edu (Me) writes:

>I have a table, let's call it 'T', with two attributes (A & B), having the
>following values:
 

>A B
>---------
>1 x
>1 x
>1 x
>1 y
>1 y
>[ rest of table has other values for A - deleted as not germaine ]
 

>I want to join this table with its DISTINCT self on A, also ignoring cases
>where the DISTINCT B and the regular B column are identical.
>That is, I want to get:
 

>B B
>---------
>x y
>x y
>y x
>y x
>y x
 

>To recap:
>Let R = SELECT DISTINCT * from T
 

>Now, I want:
> SELECT R.B T.B
> WHERE R.A = T.A AND R.B != T.B
 
>However, I'd like to do these two steps in a *single* (presumably nested) SQL
>statement. It sounds easy, but it's turning into a nightmare.
 

>I can (and have) done it using ORACLE PL/SQL, by the way.
 

>Questions:
> 1. Is there a general single SQL statement that will join a table with
> it's distinct version?
>
> 2. Is there an ORACLE SQL*PLUS query to do the same thing (note that
> ORACLE != SQL)?
I've gotten several answers to my query. First, I'd like to thank the following four Oracle employees (and some others, whose names I forgot to save), each of whom provided a solution to my problem...

  Nate Calvin <ncalvin_at_us.oracle.com>   Jon Galley <jgalley_at_us.oracle.com>
  Chris van Es <cvanes_at_uk.oracle.com>   Keith Woolner <kwoolner_at_us.oracle.com>

By the way, none of these men speaks officially for Oracle, at least not to me.

From the mail, I have a couple of choices. The first choice is to create a view which contains the distinct table and then join on the view.

CREATE VIEW R AS
SELECT DISTINCT * FROM T
/

SELECT R.B, T.B
  FROM R, T
 WHERE R.A = T.A
   AND R.B != T.B
/

This solution has one advantage and one problem. The advantage is that it should work (perhaps with a little syntactic modification) on any SQL system. The problem is that it has a high overhead cost. Creating views is not cheap, plus, there are two separate statements - causing extra overhead when communicating with the server.

The second solution uses two features of Oracle that does not exist in all other DB systems. In Oracle, every row in every relation has a unique "rowid". Further, the rows returned by a select have are numbered sequentially by "rownum". We can then use this to effect a "DISTINCT" inner selection. That is:

select r.b, s.b
  from t s, t r
  where r.b != s.b
  and r.a = s.a
  and r.rowid = (
    select rowid
    from t x
    where r.a = x.a and r.b = x.b and rownum = 1     );

The inner select gets only the first row (thus DISTINCT) from the table and joins it with all other rows.

You actually don't really need "rownum". Instead, you can select min(rowid) rather than rowid in the inner-loop, but that's a bit slower.

Anyhow - thanks for the help. I still wonder if the problem can be solved without some SQL mix-ins like rowid. Nobody here (at UCLA) seems to think so, but proving that statement is another horse.

--

Matthew Merzbacher	ARPA:	matthew_at_CS.UCLA.EDU
Moo - Moo Moo  		UUCP:	...!{uunet|rutgers|ucbvax}!cs.ucla.edu!matthew
               		This time, *with* the oys
Received on Wed Mar 04 1992 - 18:38:23 CET

Original text of this message