Re: Need oracle expert - SQL

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: Sun, 2 Mar 2008 19:37:35 -0800 (PST)
Message-ID: <3edf9a9d-2ac7-4ea8-a738-64cdd5494653@n36g2000hse.googlegroups.com>


On Mar 2, 4:18 pm, marfi95 <marf..._at_yahoo.com> wrote:
> All,
>
> I am in need of some expert oracle advice on how to solve the
> following problem.  Any ideas would be most helpful.
>
> What I need to do is find duplicate rows in 2 joined tables (just
> certain columns are checked for dup, not all) and
> return information from both tables on a single row.  However, my
> requirement is not just to report the duplicate data, but to report
> other data from the two tables as well.  The problem I run into is
> that I need to return the data on a single row back to my application
> in a ref cursor, but table 2 can have multiple rows per table 1 based
> on the join.  So, the original record could have more rows (because of
> table 2) than the duplicate or the other way.  I need to report both
> back to the user.
>
> Here is an example of the data, what is being used in the dup check
> and how I need the data returned to my application.  I'm open to any
> ideas, creating temp tables on the fly, etc...  One way I thought of
> was selecting part of the duplicate data into a temp table, then
> joining that with the other info and pivoting it.  I'm hoping there
> are simpler ways to do this that I'm not thinking of.  I'm not an
> Oracle expert.  The other thing I've done is created a view that
> contains the join of table 1 and 2 already.
>
> Hopefully, this will make sense.
>
> The selection criteria is t1col1, t1col2 from table 1 have to match
> another row from table 1, but only 1 value (not all) from table2 has
> to match any value from table 2. It doesn't matter what order they are
> either as illustrated in the example.  I, however, need to report all
> rows from table 2, even though only 1 matches. Crazy requirements, but
> thats our user.
>
> Remember table 1 and 2 are joined before the dup check.  The base
> criteria will only use 1 and 2 as the original based on some other
> data, so that is why 3 and 4 will not show as original records in the
> example.
>
> Table1:
> ==========
>
> T1Join1   T1Col1   T1Col2   T1Col3
> -------       ----------   ----------    -----------
> 1            ABC      123         Test
> 2            DEF      456         Test2
> 3            ABC      123         Test3
> 4            DEF      456         Test4
>
> Table2:
> ===========
>
> T2Join1 T2Col1   T2Col2
> --------    ---------    -------
> 1          Fred       XYZ
> 1          Charlie   YYY
> 2          Martha   ABC
> 2          Jane      ABC
> 3          Fred      CCC
> 3          Joan      DDD
> 3          Jack      EEE
> 3          Rob       EEE
> 4          Jane      FFF
>
> Result Needed:
> ===============
>
> T1Join1  T1Col1  T1Col2  T1Col3  T2Col1  T2Col2  T1Join1  T1Col1
> T1Col2 T1Col3     T2Col1      T2Col2
> ----------   ------      ------      ------      -----
> ------      ------      ------        ------     ------
> ------          ------
> 1           ABC     123       Test      Fred      XYZ      3
> ABC       123       Test3        Fred        CCC
> 1           ABC     123       Test      Charlie   YYY     3
> ABC        123      Test3        Joan         DDD
> 1           ABC     123       Test      NULL     NULL    3
> ABC        123       Test3        Jack         EEE
> 1           ABC     123       Test      NULL     NULL    3
> ABC        123       Test3        Rob          EEE
> 2           DEF     456       Test2     Martha  ABC      4
> DEF        456       Test4        Jane         FFF
> 2           DEF     456       Test2     Jane      ABC     4
> DEF        456       Test4        NULL        NULL
>
> Hopefully this makes sense and people will take this as a challenge !
>
> Thanks !
> Marc

This looks complicated. It would have been helpful if you had included the DDL and DML:
CREATE TABLE T1(

  T1JOIN1 NUMBER(10),
  T1COL1 VARCHAR2(5),
  T1COL2 VARCHAR2(5),
  T1COL3 VARCHAR2(5));

T1Join1 T1Col1 T1Col2 T1Col3
------- -------- ------- -----------

INSERT INTO T1 VALUES (1,'ABC','123','Test');
INSERT INTO T1 VALUES (2,'DEF','456','Test2');
INSERT INTO T1 VALUES (3,'ABC','123','Test3');
INSERT INTO T1 VALUES (4,'DEF','456','Test4');

CREATE TABLE T2(

  T1JOIN1 NUMBER(10),
  T2COL1 VARCHAR2(7),
  T2COL2 VARCHAR2(5));

INSERT INTO T2 VALUES (1,'Fred','XYZ');
INSERT INTO T2 VALUES (1,'Charlie','YYY');
INSERT INTO T2 VALUES (2,'Martha','ABC');
INSERT INTO T2 VALUES (2,'Jane','ABC');
INSERT INTO T2 VALUES (3,'Fred','CCC');
INSERT INTO T2 VALUES (3,'Joan','DDD');
INSERT INTO T2 VALUES (3,'Jack','EEE');

INSERT INTO T2 VALUES (3,'Rob','EEE');
INSERT INTO T2 VALUES (4,'Jane','FFF');

COMMIT; First, a test to see if we can fix up table T1 so that the rows that are related by T1COL are joined together. We will reference the T1 table twice, giving it an alias for the second reference: SELECT

  T1.T1JOIN1,
  T1.T1COL1,
  T1.T1COL2,
  T1.T1COL3,
  T11.T1JOIN1 T11JOIN1,
  T11.T1COL1 T11COL1,

  T11.T1COL2 T11COL2,
  T11.T1COL3 T11COL3
FROM
  T1,
  T1 T11
WHERE
  T1.T1COL2=T11.T1COL2
  AND T1.T1JOIN1<T11.T1JOIN1;

   T1JOIN1 T1COL T1COL T1COL T11JOIN1 T11CO T11CO T11CO ---------- ----- ----- ----- ---------- ----- ----- -----

         1 ABC   123   Test           3 ABC   123   Test3
         2 DEF   456   Test2          4 DEF   456   Test4

Now, let's look at table T2. It would be helpful if there were a counter that counts up from 1 for each of the rows with the same value for T1JOIN1, and I elected to order the rows by T2COL1: SELECT

  T2.T1JOIN1,
  T2.T2COL1,
  T2.T2COL2,

  ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN FROM
  T2;

   T1JOIN1 T2COL1 T2COL RN
---------- ------- ----- --

         1 Charlie YYY    1
         1 Fred    XYZ    2
         2 Jane    ABC    1
         2 Martha  ABC    2
         3 Fred    CCC    1
         3 Jack    EEE    2
         3 Joan    DDD    3
         3 Rob     EEE    4
         4 Jane    FFF    1

Now we have a bit of a problem in that we need to join these two SELECT statements together twice with different join conditions, and then outer join the results. The first of the two joins looks like this:
  SELECT

    T1.T1JOIN1,
    T1.T1COL1,
    T1.T1COL2,
    T1.T1COL3,
    T1.T11JOIN1,
    T1.T11COL1,
    T1.T11COL2,
    T1.T11COL3,
    T2.T2COL1,
    T2.T2COL2,
    T2.RN

  FROM
    (SELECT
      T1.T1JOIN1,
      T1.T1COL1,
      T1.T1COL2,
      T1.T1COL3,
      T11.T1JOIN1 T11JOIN1,
      T11.T1COL1 T11COL1,
      T11.T1COL2 T11COL2,
      T11.T1COL3 T11COL3
    FROM
      T1,
      T1 T11
    WHERE
      T1.T1COL2=T11.T1COL2
      AND T1.T1JOIN1<T11.T1JOIN1) T1,
    (SELECT
      T2.T1JOIN1,
      T2.T2COL1,
      T2.T2COL2,
      ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
    FROM
      T2) T2

  WHERE
    T1.T1JOIN1=T2.T1JOIN1; T1JOIN1 T1COL T1COL T1COL T11JOIN1 T11CO T11CO T11CO T2COL1 T2COL RN
------- ----- ----- ----- ---------- ----- ----- ----- ------- -----
--
      1 ABC   123   Test           3 ABC   123   Test3 Fred    XYZ
1
      1 ABC   123   Test           3 ABC   123   Test3 Charlie YYY
2
      2 DEF   456   Test2          4 DEF   456   Test4 Martha  ABC
1
      2 DEF   456   Test2          4 DEF   456   Test4 Jane    ABC
2

The second time we join the two SELECT statements together, we will
join with T11.T1JOIN1=T2.T1JOIN1.  We will then full outer join the
results such that T1COL1 from the one SELECT is joined with T11COL1
from the second SELECT and RN from the first SELECT is joined to RN
from the second SELECT:
SELECT
  V1.T1JOIN1,
  V1.T1COL1,
  V1.T1COL2,
  V1.T1COL3,
  V1.T2COL1,
  V1.T2COL2,
  V2.T11JOIN1,
  V2.T11COL1,
  V2.T11COL2,
  V2.T11COL3,
  V2.T2COL1,
  V2.T2COL2
FROM
  (SELECT
    T1.T1JOIN1,
    T1.T1COL1,
    T1.T1COL2,
    T1.T1COL3,
    T2.T1JOIN1 T2_T1JOIN1,
    T2.T2COL1,
    T2.T2COL2,
    T2.RN
  FROM
    (SELECT
      T1.T1JOIN1,
      T1.T1COL1,
      T1.T1COL2,
      T1.T1COL3,
      T11.T1JOIN1 T11JOIN1,
      T11.T1COL1 T11COL1,
      T11.T1COL2 T11COL2,
      T11.T1COL3 T11COL3
    FROM
      T1,
      T1 T11
    WHERE
      T1.T1COL2=T11.T1COL2
      AND T1.T1JOIN1<T11.T1JOIN1) T1,
    (SELECT
      T2.T1JOIN1,
      T2.T2COL1,
      T2.T2COL2,
      ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
    FROM
      T2) T2
  WHERE
    T1.T1JOIN1=T2.T1JOIN1) V1
FULL OUTER JOIN
  (SELECT
    T1.T11JOIN1,
    T1.T11COL1,
    T1.T11COL2,
    T1.T11COL3,
    T2.T1JOIN1,
    T2.T2COL1,
    T2.T2COL2,
    T2.RN
  FROM
    (SELECT
      T1.T1JOIN1,
      T1.T1COL1,
      T1.T1COL2,
      T1.T1COL3,
      T11.T1JOIN1 T11JOIN1,
      T11.T1COL1 T11COL1,
      T11.T1COL2 T11COL2,
      T11.T1COL3 T11COL3
    FROM
      T1,
      T1 T11
    WHERE
      T1.T1COL2=T11.T1COL2
      AND T1.T1JOIN1<T11.T1JOIN1) T1,
    (SELECT
      T2.T1JOIN1,
      T2.T2COL1,
      T2.T2COL2,
      ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
    FROM
      T2) T2
  WHERE
    T1.T11JOIN1=T2.T1JOIN1) V2
ON
  V1.T1COL1=V2.T11COL1
  AND V1.RN=V2.RN;

   T1JOIN1 T1COL T1COL T1COL T2COL1  T2COL   T11JOIN1 T11CO T11CO
T11CO T2COL1  T2COL
---------- ----- ----- ----- ------- ----- ---------- ----- -----
----- ------- -----
         1 ABC   123   Test  Charlie YYY            3 ABC   123
Test3 Fred    CCC
         1 ABC   123   Test  Fred    XYZ            3 ABC   123
Test3 Jack    EEE
                                                    3 ABC   123
Test3 Joan    DDD
                                                    3 ABC   123
Test3 Rob     EEE
         2 DEF   456   Test2 Jane    ABC            4 DEF   456
Test4 Jane    FFF
         2 DEF   456   Test2 Martha  ABC

But, we have a problem.  You specified that the first three columns
could not be NULL, nor could columns 7-9.  Reworking the SQL statement
using NVL:
SELECT
  NVL(V1.T1JOIN1,V2.T1JOIN1) T1JOIN1,
  NVL(V1.T1COL1,V2.T1COL1) T1COL1,
  NVL(V1.T1COL2,V2.T1COL2) T1COL2,
  NVL(V1.T1COL3,V2.T1COL3) T1COL3,
  V1.T2COL1,
  V1.T2COL2,
  NVL(V2.T11JOIN1,V1.T11JOIN1) T11JOIN1,
  NVL(V2.T11COL1,V1.T11COL1) T11COL1,
  NVL(V2.T11COL2,V1.T11COL2) T11COL2,
  NVL(V2.T11COL3,V1.T11COL3) T11COL3,
  V2.T2COL1,
  V2.T2COL2
FROM
  (SELECT
    T1.T1JOIN1,
    T1.T1COL1,
    T1.T1COL2,
    T1.T1COL3,
    T1.T11JOIN1,
    T1.T11COL1,
    T1.T11COL2,
    T1.T11COL3,
    T2.T2COL1,
    T2.T2COL2,
    T2.RN
  FROM
    (SELECT
      T1.T1JOIN1,
      T1.T1COL1,
      T1.T1COL2,
      T1.T1COL3,
      T11.T1JOIN1 T11JOIN1,
      T11.T1COL1 T11COL1,
      T11.T1COL2 T11COL2,
      T11.T1COL3 T11COL3
    FROM
      T1,
      T1 T11
    WHERE
      T1.T1COL2=T11.T1COL2
      AND T1.T1JOIN1<T11.T1JOIN1) T1,
    (SELECT
      T2.T1JOIN1,
      T2.T2COL1,
      T2.T2COL2,
      ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
    FROM
      T2) T2
  WHERE
    T1.T1JOIN1=T2.T1JOIN1) V1
FULL OUTER JOIN
  (SELECT
    T1.T1JOIN1,
    T1.T1COL1,
    T1.T1COL2,
    T1.T1COL3,
    T1.T11JOIN1,
    T1.T11COL1,
    T1.T11COL2,
    T1.T11COL3,
    T2.T2COL1,
    T2.T2COL2,
    T2.RN
  FROM
    (SELECT
      T1.T1JOIN1,
      T1.T1COL1,
      T1.T1COL2,
      T1.T1COL3,
      T11.T1JOIN1 T11JOIN1,
      T11.T1COL1 T11COL1,
      T11.T1COL2 T11COL2,
      T11.T1COL3 T11COL3
    FROM
      T1,
      T1 T11
    WHERE
      T1.T1COL2=T11.T1COL2
      AND T1.T1JOIN1<T11.T1JOIN1) T1,
    (SELECT
      T2.T1JOIN1,
      T2.T2COL1,
      T2.T2COL2,
      ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
    FROM
      T2) T2
  WHERE
    T1.T11JOIN1=T2.T1JOIN1) V2
ON
  V1.T1COL1=V2.T11COL1
  AND V1.RN=V2.RN;

T1JOIN1 T1COL T1COL T1COL T2COL1  T2COL   T11JOIN1 T11CO T11CO T11CO
T2COL1 T2COL
------- ----- ----- ----- ------- ----- ---------- ----- ----- -----
------------
      1 ABC   123   Test  Charlie YYY            3 ABC   123   Test3
Fred   CCC
      1 ABC   123   Test  Fred    XYZ            3 ABC   123   Test3
Jack   EEE
      1 ABC   123   Test                         3 ABC   123   Test3
Joan   DDD
      1 ABC   123   Test                         3 ABC   123   Test3
Rob    EEE
      2 DEF   456   Test2 Jane    ABC            4 DEF   456   Test4
Jane   FFF
      2 DEF   456   Test2 Martha  ABC            4 DEF   456   Test4

Looks like I guessed wrong on the ORDER BY in the ROW_NUMBER
analytical function, I probably should have ordered by T2COL2 - but
that should be an easy fix.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.
Received on Sun Mar 02 2008 - 21:37:35 CST

Original text of this message