Re: Need oracle expert - SQL

From: marfi95 <marfi95_at_yahoo.com>
Date: Sun, 2 Mar 2008 19:51:40 -0800 (PST)
Message-ID: <4e585536-a2ed-4489-ba45-d5b16efe8e0c@b1g2000hsg.googlegroups.com>


On Mar 2, 9:37 pm, Charles Hooper <hooperc2..._at_yahoo.com> wrote:
> On Mar 2, 4:18 pm, marfi95 <marf..._at_yahoo.com> wrote:
>
> > All,
>
> > I am in need of some expert oracle advice on how to solve the
> > following problem.  Any ideas would be most helpful.
>
> > What I need to do is find duplicate rows in 2 joined tables (just
> > certain columns are checked for dup, not all) and
> > return information from both tables on a single row.  However, my
> > requirement is not just to report the duplicate data, but to report
> > other data from the two tables as well.  The problem I run into is
> > that I need to return the data on a single row back to my application
> > in a ref cursor, but table 2 can have multiple rows per table 1 based
> > on the join.  So, the original record could have more rows (because of
> > table 2) than the duplicate or the other way.  I need to report both
> > back to the user.
>
> > Here is an example of the data, what is being used in the dup check
> > and how I need the data returned to my application.  I'm open to any
> > ideas, creating temp tables on the fly, etc...  One way I thought of
> > was selecting part of the duplicate data into a temp table, then
> > joining that with the other info and pivoting it.  I'm hoping there
> > are simpler ways to do this that I'm not thinking of.  I'm not an
> > Oracle expert.  The other thing I've done is created a view that
> > contains the join of table 1 and 2 already.
>
> > Hopefully, this will make sense.
>
> > The selection criteria is t1col1, t1col2 from table 1 have to match
> > another row from table 1, but only 1 value (not all) from table2 has
> > to match any value from table 2. It doesn't matter what order they are
> > either as illustrated in the example.  I, however, need to report all
> > rows from table 2, even though only 1 matches. Crazy requirements, but
> > thats our user.
>
> > Remember table 1 and 2 are joined before the dup check.  The base
> > criteria will only use 1 and 2 as the original based on some other
> > data, so that is why 3 and 4 will not show as original records in the
> > example.
>
> > Table1:
> > ==========
>
> > T1Join1   T1Col1   T1Col2   T1Col3
> > -------       ----------   ----------    -----------
> > 1            ABC      123         Test
> > 2            DEF      456         Test2
> > 3            ABC      123         Test3
> > 4            DEF      456         Test4
>
> > Table2:
> > ===========
>
> > T2Join1 T2Col1   T2Col2
> > --------    ---------    -------
> > 1          Fred       XYZ
> > 1          Charlie   YYY
> > 2          Martha   ABC
> > 2          Jane      ABC
> > 3          Fred      CCC
> > 3          Joan      DDD
> > 3          Jack      EEE
> > 3          Rob       EEE
> > 4          Jane      FFF
>
> > Result Needed:
> > ===============
>
> > T1Join1  T1Col1  T1Col2  T1Col3  T2Col1  T2Col2  T1Join1  T1Col1
> > T1Col2 T1Col3     T2Col1      T2Col2
> > ----------   ------      ------      ------      -----
> > ------      ------      ------        ------     ------
> > ------          ------
> > 1           ABC     123       Test      Fred      XYZ      3
> > ABC       123       Test3        Fred        CCC
> > 1           ABC     123       Test      Charlie   YYY     3
> > ABC        123      Test3        Joan         DDD
> > 1           ABC     123       Test      NULL     NULL    3
> > ABC        123       Test3        Jack         EEE
> > 1           ABC     123       Test      NULL     NULL    3
> > ABC        123       Test3        Rob          EEE
> > 2           DEF     456       Test2     Martha  ABC      4
> > DEF        456       Test4        Jane         FFF
> > 2           DEF     456       Test2     Jane      ABC     4
> > DEF        456       Test4        NULL        NULL
>
> > Hopefully this makes sense and people will take this as a challenge !
>
> > Thanks !
> > Marc
>
> This looks complicated.  It would have been helpful if you had
> included the DDL and DML:
> CREATE TABLE T1(
>   T1JOIN1 NUMBER(10),
>   T1COL1 VARCHAR2(5),
>   T1COL2 VARCHAR2(5),
>   T1COL3 VARCHAR2(5));
>
> T1Join1 T1Col1   T1Col2  T1Col3
> ------- -------- ------- -----------
> INSERT INTO T1 VALUES (1,'ABC','123','Test');
> INSERT INTO T1 VALUES (2,'DEF','456','Test2');
> INSERT INTO T1 VALUES (3,'ABC','123','Test3');
> INSERT INTO T1 VALUES (4,'DEF','456','Test4');
>
> CREATE TABLE T2(
>   T1JOIN1 NUMBER(10),
>   T2COL1 VARCHAR2(7),
>   T2COL2 VARCHAR2(5));
>
> INSERT INTO T2 VALUES (1,'Fred','XYZ');
> INSERT INTO T2 VALUES (1,'Charlie','YYY');
> INSERT INTO T2 VALUES (2,'Martha','ABC');
> INSERT INTO T2 VALUES (2,'Jane','ABC');
> INSERT INTO T2 VALUES (3,'Fred','CCC');
> INSERT INTO T2 VALUES (3,'Joan','DDD');
> INSERT INTO T2 VALUES (3,'Jack','EEE');
> INSERT INTO T2 VALUES (3,'Rob','EEE');
> INSERT INTO T2 VALUES (4,'Jane','FFF');
>
> COMMIT;
>
> First, a test to see if we can fix up table T1 so that the rows that
> are related by T1COL are joined together.  We will reference the T1
> table twice, giving it an alias for the second reference:
> SELECT
>   T1.T1JOIN1,
>   T1.T1COL1,
>   T1.T1COL2,
>   T1.T1COL3,
>   T11.T1JOIN1 T11JOIN1,
>   T11.T1COL1 T11COL1,
>   T11.T1COL2 T11COL2,
>   T11.T1COL3 T11COL3
> FROM
>   T1,
>   T1 T11
> WHERE
>   T1.T1COL2=T11.T1COL2
>   AND T1.T1JOIN1<T11.T1JOIN1;
>
>    T1JOIN1 T1COL T1COL T1COL   T11JOIN1 T11CO T11CO T11CO
> ---------- ----- ----- ----- ---------- ----- ----- -----
>          1 ABC   123   Test           3 ABC   123   Test3
>          2 DEF   456   Test2          4 DEF   456   Test4
>
> Now, let's look at table T2.  It would be helpful if there were a
> counter that counts up from 1 for each of the rows with the same value
> for T1JOIN1, and I elected to order the rows by T2COL1:
> SELECT
>   T2.T1JOIN1,
>   T2.T2COL1,
>   T2.T2COL2,
>   ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
> FROM
>   T2;
>
>    T1JOIN1 T2COL1  T2COL RN
> ---------- ------- ----- --
>          1 Charlie YYY    1
>          1 Fred    XYZ    2
>          2 Jane    ABC    1
>          2 Martha  ABC    2
>          3 Fred    CCC    1
>          3 Jack    EEE    2
>          3 Joan    DDD    3
>          3 Rob     EEE    4
>          4 Jane    FFF    1
>
> Now we have a bit of a problem in that we need to join these two
> SELECT statements together twice with different join conditions, and
> then outer join the results.  The first of the two joins looks like
> this:
>   SELECT
>     T1.T1JOIN1,
>     T1.T1COL1,
>     T1.T1COL2,
>     T1.T1COL3,
>     T1.T11JOIN1,
>     T1.T11COL1,
>     T1.T11COL2,
>     T1.T11COL3,
>     T2.T2COL1,
>     T2.T2COL2,
>     T2.RN
>   FROM
>     (SELECT
>       T1.T1JOIN1,
>       T1.T1COL1,
>       T1.T1COL2,
>       T1.T1COL3,
>       T11.T1JOIN1 T11JOIN1,
>       T11.T1COL1 T11COL1,
>       T11.T1COL2 T11COL2,
>       T11.T1COL3 T11COL3
>     FROM
>       T1,
>       T1 T11
>     WHERE
>       T1.T1COL2=T11.T1COL2
>       AND T1.T1JOIN1<T11.T1JOIN1) T1,
>     (SELECT
>       T2.T1JOIN1,
>       T2.T2COL1,
>       T2.T2COL2,
>       ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
>     FROM
>       T2) T2
>   WHERE
>     T1.T1JOIN1=T2.T1JOIN1;
>
> T1JOIN1 T1COL T1COL T1COL   T11JOIN1 T11CO T11CO T11CO T2COL1  T2COL
> RN
> ------- ----- ----- ----- ---------- ----- ----- ----- ------- -----
> --
>       1 ABC   123   Test           3 ABC   123   Test3 Fred    XYZ
> 1
>       1 ABC   123   Test           3 ABC   123   Test3 Charlie YYY
> 2
>       2 DEF   456   Test2          4 DEF   456   Test4 Martha  ABC
> 1
>       2 DEF   456   Test2          4 DEF   456   Test4 Jane    ABC
> 2
>
> The second time we join the two SELECT statements together, we will
> join with T11.T1JOIN1=T2.T1JOIN1.  We will then full outer join the
> results such that T1COL1 from the one SELECT is joined with T11COL1
> from the second SELECT and RN from the first SELECT is joined to RN
> from the second SELECT:
> SELECT
>   V1.T1JOIN1,
>   V1.T1COL1,
>   V1.T1COL2,
>   V1.T1COL3,
>   V1.T2COL1,
>   V1.T2COL2,
>   V2.T11JOIN1,
>   V2.T11COL1,
>   V2.T11COL2,
>   V2.T11COL3,
>   V2.T2COL1,
>   V2.T2COL2
> FROM
>   (SELECT
>     T1.T1JOIN1,
>     T1.T1COL1,
>     T1.T1COL2,
>     T1.T1COL3,
>     T2.T1JOIN1 T2_T1JOIN1,
>     T2.T2COL1,
>     T2.T2COL2,
>     T2.RN
>   FROM
>     (SELECT
>       T1.T1JOIN1,
>       T1.T1COL1,
>       T1.T1COL2,
>       T1.T1COL3,
>       T11.T1JOIN1 T11JOIN1,
>       T11.T1COL1 T11COL1,
>       T11.T1COL2 T11COL2,
>       T11.T1COL3 T11COL3
>     FROM
>       T1,
>       T1 T11
>     WHERE
>       T1.T1COL2=T11.T1COL2
>       AND T1.T1JOIN1<T11.T1JOIN1) T1,
>     (SELECT
>       T2.T1JOIN1,
>       T2.T2COL1,
>       T2.T2COL2,
>       ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
>     FROM
>       T2) T2
>   WHERE
>     T1.T1JOIN1=T2.T1JOIN1) V1
> FULL OUTER JOIN
>   (SELECT
>     T1.T11JOIN1,
>     T1.T11COL1,
>     T1.T11COL2,
>     T1.T11COL3,
>     T2.T1JOIN1,
>     T2.T2COL1,
>     T2.T2COL2,
>     T2.RN
>   FROM
>     (SELECT
>       T1.T1JOIN1,
>       T1.T1COL1,
>       T1.T1COL2,
>       T1.T1COL3,
>       T11.T1JOIN1 T11JOIN1,
>       T11.T1COL1 T11COL1,
>       T11.T1COL2 T11COL2,
>       T11.T1COL3 T11COL3
>     FROM
>       T1,
>       T1 T11
>     WHERE
>       T1.T1COL2=T11.T1COL2
>       AND T1.T1JOIN1<T11.T1JOIN1) T1,
>     (SELECT
>       T2.T1JOIN1,
>       T2.T2COL1,
>       T2.T2COL2,
>       ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
>     FROM
>       T2) T2
>   WHERE
>     T1.T11JOIN1=T2.T1JOIN1) V2
> ON
>   V1.T1COL1=V2.T11COL1
>   AND V1.RN=V2.RN;
>
>    T1JOIN1 T1COL T1COL T1COL T2COL1  T2COL   T11JOIN1 T11CO T11CO
> T11CO T2COL1  T2COL
> ---------- ----- ----- ----- ------- ----- ---------- ----- -----
> ----- ------- -----
>          1 ABC   123   Test  Charlie YYY            3 ABC   123
> Test3 Fred    CCC
>          1 ABC   123   Test  Fred    XYZ            3 ABC   123
> Test3 Jack    EEE
>                                                     3 ABC   123
> Test3 Joan    DDD
>                                                     3 ABC   123
> Test3 Rob     EEE
>          2 DEF   456   Test2 Jane    ABC            4 DEF   456
> Test4 Jane    FFF
>          2 DEF   456   Test2 Martha  ABC
>
> But, we have a problem.  You specified that the first three columns
> could not be NULL, nor could columns 7-9.  Reworking the SQL statement
> using NVL:
> SELECT
>   NVL(V1.T1JOIN1,V2.T1JOIN1) T1JOIN1,
>   NVL(V1.T1COL1,V2.T1COL1) T1COL1,
>   NVL(V1.T1COL2,V2.T1COL2) T1COL2,
>   NVL(V1.T1COL3,V2.T1COL3) T1COL3,
>   V1.T2COL1,
>   V1.T2COL2,
>   NVL(V2.T11JOIN1,V1.T11JOIN1) T11JOIN1,
>   NVL(V2.T11COL1,V1.T11COL1) T11COL1,
>   NVL(V2.T11COL2,V1.T11COL2) T11COL2,
>   NVL(V2.T11COL3,V1.T11COL3) T11COL3,
>   V2.T2COL1,
>   V2.T2COL2
> FROM
>   (SELECT
>     T1.T1JOIN1,
>     T1.T1COL1,
>     T1.T1COL2,
>     T1.T1COL3,
>     T1.T11JOIN1,
>     T1.T11COL1,
>     T1.T11COL2,
>     T1.T11COL3,
>     T2.T2COL1,
>     T2.T2COL2,
>     T2.RN
>   FROM
>     (SELECT
>       T1.T1JOIN1,
>       T1.T1COL1,
>       T1.T1COL2,
>       T1.T1COL3,
>       T11.T1JOIN1 T11JOIN1,
>       T11.T1COL1 T11COL1,
>       T11.T1COL2 T11COL2,
>       T11.T1COL3 T11COL3
>     FROM
>       T1,
>       T1 T11
>     WHERE
>       T1.T1COL2=T11.T1COL2
>       AND T1.T1JOIN1<T11.T1JOIN1) T1,
>     (SELECT
>       T2.T1JOIN1,
>       T2.T2COL1,
>       T2.T2COL2,
>       ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
>     FROM
>       T2) T2
>   WHERE
>     T1.T1JOIN1=T2.T1JOIN1) V1
> FULL OUTER JOIN
>   (SELECT
>     T1.T1JOIN1,
>     T1.T1COL1,
>     T1.T1COL2,
>     T1.T1COL3,
>     T1.T11JOIN1,
>     T1.T11COL1,
>     T1.T11COL2,
>     T1.T11COL3,
>     T2.T2COL1,
>     T2.T2COL2,
>     T2.RN
>   FROM
>     (SELECT
>       T1.T1JOIN1,
>       T1.T1COL1,
>       T1.T1COL2,
>       T1.T1COL3,
>       T11.T1JOIN1 T11JOIN1,
>       T11.T1COL1 T11COL1,
>       T11.T1COL2 T11COL2,
>       T11.T1COL3 T11COL3
>     FROM
>       T1,
>       T1 T11
>     WHERE
>       T1.T1COL2=T11.T1COL2
>       AND T1.T1JOIN1<T11.T1JOIN1) T1,
>     (SELECT
>       T2.T1JOIN1,
>       T2.T2COL1,
>       T2.T2COL2,
>       ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
>     FROM
>       T2) T2
>   WHERE
>     T1.T11JOIN1=T2.T1JOIN1) V2
> ON
>   V1.T1COL1=V2.T11COL1
>   AND V1.RN=V2.RN;
>
> T1JOIN1 T1COL T1COL T1COL T2COL1  T2COL   T11JOIN1 T11CO T11CO T11CO
> T2COL1 T2COL
> ------- ----- ----- ----- ------- ----- ---------- ----- ----- -----
> ------------
>       1 ABC   123   Test  Charlie YYY            3 ABC   123   Test3
> Fred   CCC
>       1 ABC   123   Test  Fred    XYZ            3 ABC   123   Test3
> Jack   EEE
>       1 ABC   123   Test                         3 ABC   123   Test3
> Joan   DDD
>       1 ABC   123   Test                         3 ABC   123   Test3
> Rob    EEE
>       2 DEF   456   Test2 Jane    ABC            4 DEF   456   Test4
> Jane   FFF
>       2 DEF   456   Test2 Martha  ABC            4 DEF   456   Test4
>
> Looks like I guessed wrong on the ORDER BY in the ROW_NUMBER
> analytical function, I probably should have ordered by T2COL2 - but
> that should be an easy fix.
>
> Charles Hooper
> IT Manager/Oracle DBA
> K&M Machine-Fabricating, Inc.

Thanks for the reply. I will definately need to study this,as this is very complex. Would it have been easier to do the first part of the checking and store the results in a temporary table, then join that with the additional checking for the outer joins ? Or would it not have mattered ? Received on Sun Mar 02 2008 - 21:51:40 CST

Original text of this message