Re: Need oracle expert - SQL
Date: Sun, 2 Mar 2008 19:37:35 -0800 (PST)
Message-ID: <3edf9a9d-2ac7-4ea8-a738-64cdd5494653@n36g2000hse.googlegroups.com>
On Mar 2, 4:18 pm, marfi95 <marf..._at_yahoo.com> wrote:
> All,
>
> I am in need of some expert oracle advice on how to solve the
> following problem. Any ideas would be most helpful.
>
> What I need to do is find duplicate rows in 2 joined tables (just
> certain columns are checked for dup, not all) and
> return information from both tables on a single row. However, my
> requirement is not just to report the duplicate data, but to report
> other data from the two tables as well. The problem I run into is
> that I need to return the data on a single row back to my application
> in a ref cursor, but table 2 can have multiple rows per table 1 based
> on the join. So, the original record could have more rows (because of
> table 2) than the duplicate or the other way. I need to report both
> back to the user.
>
> Here is an example of the data, what is being used in the dup check
> and how I need the data returned to my application. I'm open to any
> ideas, creating temp tables on the fly, etc... One way I thought of
> was selecting part of the duplicate data into a temp table, then
> joining that with the other info and pivoting it. I'm hoping there
> are simpler ways to do this that I'm not thinking of. I'm not an
> Oracle expert. The other thing I've done is created a view that
> contains the join of table 1 and 2 already.
>
> Hopefully, this will make sense.
>
> The selection criteria is t1col1, t1col2 from table 1 have to match
> another row from table 1, but only 1 value (not all) from table2 has
> to match any value from table 2. It doesn't matter what order they are
> either as illustrated in the example. I, however, need to report all
> rows from table 2, even though only 1 matches. Crazy requirements, but
> thats our user.
>
> Remember table 1 and 2 are joined before the dup check. The base
> criteria will only use 1 and 2 as the original based on some other
> data, so that is why 3 and 4 will not show as original records in the
> example.
>
> Table1:
> ==========
>
> T1Join1 T1Col1 T1Col2 T1Col3
> ------- ---------- ---------- -----------
> 1 ABC 123 Test
> 2 DEF 456 Test2
> 3 ABC 123 Test3
> 4 DEF 456 Test4
>
> Table2:
> ===========
>
> T2Join1 T2Col1 T2Col2
> -------- --------- -------
> 1 Fred XYZ
> 1 Charlie YYY
> 2 Martha ABC
> 2 Jane ABC
> 3 Fred CCC
> 3 Joan DDD
> 3 Jack EEE
> 3 Rob EEE
> 4 Jane FFF
>
> Result Needed:
> ===============
>
> T1Join1 T1Col1 T1Col2 T1Col3 T2Col1 T2Col2 T1Join1 T1Col1
> T1Col2 T1Col3 T2Col1 T2Col2
> ---------- ------ ------ ------ -----
> ------ ------ ------ ------ ------
> ------ ------
> 1 ABC 123 Test Fred XYZ 3
> ABC 123 Test3 Fred CCC
> 1 ABC 123 Test Charlie YYY 3
> ABC 123 Test3 Joan DDD
> 1 ABC 123 Test NULL NULL 3
> ABC 123 Test3 Jack EEE
> 1 ABC 123 Test NULL NULL 3
> ABC 123 Test3 Rob EEE
> 2 DEF 456 Test2 Martha ABC 4
> DEF 456 Test4 Jane FFF
> 2 DEF 456 Test2 Jane ABC 4
> DEF 456 Test4 NULL NULL
>
> Hopefully this makes sense and people will take this as a challenge !
>
> Thanks !
> Marc
This looks complicated. It would have been helpful if you had
included the DDL and DML:
CREATE TABLE T1(
T1JOIN1 NUMBER(10), T1COL1 VARCHAR2(5), T1COL2 VARCHAR2(5), T1COL3 VARCHAR2(5));
T1Join1 T1Col1 T1Col2 T1Col3
------- -------- ------- -----------
INSERT INTO T1 VALUES (1,'ABC','123','Test'); INSERT INTO T1 VALUES (2,'DEF','456','Test2'); INSERT INTO T1 VALUES (3,'ABC','123','Test3'); INSERT INTO T1 VALUES (4,'DEF','456','Test4');
CREATE TABLE T2(
T1JOIN1 NUMBER(10), T2COL1 VARCHAR2(7), T2COL2 VARCHAR2(5)); INSERT INTO T2 VALUES (1,'Fred','XYZ'); INSERT INTO T2 VALUES (1,'Charlie','YYY'); INSERT INTO T2 VALUES (2,'Martha','ABC'); INSERT INTO T2 VALUES (2,'Jane','ABC'); INSERT INTO T2 VALUES (3,'Fred','CCC'); INSERT INTO T2 VALUES (3,'Joan','DDD'); INSERT INTO T2 VALUES (3,'Jack','EEE');
INSERT INTO T2 VALUES (3,'Rob','EEE');
INSERT INTO T2 VALUES (4,'Jane','FFF');
COMMIT; First, a test to see if we can fix up table T1 so that the rows that are related by T1COL are joined together. We will reference the T1 table twice, giving it an alias for the second reference: SELECT
T1.T1JOIN1, T1.T1COL1, T1.T1COL2, T1.T1COL3, T11.T1JOIN1 T11JOIN1, T11.T1COL1 T11COL1,
T11.T1COL2 T11COL2,
T11.T1COL3 T11COL3
FROM
T1,
T1 T11
WHERE
T1.T1COL2=T11.T1COL2
AND T1.T1JOIN1<T11.T1JOIN1;
T1JOIN1 T1COL T1COL T1COL T11JOIN1 T11CO T11CO T11CO ---------- ----- ----- ----- ---------- ----- ----- -----
1 ABC 123 Test 3 ABC 123 Test3 2 DEF 456 Test2 4 DEF 456 Test4
Now, let's look at table T2. It would be helpful if there were a counter that counts up from 1 for each of the rows with the same value for T1JOIN1, and I elected to order the rows by T2COL1: SELECT
T2.T1JOIN1, T2.T2COL1, T2.T2COL2,
ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN FROM
T2;
T1JOIN1 T2COL1 T2COL RN
---------- ------- ----- --
1 Charlie YYY 1 1 Fred XYZ 2 2 Jane ABC 1 2 Martha ABC 2 3 Fred CCC 1 3 Jack EEE 2 3 Joan DDD 3 3 Rob EEE 4 4 Jane FFF 1
Now we have a bit of a problem in that we need to join these two
SELECT statements together twice with different join conditions, and
then outer join the results. The first of the two joins looks like
this:
SELECT
T1.T1JOIN1, T1.T1COL1, T1.T1COL2, T1.T1COL3, T1.T11JOIN1, T1.T11COL1, T1.T11COL2, T1.T11COL3, T2.T2COL1, T2.T2COL2, T2.RN
FROM
(SELECT
T1.T1JOIN1, T1.T1COL1, T1.T1COL2, T1.T1COL3, T11.T1JOIN1 T11JOIN1, T11.T1COL1 T11COL1, T11.T1COL2 T11COL2, T11.T1COL3 T11COL3 FROM T1, T1 T11 WHERE T1.T1COL2=T11.T1COL2 AND T1.T1JOIN1<T11.T1JOIN1) T1, (SELECT T2.T1JOIN1, T2.T2COL1, T2.T2COL2, ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN FROM T2) T2
WHERE
T1.T1JOIN1=T2.T1JOIN1; T1JOIN1 T1COL T1COL T1COL T11JOIN1 T11CO T11CO T11CO T2COL1 T2COL RN
------- ----- ----- ----- ---------- ----- ----- ----- ------- -----
-- 1 ABC 123 Test 3 ABC 123 Test3 Fred XYZ 1 1 ABC 123 Test 3 ABC 123 Test3 Charlie YYY 2 2 DEF 456 Test2 4 DEF 456 Test4 Martha ABC 1 2 DEF 456 Test2 4 DEF 456 Test4 Jane ABC 2 The second time we join the two SELECT statements together, we will join with T11.T1JOIN1=T2.T1JOIN1. We will then full outer join the results such that T1COL1 from the one SELECT is joined with T11COL1 from the second SELECT and RN from the first SELECT is joined to RN from the second SELECT: SELECT V1.T1JOIN1, V1.T1COL1, V1.T1COL2, V1.T1COL3, V1.T2COL1, V1.T2COL2, V2.T11JOIN1, V2.T11COL1, V2.T11COL2, V2.T11COL3, V2.T2COL1, V2.T2COL2 FROM (SELECT T1.T1JOIN1, T1.T1COL1, T1.T1COL2, T1.T1COL3, T2.T1JOIN1 T2_T1JOIN1, T2.T2COL1, T2.T2COL2, T2.RN FROM (SELECT T1.T1JOIN1, T1.T1COL1, T1.T1COL2, T1.T1COL3, T11.T1JOIN1 T11JOIN1, T11.T1COL1 T11COL1, T11.T1COL2 T11COL2, T11.T1COL3 T11COL3 FROM T1, T1 T11 WHERE T1.T1COL2=T11.T1COL2 AND T1.T1JOIN1<T11.T1JOIN1) T1, (SELECT T2.T1JOIN1, T2.T2COL1, T2.T2COL2, ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN FROM T2) T2 WHERE T1.T1JOIN1=T2.T1JOIN1) V1 FULL OUTER JOIN (SELECT T1.T11JOIN1, T1.T11COL1, T1.T11COL2, T1.T11COL3, T2.T1JOIN1, T2.T2COL1, T2.T2COL2, T2.RN FROM (SELECT T1.T1JOIN1, T1.T1COL1, T1.T1COL2, T1.T1COL3, T11.T1JOIN1 T11JOIN1, T11.T1COL1 T11COL1, T11.T1COL2 T11COL2, T11.T1COL3 T11COL3 FROM T1, T1 T11 WHERE T1.T1COL2=T11.T1COL2 AND T1.T1JOIN1<T11.T1JOIN1) T1, (SELECT T2.T1JOIN1, T2.T2COL1, T2.T2COL2, ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN FROM T2) T2 WHERE T1.T11JOIN1=T2.T1JOIN1) V2 ON V1.T1COL1=V2.T11COL1 AND V1.RN=V2.RN; T1JOIN1 T1COL T1COL T1COL T2COL1 T2COL T11JOIN1 T11CO T11CO T11CO T2COL1 T2COL ---------- ----- ----- ----- ------- ----- ---------- ----- ----- ----- ------- ----- 1 ABC 123 Test Charlie YYY 3 ABC 123 Test3 Fred CCC 1 ABC 123 Test Fred XYZ 3 ABC 123 Test3 Jack EEE 3 ABC 123 Test3 Joan DDD 3 ABC 123 Test3 Rob EEE 2 DEF 456 Test2 Jane ABC 4 DEF 456 Test4 Jane FFF 2 DEF 456 Test2 Martha ABC But, we have a problem. You specified that the first three columns could not be NULL, nor could columns 7-9. Reworking the SQL statement using NVL: SELECT NVL(V1.T1JOIN1,V2.T1JOIN1) T1JOIN1, NVL(V1.T1COL1,V2.T1COL1) T1COL1, NVL(V1.T1COL2,V2.T1COL2) T1COL2, NVL(V1.T1COL3,V2.T1COL3) T1COL3, V1.T2COL1, V1.T2COL2, NVL(V2.T11JOIN1,V1.T11JOIN1) T11JOIN1, NVL(V2.T11COL1,V1.T11COL1) T11COL1, NVL(V2.T11COL2,V1.T11COL2) T11COL2, NVL(V2.T11COL3,V1.T11COL3) T11COL3, V2.T2COL1, V2.T2COL2 FROM (SELECT T1.T1JOIN1, T1.T1COL1, T1.T1COL2, T1.T1COL3, T1.T11JOIN1, T1.T11COL1, T1.T11COL2, T1.T11COL3, T2.T2COL1, T2.T2COL2, T2.RN FROM (SELECT T1.T1JOIN1, T1.T1COL1, T1.T1COL2, T1.T1COL3, T11.T1JOIN1 T11JOIN1, T11.T1COL1 T11COL1, T11.T1COL2 T11COL2, T11.T1COL3 T11COL3 FROM T1, T1 T11 WHERE T1.T1COL2=T11.T1COL2 AND T1.T1JOIN1<T11.T1JOIN1) T1, (SELECT T2.T1JOIN1, T2.T2COL1, T2.T2COL2, ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN FROM T2) T2 WHERE T1.T1JOIN1=T2.T1JOIN1) V1 FULL OUTER JOIN (SELECT T1.T1JOIN1, T1.T1COL1, T1.T1COL2, T1.T1COL3, T1.T11JOIN1, T1.T11COL1, T1.T11COL2, T1.T11COL3, T2.T2COL1, T2.T2COL2, T2.RN FROM (SELECT T1.T1JOIN1, T1.T1COL1, T1.T1COL2, T1.T1COL3, T11.T1JOIN1 T11JOIN1, T11.T1COL1 T11COL1, T11.T1COL2 T11COL2, T11.T1COL3 T11COL3 FROM T1, T1 T11 WHERE T1.T1COL2=T11.T1COL2 AND T1.T1JOIN1<T11.T1JOIN1) T1, (SELECT T2.T1JOIN1, T2.T2COL1, T2.T2COL2, ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN FROM T2) T2 WHERE T1.T11JOIN1=T2.T1JOIN1) V2 ON V1.T1COL1=V2.T11COL1 AND V1.RN=V2.RN; T1JOIN1 T1COL T1COL T1COL T2COL1 T2COL T11JOIN1 T11CO T11CO T11CO T2COL1 T2COL ------- ----- ----- ----- ------- ----- ---------- ----- ----- ----- ------------ 1 ABC 123 Test Charlie YYY 3 ABC 123 Test3 Fred CCC 1 ABC 123 Test Fred XYZ 3 ABC 123 Test3 Jack EEE 1 ABC 123 Test 3 ABC 123 Test3 Joan DDD 1 ABC 123 Test 3 ABC 123 Test3 Rob EEE 2 DEF 456 Test2 Jane ABC 4 DEF 456 Test4 Jane FFF 2 DEF 456 Test2 Martha ABC 4 DEF 456 Test4 Looks like I guessed wrong on the ORDER BY in the ROW_NUMBER analytical function, I probably should have ordered by T2COL2 - but that should be an easy fix. Charles Hooper IT Manager/Oracle DBA K&M Machine-Fabricating, Inc.Received on Sun Mar 02 2008 - 21:37:35 CST