Re: Need oracle expert - SQL
Date: Sun, 2 Mar 2008 20:31:18 -0800 (PST)
Message-ID: <66a7793c-168c-4351-a74f-998086ffaed1@o77g2000hsf.googlegroups.com>
On Mar 2, 9:51 pm, marfi95 <marf..._at_yahoo.com> wrote:
> On Mar 2, 9:37 pm, Charles Hooper <hooperc2..._at_yahoo.com> wrote:
>
>
>
> > On Mar 2, 4:18 pm, marfi95 <marf..._at_yahoo.com> wrote:
>
> > > All,
>
> > > I am in need of some expert oracle advice on how to solve the
> > > following problem. Any ideas would be most helpful.
>
> > > What I need to do is find duplicate rows in 2 joined tables (just
> > > certain columns are checked for dup, not all) and
> > > return information from both tables on a single row. However, my
> > > requirement is not just to report the duplicate data, but to report
> > > other data from the two tables as well. The problem I run into is
> > > that I need to return the data on a single row back to my application
> > > in a ref cursor, but table 2 can have multiple rows per table 1 based
> > > on the join. So, the original record could have more rows (because of
> > > table 2) than the duplicate or the other way. I need to report both
> > > back to the user.
>
> > > Here is an example of the data, what is being used in the dup check
> > > and how I need the data returned to my application. I'm open to any
> > > ideas, creating temp tables on the fly, etc... One way I thought of
> > > was selecting part of the duplicate data into a temp table, then
> > > joining that with the other info and pivoting it. I'm hoping there
> > > are simpler ways to do this that I'm not thinking of. I'm not an
> > > Oracle expert. The other thing I've done is created a view that
> > > contains the join of table 1 and 2 already.
>
> > > Hopefully, this will make sense.
>
> > > The selection criteria is t1col1, t1col2 from table 1 have to match
> > > another row from table 1, but only 1 value (not all) from table2 has
> > > to match any value from table 2. It doesn't matter what order they are
> > > either as illustrated in the example. I, however, need to report all
> > > rows from table 2, even though only 1 matches. Crazy requirements, but
> > > thats our user.
>
> > > Remember table 1 and 2 are joined before the dup check. The base
> > > criteria will only use 1 and 2 as the original based on some other
> > > data, so that is why 3 and 4 will not show as original records in the
> > > example.
>
> > > Table1:
> > > ==========
>
> > > T1Join1 T1Col1 T1Col2 T1Col3
> > > ------- ---------- ---------- -----------
> > > 1 ABC 123 Test
> > > 2 DEF 456 Test2
> > > 3 ABC 123 Test3
> > > 4 DEF 456 Test4
>
> > > Table2:
> > > ===========
>
> > > T2Join1 T2Col1 T2Col2
> > > -------- --------- -------
> > > 1 Fred XYZ
> > > 1 Charlie YYY
> > > 2 Martha ABC
> > > 2 Jane ABC
> > > 3 Fred CCC
> > > 3 Joan DDD
> > > 3 Jack EEE
> > > 3 Rob EEE
> > > 4 Jane FFF
>
> > > Result Needed:
> > > ===============
>
> > > T1Join1 T1Col1 T1Col2 T1Col3 T2Col1 T2Col2 T1Join1 T1Col1
> > > T1Col2 T1Col3 T2Col1 T2Col2
> > > ---------- ------ ------ ------ -----
> > > ------ ------ ------ ------ ------
> > > ------ ------
> > > 1 ABC 123 Test Fred XYZ 3
> > > ABC 123 Test3 Fred CCC
> > > 1 ABC 123 Test Charlie YYY 3
> > > ABC 123 Test3 Joan DDD
> > > 1 ABC 123 Test NULL NULL 3
> > > ABC 123 Test3 Jack EEE
> > > 1 ABC 123 Test NULL NULL 3
> > > ABC 123 Test3 Rob EEE
> > > 2 DEF 456 Test2 Martha ABC 4
> > > DEF 456 Test4 Jane FFF
> > > 2 DEF 456 Test2 Jane ABC 4
> > > DEF 456 Test4 NULL NULL
>
> > > Hopefully this makes sense and people will take this as a challenge !
>
> > > Thanks !
> > > Marc
>
> > This looks complicated. It would have been helpful if you had
> > included the DDL and DML:
> > CREATE TABLE T1(
> > T1JOIN1 NUMBER(10),
> > T1COL1 VARCHAR2(5),
> > T1COL2 VARCHAR2(5),
> > T1COL3 VARCHAR2(5));
>
> > T1Join1 T1Col1 T1Col2 T1Col3
> > ------- -------- ------- -----------
> > INSERT INTO T1 VALUES (1,'ABC','123','Test');
> > INSERT INTO T1 VALUES (2,'DEF','456','Test2');
> > INSERT INTO T1 VALUES (3,'ABC','123','Test3');
> > INSERT INTO T1 VALUES (4,'DEF','456','Test4');
>
> > CREATE TABLE T2(
> > T1JOIN1 NUMBER(10),
> > T2COL1 VARCHAR2(7),
> > T2COL2 VARCHAR2(5));
>
> > INSERT INTO T2 VALUES (1,'Fred','XYZ');
> > INSERT INTO T2 VALUES (1,'Charlie','YYY');
> > INSERT INTO T2 VALUES (2,'Martha','ABC');
> > INSERT INTO T2 VALUES (2,'Jane','ABC');
> > INSERT INTO T2 VALUES (3,'Fred','CCC');
> > INSERT INTO T2 VALUES (3,'Joan','DDD');
> > INSERT INTO T2 VALUES (3,'Jack','EEE');
> > INSERT INTO T2 VALUES (3,'Rob','EEE');
> > INSERT INTO T2 VALUES (4,'Jane','FFF');
>
> > COMMIT;
>
> > First, a test to see if we can fix up table T1 so that the rows that
> > are related by T1COL are joined together. We will reference the T1
> > table twice, giving it an alias for the second reference:
> > SELECT
> > T1.T1JOIN1,
> > T1.T1COL1,
> > T1.T1COL2,
> > T1.T1COL3,
> > T11.T1JOIN1 T11JOIN1,
> > T11.T1COL1 T11COL1,
> > T11.T1COL2 T11COL2,
> > T11.T1COL3 T11COL3
> > FROM
> > T1,
> > T1 T11
> > WHERE
> > T1.T1COL2=T11.T1COL2
> > AND T1.T1JOIN1<T11.T1JOIN1;
>
> > T1JOIN1 T1COL T1COL T1COL T11JOIN1 T11CO T11CO T11CO
> > ---------- ----- ----- ----- ---------- ----- ----- -----
> > 1 ABC 123 Test 3 ABC 123 Test3
> > 2 DEF 456 Test2 4 DEF 456 Test4
>
> > Now, let's look at table T2. It would be helpful if there were a
> > counter that counts up from 1 for each of the rows with the same value
> > for T1JOIN1, and I elected to order the rows by T2COL1:
> > SELECT
> > T2.T1JOIN1,
> > T2.T2COL1,
> > T2.T2COL2,
> > ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
> > FROM
> > T2;
>
> > T1JOIN1 T2COL1 T2COL RN
> > ---------- ------- ----- --
> > 1 Charlie YYY 1
> > 1 Fred XYZ 2
> > 2 Jane ABC 1
> > 2 Martha ABC 2
> > 3 Fred CCC 1
> > 3 Jack EEE 2
> > 3 Joan DDD 3
> > 3 Rob EEE 4
> > 4 Jane FFF 1
>
> > Now we have a bit of a problem in that we need to join these two
> > SELECT statements together twice with different join conditions, and
> > then outer join the results. The first of the two joins looks like
> > this:
> > SELECT
> > T1.T1JOIN1,
> > T1.T1COL1,
> > T1.T1COL2,
> > T1.T1COL3,
> > T1.T11JOIN1,
> > T1.T11COL1,
> > T1.T11COL2,
> > T1.T11COL3,
> > T2.T2COL1,
> > T2.T2COL2,
> > T2.RN
> > FROM
> > (SELECT
> > T1.T1JOIN1,
> > T1.T1COL1,
> > T1.T1COL2,
> > T1.T1COL3,
> > T11.T1JOIN1 T11JOIN1,
> > T11.T1COL1 T11COL1,
> > T11.T1COL2 T11COL2,
> > T11.T1COL3 T11COL3
> > FROM
> > T1,
> > T1 T11
> > WHERE
> > T1.T1COL2=T11.T1COL2
> > AND T1.T1JOIN1<T11.T1JOIN1) T1,
> > (SELECT
> > T2.T1JOIN1,
> > T2.T2COL1,
> > T2.T2COL2,
> > ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
> > FROM
> > T2) T2
> > WHERE
> > T1.T1JOIN1=T2.T1JOIN1;
>
> > T1JOIN1 T1COL T1COL T1COL T11JOIN1 T11CO T11CO T11CO T2COL1 T2COL
> > RN
> > ------- ----- ----- ----- ---------- ----- ----- ----- ------- -----
> > --
> > 1 ABC 123 Test 3 ABC 123 Test3 Fred XYZ
> > 1
> > 1 ABC 123 Test 3 ABC 123 Test3 Charlie YYY
> > 2
> > 2 DEF 456 Test2 4 DEF 456 Test4 Martha ABC
> > 1
> > 2 DEF 456 Test2 4 DEF 456 Test4 Jane ABC
> > 2
>
> > The second time we join the two SELECT statements together, we will
> > join with T11.T1JOIN1=T2.T1JOIN1. We will then full outer join the
> > results such that T1COL1 from the one SELECT is joined with T11COL1
> > from the second SELECT and RN from the first SELECT is joined to RN
> > from the second SELECT:
> > SELECT
> > V1.T1JOIN1,
> > V1.T1COL1,
> > V1.T1COL2,
> > V1.T1COL3,
> > V1.T2COL1,
> > V1.T2COL2,
> > V2.T11JOIN1,
> > V2.T11COL1,
> > V2.T11COL2,
> > V2.T11COL3,
> > V2.T2COL1,
> > V2.T2COL2
> > FROM
> > (SELECT
> > T1.T1JOIN1,
> > T1.T1COL1,
> > T1.T1COL2,
> > T1.T1COL3,
> > T2.T1JOIN1 T2_T1JOIN1,
> > T2.T2COL1,
> > T2.T2COL2,
> > T2.RN
> > FROM
> > (SELECT
> > T1.T1JOIN1,
> > T1.T1COL1,
> > T1.T1COL2,
> > T1.T1COL3,
> > T11.T1JOIN1 T11JOIN1,
> > T11.T1COL1 T11COL1,
> > T11.T1COL2 T11COL2,
> > T11.T1COL3 T11COL3
> > FROM
> > T1,
> > T1 T11
> > WHERE
> > T1.T1COL2=T11.T1COL2
> > AND T1.T1JOIN1<T11.T1JOIN1) T1,
> > (SELECT
> > T2.T1JOIN1,
> > T2.T2COL1,
> > T2.T2COL2,
> > ROW_NUMBER() OVER (PARTITION BY T2.T1JOIN1 ORDER BY T2COL1) RN
> > FROM
> > T2) T2
> > WHERE
> > T1.T1JOIN1=T2.T1JOIN1) V1
> > FULL OUTER JOIN
> > (SELECT
> > T1.T11JOIN1,
> > T1.T11COL1,
> > T1.T11COL2,
> > T1.T11COL3,
> > T2.T1JOIN1,
> > T2.T2COL1,
> > T2.T2COL2,
> > T2.RN
> > FROM
> > (SELECT
> > T1.T1JOIN1,
> > T1.T1COL1,
> > T1.T1COL2,
> > T1.T1COL3,
> > T11.T1JOIN1 T11JOIN1,
> > T11.T1COL1 T11COL1,
> > T11.T1COL2 T11COL2,
>
> ...
>
> read more »- Hide quoted text -
>
> - Show quoted text -
Charles,
Can you please explain why this line is needed ?
AND T1.T1JOIN1<T11.T1JOIN1;
In the real application data, the join column will be numeric, but I dont think I can guarantee the order of it.
Also, not sure if you noticed, but I have created a view to already join T1 and T2 together, so would that make your SQL less complex or will it make it more complex ? This doesn't have to be one sql statement, I dont have a problem creating any temp tables that might make it easier.
Again, thanks for the reply ! Received on Sun Mar 02 2008 - 22:31:18 CST