Re: Need oracle expert - SQL

From: marfi95 <marfi95_at_yahoo.com>
Date: Sun, 2 Mar 2008 16:28:01 -0800 (PST)
Message-ID: <daaa63a9-086a-4494-9d09-1ce9be5248e9@60g2000hsy.googlegroups.com>


On Mar 2, 5:39 pm, DA Morgan <damor..._at_psoug.org> wrote:
> marfi95 wrote:
> > All,
>
> > I am in need of some expert oracle advice on how to solve the
> > following problem.  Any ideas would be most helpful.
>
> > What I need to do is find duplicate rows in 2 joined tables (just
> > certain columns are checked for dup, not all) and
> > return information from both tables on a single row.  However, my
> > requirement is not just to report the duplicate data, but to report
> > other data from the two tables as well.  The problem I run into is
> > that I need to return the data on a single row back to my application
> > in a ref cursor, but table 2 can have multiple rows per table 1 based
> > on the join.  So, the original record could have more rows (because of
> > table 2) than the duplicate or the other way.  I need to report both
> > back to the user.
>
> > Here is an example of the data, what is being used in the dup check
> > and how I need the data returned to my application.  I'm open to any
> > ideas, creating temp tables on the fly, etc...  One way I thought of
> > was selecting part of the duplicate data into a temp table, then
> > joining that with the other info and pivoting it.  I'm hoping there
> > are simpler ways to do this that I'm not thinking of.  I'm not an
> > Oracle expert.  The other thing I've done is created a view that
> > contains the join of table 1 and 2 already.
>
> > Hopefully, this will make sense.
>
> > The selection criteria is t1col1, t1col2 from table 1 have to match
> > another row from table 1, but only 1 value (not all) from table2 has
> > to match any value from table 2. It doesn't matter what order they are
> > either as illustrated in the example.  I, however, need to report all
> > rows from table 2, even though only 1 matches. Crazy requirements, but
> > thats our user.
>
> > Remember table 1 and 2 are joined before the dup check.  The base
> > criteria will only use 1 and 2 as the original based on some other
> > data, so that is why 3 and 4 will not show as original records in the
> > example.
>
> > Table1:
> > ==========
>
> > T1Join1   T1Col1   T1Col2   T1Col3
> > -------       ----------   ----------    -----------
> > 1            ABC      123         Test
> > 2            DEF      456         Test2
> > 3            ABC      123         Test3
> > 4            DEF      456         Test4
>
> > Table2:
> > ===========
>
> > T2Join1 T2Col1   T2Col2
> > --------    ---------    -------
> > 1          Fred       XYZ
> > 1          Charlie   YYY
> > 2          Martha   ABC
> > 2          Jane      ABC
> > 3          Fred      CCC
> > 3          Joan      DDD
> > 3          Jack      EEE
> > 3          Rob       EEE
> > 4          Jane      FFF
>
> > Result Needed:
> > ===============
>
> > T1Join1  T1Col1  T1Col2  T1Col3  T2Col1  T2Col2  T1Join1  T1Col1
> > T1Col2 T1Col3     T2Col1      T2Col2
> > ----------   ------      ------      ------      -----
> > ------      ------      ------        ------     ------
> > ------          ------
> > 1           ABC     123       Test      Fred      XYZ      3
> > ABC       123       Test3        Fred        CCC
> > 1           ABC     123       Test      Charlie   YYY     3
> > ABC        123      Test3        Joan         DDD
> > 1           ABC     123       Test      NULL     NULL    3
> > ABC        123       Test3        Jack         EEE
> > 1           ABC     123       Test      NULL     NULL    3
> > ABC        123       Test3        Rob          EEE
> > 2           DEF     456       Test2     Martha  ABC      4
> > DEF        456       Test4        Jane         FFF
> > 2           DEF     456       Test2     Jane      ABC     4
> > DEF        456       Test4        NULL        NULL
>
> > Hopefully this makes sense and people will take this as a challenge !
>
> > Thanks !
> > Marc
>
> You will need to do your own homework ... but we can help with hints
> where appropriate.
>
> Hint 1: In what version of the product (to 3+ decimal places)?
>
> Hint 2: Your column names are a clear violation of basic normalization.
> Get help from your instructor on relational table design.
>
> Hint 3: After you have redesigned this mess give it an honest effort
> and then, if you still can't figure it out, post your table design:
> Give us the DDL. Give us insert statements to create the test data,
> and show us your best attempt and explain why it isn't doing what you
> think it should do.
>
> Best wishes with your studies.
>
> PS: If you don't want to anger those who might help you ... post to
> one, and only one, group.
> --
> Daniel A. Morgan
> Oracle Ace Director & Instructor
> University of Washington
> damor..._at_x.washington.edu (replace x with u to respond)
> Puget Sound Oracle Users Groupwww.psoug.org- Hide quoted text -
>
> - Show quoted text -

This is not homework, I wish it was that simple. I've been out of school for many years now. This is for my job, I just haven't used oracle that much. I realize the names are not very good, but I just made up names and data in an attempt to illustrate the scenario. It's really a loan/customer scenario. This is for 10g.

I know the join columns between table 1 and 2 should really be the same name, I just overlooked that. Other than that, nothing wrong with the column names. The issue I am having is because table 2 can have multiple rows per table 1 key.

I need to report any duplicate data on Table 1 (along with its respective Table 2 data). Problem is I need to return both duplicates on a single row returned to the application. I dont feel like I'm explaining this very well.

The result row I pass back to my application needs to contain data from Table 1 and 2 (for both the original and duplicate). If table 2 has has more rows for the duplicate than the original, I need to have NULLS in the row for the table 2 fields for the original (whereas the table 2 fields would have the correct data) and vice versa (if table 2 has more rows for the original, then the NULLS need to be in the table 2 fields for the duplicate).

I have already created a view to simplify some of this. the view is:

create or replace view vw_test
as
select * from table1 t1, table2 t2 where t1.key=t2.key (note I used "key" as the name of the join column this time)

Then the sql I have is:

select distinct

     v1.key, v1.t1col1, v1.t1col2, v1.t1col3, v1.t2col1, v1.t2col2,
     v2.key, v2.t1col1, v2.t1col2, v2.t1col3, v2.t2col1, v2.t2col2
from    vw_test v1, vvw_test v2
where      v1.key <> v2.key
and         v1.t1col1 = v2.t1col1
and         v1.t1col2 = v2.t1col2
and         v1.t2col1 in (select t2col1 from vw_test
                                 where t1col1 = v1.t1col1)

The result I get is if table 2 has more rows for the original than the original, then the table 2 fields for the duplicate are duplicated on each row.
I was able to get rid of the duplicates using the analytic LAG function and decode, but I dont think that rights because it could be valid for the same column on the previous row to be the same value.

Probably didn't explain very well, but the example I provided explains it better. Received on Sun Mar 02 2008 - 18:28:01 CST

Original text of this message