Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.misc -> Re: Help, delete dulications

Re: Help, delete dulications

From: Randy DeWoolfson <randyd_at_cais.com>
Date: 1997/03/13
Message-ID: <33288D56.5640@cais.com>#1/1

Mark Rosenbaum wrote:
>
> In article <5g6k08$sje$1_at_spasmolytic.openix.com>, <shu_at_openix.com> wrote:
> >Hi, all:
> >
> >This is the second time I post this message, I guess I didn't
> >make my question clear in the first time. Anyway, here is my
> >question:
> >
> >My application will join a Customer table with a Contact Table
> >and then create a new table. Since the there are multiple
> >entries(rows) in the Contact table can exist for the same
> >customer, the new table will be duplicated on some fields from
> >Customer table but the entire row still remain unique.
> >
> >What do I want is to remove those rows contain duplicated
> >fields so that the new table will contains only one row
> >for each customer. What is the fastest way to delete such kind
> >of duplications? Thank in advance.
> >
> >
> >Shaojie Hu
> >
>
> I suspect that there is more here than meets the eye. First
> Customers are generally name & address so there may be dups
> that have different address. Second what additional information
> besides the unique customer info are you planning on using for
> the new table and what algorithm are you going to use to determine
> which set of fields you are going to use?
>
> These issues will have some impact on how you dedup your tables.
>
> Hope this helps
>
> Mark Rosenbaum Otey-Rosenbaum & Frazier, Inc.
> mjr_at_netcom.com Consultants in High Performance and
> (303) 727-7956 Scalable Computing and Applications
> POB 1397 ftp://ftp.netcom.com/pub/mj/mjr/resume/
> Boulder CO 80306

Some hints without code:

Write a query to get all records, then MINUS operator, then distinct records with MIN rowids.

something like this:

DELETE d WHERE a,b,c IN
SELECT a,b,c FROM d MINUS ( select a,b,c,min(rowid) from d(2) GFROUP BY a,b,c )

or better yet, implement constraints on the new table before the update, and let the bad guys fail to load in the first place...

good luck

Randy :) Received on Thu Mar 13 1997 - 00:00:00 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US