| Oracle FAQ | Your Portal to the Oracle Knowledge Grid | |
|  |  | |||
Home -> Community -> Usenet -> c.d.o.misc -> Re: DML Needed: Flag Duplicate Records....
Sybrand's suggest of loading data and then scrubbing is the "right" one in
that it creates a pool of clean data with a minimum of fuss:
INSERT INTO clean_table
SELECT DISTINCT...
FROM dirty_table;
This method will move one copy of each record regardless of the number duplicates. Unless you *need* to save the duplicates, this is the best way to handle the problem.
If you *must* use the data where it is or if you must save the duplicates in the same table, and you can only flag duplicates, then you can use the ROWID as a tie breaker.
If most rows have duplicates, this is fastest:
UPDATE dirty_table  SET NOTE ='IS A DUPE';
COMMIT;
UPDATE dirty_table dt SET note = NULL
WHERE ROWID = (SELECT MIN(ROWID) FROM dirty_table st
                                 WHERE st.col1 = dt.col1
                                 AND ....);
If most rows do not have duplicates:
UPDATE dirty_table dt SET note = 'IS A DUPE' WHERE ROWID != (SELECT MIN(ROWID) FROM dirty_table st
                                  WHERE st.col1 = dt.col1
                                  AND ....);
This query structure could be very slow! Indexing will increase the update burden, but it may help, depending on your criteria for marking two rows as duplicates.
Again, unless you must preserve the duplicates in the same table, moving the distinct data to a table is the cleanest..
Good Luck!
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Chris Weiss mailto:chris_at_hpdbe.com www.hpdbe.com High Performance Database Engineering Available for long and short term contracts "R Chin" <rchin_at_panix.com> wrote in message news:ab6jjm$hqq$1_at_reader1.panix.com...Received on Tue May 07 2002 - 10:48:18 CDT
> I know the DML to remove duplicate records...
> but now I need to just flag them in the table (...set NOTE = 'A DUPE !')
>
> Someone has the DML to do this ?
> Thanks
>
> Rob
>
>
|  |  |