Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.misc -> Re: "Seek and destroy" duplicate entries
Alex wrote:
> Hello NG,
>
> considering the following table-structure (import-table):
>
> Material_Nr || Alternative_Hierarchy
> -------------||--------------------
> A || 10xalpha
> A || 10xalpha
> A || 10xalpha
> A || 10xalpha
> B || 20xgamma
> B || 20xgamma
> B || 20xgamma
> B || 20xgamma
> ...
>
> The data I've to import isn't normalized. Therefore, the original
> datasets are inflated to nearly 140.000.
> While I'm able to delete about 130.000 datasets by killing those, who
> have NULL as "Alternative_Hierarchy", I've still got >8.000 datasets
> left.
>
> But a lot of these are still duplicates!
> Having only the mentioned columns, I failed to apply the method of
> S-A-D using the MIN()/MAX() functions.
>
> Can anyone provide me with a solution of killing the duplicates (except
> one) ?
> Any help will be appreciated!
>
>
> Alex Sauer
The following should retrieve one row per MATERIAL_NR,
ALTERNATIVE_HIERARCHY combination:
SELECT
MATERIAL_NR,
ALTERNATIVE_HIERARCHY,
MIN(ROWID) ROW_ID
FROM
MY_TABLE
GROUP BY
MATERIAL_NR,
ALTERNATIVE_HIERARCHY;
Using the above as a starting point:
SELECT
MATERIAL_NR,
ALTERNATIVE_HIERARCHY,
ROWID ROW_ID
FROM
MY_TABLE
MINUS
SELECT
MATERIAL_NR,
ALTERNATIVE_HIERARCHY,
MIN(ROWID) ROW_ID
FROM
MY_TABLE
GROUP BY
MATERIAL_NR,
ALTERNATIVE_HIERARCHY;
The above SQL statement retrieves the MATERIAL_NR,
ALTERNATIVE_HIERARCHY of all rows, and then eliminates all of those
that were contained in the original query. Now to make use of the
results:
DELETE FROM
MY_TABLE
WHERE
(MATERIAL_NR,
ALTERNATIVE_HIERARCHY,
ROWID)
IN
(SELECT
MATERIAL_NR,
ALTERNATIVE_HIERARCHY,
ROWID ROW_ID
FROM
MY_TABLE
MINUS
SELECT
MATERIAL_NR,
ALTERNATIVE_HIERARCHY,
MIN(ROWID) ROW_ID
FROM
MY_TABLE
GROUP BY
MATERIAL_NR,
ALTERNATIVE_HIERARCHY);
Obviously, test the results before executing the DELETE.
Charles Hooper
PC Support Specialist
K&M Machine-Fabricating, Inc.
Received on Mon Oct 16 2006 - 07:42:52 CDT