Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Usenet -> c.d.o.misc -> Re: "Seek and destroy" duplicate entries
Charles Hooper schrieb:
> Alex wrote:
> > Hello NG,
> >
> > considering the following table-structure (import-table):
> >
> > Material_Nr || Alternative_Hierarchy
> > -------------||--------------------
> > A || 10xalpha
> > A || 10xalpha
> > A || 10xalpha
> > A || 10xalpha
> > B || 20xgamma
> > B || 20xgamma
> > B || 20xgamma
> > B || 20xgamma
> > ...
> >
> > The data I've to import isn't normalized. Therefore, the original
> > datasets are inflated to nearly 140.000.
> > While I'm able to delete about 130.000 datasets by killing those, who
> > have NULL as "Alternative_Hierarchy", I've still got >8.000 datasets
> > left.
> >
> > But a lot of these are still duplicates!
> > Having only the mentioned columns, I failed to apply the method of
> > S-A-D using the MIN()/MAX() functions.
> >
> > Can anyone provide me with a solution of killing the duplicates (except
> > one) ?
> > Any help will be appreciated!
> >
> >
> > Alex Sauer
>
> The following should retrieve one row per MATERIAL_NR,
> ALTERNATIVE_HIERARCHY combination:
> SELECT
> MATERIAL_NR,
> ALTERNATIVE_HIERARCHY,
> MIN(ROWID) ROW_ID
> FROM
> MY_TABLE
> GROUP BY
> MATERIAL_NR,
> ALTERNATIVE_HIERARCHY;
>
> Using the above as a starting point:
> SELECT
> MATERIAL_NR,
> ALTERNATIVE_HIERARCHY,
> ROWID ROW_ID
> FROM
> MY_TABLE
> MINUS
> SELECT
> MATERIAL_NR,
> ALTERNATIVE_HIERARCHY,
> MIN(ROWID) ROW_ID
> FROM
> MY_TABLE
> GROUP BY
> MATERIAL_NR,
> ALTERNATIVE_HIERARCHY;
>
> The above SQL statement retrieves the MATERIAL_NR,
> ALTERNATIVE_HIERARCHY of all rows, and then eliminates all of those
> that were contained in the original query. Now to make use of the
> results:
> DELETE FROM
> MY_TABLE
> WHERE
> (MATERIAL_NR,
> ALTERNATIVE_HIERARCHY,
> ROWID)
> IN
> (SELECT
> MATERIAL_NR,
> ALTERNATIVE_HIERARCHY,
> ROWID ROW_ID
> FROM
> MY_TABLE
> MINUS
> SELECT
> MATERIAL_NR,
> ALTERNATIVE_HIERARCHY,
> MIN(ROWID) ROW_ID
> FROM
> MY_TABLE
> GROUP BY
> MATERIAL_NR,
> ALTERNATIVE_HIERARCHY);
>
> Obviously, test the results before executing the DELETE.
>
> Charles Hooper
> PC Support Specialist
> K&M Machine-Fabricating, Inc.
Thank you! I'll try that solution.
Can you tell me which one is speedier? "SELECT... MINUS SELECT ..." or
"SELECT ... WHERE ... NOT IN (SELECT...)"?
THX
Alex Sauer
Received on Mon Oct 16 2006 - 07:57:35 CDT
![]() |
![]() |