Re: Question about deleting table duplicates
Date: 19 Sep 2003 06:17:40 -0700
Message-ID: <b863dacd.0309190517.500024df_at_posting.google.com>
"ScottH" <fakeaddress_at_newsgroupsonly.com> wrote in message news:<pOKdnbFXV7ane_SiU-KYvQ_at_giganews.com>...
> I was looking for thw SQL to delete dupes from a table, and came across
> this. All who saw it agreed in principle, but I can't quite figure out the
> logic. If we are deleting all rows whose rowid is greater than the least of
> the rowids returned from creating the subset of dupes, couldn't we
> inadvertently delete some non-dupes rows that were created after the last
> dupe ? I mean, any row created after the last dupe would have a greater
> rowid, wouldn't it ?
>
> Here's the SQL:
>
> delete from table_1 a
> where a.rowid >
> (select min(b.rowid)
> from table_1 b
> where b.col_dup_values = a.col_dup_values)
The condition in the subselect (b.col_dup_values = a.col_dup_values)
links the two instances (a and b) of table_1 in this SQL. The delete
statement, therefore, only affects table_1 with the condition:
(b.col_dup_values = a.col_dup_values).
This won't delete all duplicated rows, as such, but any row that is a
duplicate of a row that already exists - leaving one row where there
were several duplicates. If you wanted to do that the SQL is much
simpler (at least, simpler to follow).
DELETE FROM table_1
WHERE col_dup_values = (SELECT col_dup_values
,COUNT(*)
FROM table_1
WHERE COUNT(*)>1
GROUP BY col_dup_values)
Received on Fri Sep 19 2003 - 15:17:40 CEST