Re: sql to delete rows from joining a table to itself
From: Shakespeare <whatsin_at_xs4all.nl>
Date: Tue, 30 Jun 2009 17:00:45 +0200
Message-ID: <4a4a28a7$0$188$e4fe514c_at_news.xs4all.nl>
Mark D Powell schreef:
>> On Jun 28, 7:51 pm, Jeff Calico <jeffcal..._at_gmail.com> wrote:
>>
>>
>>
>>
>>
>>> Hello everyone
>>> I am trying to delete rows from a massive data table that are "almost"
>>> duplicates. I can do a nice query
>>> to find them when I join the table with itself, but I can't figure out
>>> how to form a delete statement out
>>> of this.
>>> Query:
>>> SELECT a.*
>>> FROM Trafficdata a, Trafficdata b
>>> WHERE a.id = b.id
>>> AND a.date = b.date
>>> AND a.flag = 'Q'
>>> AND b.flag = 'Q'
>>> AND a.dstop = b.dstop
>>> AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker <
>>> b.pos_marker) );
>>> One idea I tried after searching the web was
>>> DELETE FROM ( <above query> ); but that gives
>>> ORA-01752 "Cannot delete from view without exactly one key preserved
>>> table"
>>> Any help would be greatly appreciated!
>>> Jeff C.
>> Do you have a primary key on this Trafficdata table?
>> Assuming your results are the rows your really want to delete then a
>> delete of the form:
>>
>> DELETE FROM Trafficdata WHERE pk IN (SELECT a.pk from ...) ;
>>
>> I suggest you do this AFTER doing a backup.
>>
>> Ed- Hide quoted text -
>>
>> - Show quoted text -
Date: Tue, 30 Jun 2009 17:00:45 +0200
Message-ID: <4a4a28a7$0$188$e4fe514c_at_news.xs4all.nl>
Mark D Powell schreef:
> On Jun 29, 8:03 am, Ed Prochak <edproc..._at_gmail.com> wrote:
>> On Jun 28, 7:51 pm, Jeff Calico <jeffcal..._at_gmail.com> wrote:
>>
>>
>>
>>
>>
>>> Hello everyone
>>> I am trying to delete rows from a massive data table that are "almost"
>>> duplicates. I can do a nice query
>>> to find them when I join the table with itself, but I can't figure out
>>> how to form a delete statement out
>>> of this.
>>> Query:
>>> SELECT a.*
>>> FROM Trafficdata a, Trafficdata b
>>> WHERE a.id = b.id
>>> AND a.date = b.date
>>> AND a.flag = 'Q'
>>> AND b.flag = 'Q'
>>> AND a.dstop = b.dstop
>>> AND (a.batch != b.batch OR (a.batch = b.batch AND a.pos_marker <
>>> b.pos_marker) );
>>> One idea I tried after searching the web was
>>> DELETE FROM ( <above query> ); but that gives
>>> ORA-01752 "Cannot delete from view without exactly one key preserved
>>> table"
>>> Any help would be greatly appreciated!
>>> Jeff C.
>> Do you have a primary key on this Trafficdata table?
>> Assuming your results are the rows your really want to delete then a
>> delete of the form:
>>
>> DELETE FROM Trafficdata WHERE pk IN (SELECT a.pk from ...) ;
>>
>> I suggest you do this AFTER doing a backup.
>>
>> Ed- Hide quoted text -
>>
>> - Show quoted text -
> > Assuming you want to keep one copy of each row the following will work > > delete from table_name > where (key_list, rowid) in > ( select keys, rowid from table_name > minus > select keys, min(rowid) from table_name > group by keys > ) > / > > HTH -- Mark D Powell --
Looks like you are skipping the condition that the record with the largest pos_marker should be preserved, not the lowest rowid. Furthermore, if a.batch != b.batch, both records would be deleted in the original query (if it worked).
And is there no mutating table problem here? Won't this lead to 'snapshot too old' problems with large tables? (I'm not sure about that)
Shakespeare Received on Tue Jun 30 2009 - 10:00:45 CDT