Re: sql to delete rows from joining a table to itself

From: Shakespeare <whatsin_at_xs4all.nl>
Date: Tue, 30 Jun 2009 07:25:24 +0200
Message-ID: <4a49a1cd$0$198$e4fe514c_at_news.xs4all.nl>



Peter Nilsson schreef:
> On Jun 29, 9:51 am, Jeff Calico <jeffcal..._at_gmail.com> wrote:
>> Hello everyone
>>
>> I am trying to delete rows from a massive data table that
>> are "almost" duplicates. I can do a nice query to find
>> them when I join the table with itself, but I can't
>> figure out how to form a delete statement out of this.
>>
>> Query:
>>
>> SELECT a.*
>> FROM Trafficdata a, Trafficdata b
>> WHERE a.id = b.id
>> AND a.date = b.date

>
> Sic?
>
>> AND a.flag = 'Q'
>> AND b.flag = 'Q'
>> AND a.dstop = b.dstop
>> AND (a.batch != b.batch OR (a.batch = b.batch AND
>> a.pos_marker < b.pos_marker) );

>
> Note that if there are rows with the same (id, date,
> flag = 'Q', dstop) tuple, but different batch numbers,
> then you'll get every row for that tuple.
>
> Is that really what you want? Or do you only want to
> remove all but the the top pos_markers for every (id,
> date, flag = 'Q', dstop, batch) tuple? Or something
> else...?
>
> Note also that this will potentially give you multiple
> copies of some rows from a.
>
>> One idea I tried after searching the web was
>> DELETE FROM ( <above query> ); but that gives
>> ORA-01752 "Cannot delete from view without exactly
>> one key preserved table"

>
> Hint: A better select query might be...
>
> select a.*
> from trafficdata a
> where a.flag = 'Q'
> and exists
> (select 1
> from trafficdata b
> where b.id = a.id
> and b.date = a.date
> and b.flag = a.flag
> and b.dstop = a.dstop
> and ( b.batch != a.batch
> or b.batch = a.batch
> and b.pos_marker > a.pos_marker));
>
> --
> Peter

I took a quick look, but if there are corresponding rows on (id, date, flag, dstop) where a.batch <> b.batch it looks like BOTH rows will be deleted. I don't know if that is what you really wanted...

Shakespeare Received on Tue Jun 30 2009 - 00:25:24 CDT

Original text of this message