Re: sql to delete rows from joining a table to itself
Date: Mon, 29 Jun 2009 21:55:32 -0700 (PDT)
Message-ID: <ff7e760a-20d2-4491-9ff2-60d5fcdd6a1a_at_x3g2000yqa.googlegroups.com>
On Jun 29, 9:51 am, Jeff Calico <jeffcal..._at_gmail.com> wrote:
> Hello everyone
>
> I am trying to delete rows from a massive data table that
> are "almost" duplicates. I can do a nice query to find
> them when I join the table with itself, but I can't
> figure out how to form a delete statement out of this.
>
> Query:
>
> SELECT a.*
> FROM Trafficdata a, Trafficdata b
> WHERE a.id = b.id
> AND a.date = b.date
Sic?
> AND a.flag = 'Q'
> AND b.flag = 'Q'
> AND a.dstop = b.dstop
> AND (a.batch != b.batch OR (a.batch = b.batch AND
> a.pos_marker < b.pos_marker) );
Note that if there are rows with the same (id, date, flag = 'Q', dstop) tuple, but different batch numbers, then you'll get every row for that tuple.
Is that really what you want? Or do you only want to remove all but the the top pos_markers for every (id, date, flag = 'Q', dstop, batch) tuple? Or something else...?
Note also that this will potentially give you multiple copies of some rows from a.
> One idea I tried after searching the web was
> DELETE FROM ( <above query> ); but that gives
> ORA-01752 "Cannot delete from view without exactly
> one key preserved table"
Hint: A better select query might be...
select a.*
from trafficdata a
where a.flag = 'Q'
and exists (select 1 from trafficdata b where b.id = a.id and b.date = a.date and b.flag = a.flag and b.dstop = a.dstop and ( b.batch != a.batch or b.batch = a.batch and b.pos_marker > a.pos_marker));
-- PeterReceived on Mon Jun 29 2009 - 23:55:32 CDT