Re: sql to delete rows from joining a table to itself

From: Peter Nilsson <airia_at_acay.com.au>
Date: Mon, 29 Jun 2009 21:55:32 -0700 (PDT)
Message-ID: <ff7e760a-20d2-4491-9ff2-60d5fcdd6a1a_at_x3g2000yqa.googlegroups.com>



On Jun 29, 9:51 am, Jeff Calico <jeffcal..._at_gmail.com> wrote:
> Hello everyone
>
> I am trying to delete rows from a massive data table that
> are "almost" duplicates. I can do a nice query to find
> them when I join the table with itself, but I can't
> figure out how to form a delete statement out of this.
>
> Query:
>
> SELECT a.*
> FROM Trafficdata a, Trafficdata b
> WHERE a.id = b.id
> AND a.date = b.date

Sic?

> AND a.flag = 'Q'
> AND b.flag = 'Q'
> AND a.dstop = b.dstop
> AND (a.batch != b.batch OR (a.batch = b.batch AND
> a.pos_marker < b.pos_marker) );

Note that if there are rows with the same (id, date, flag = 'Q', dstop) tuple, but different batch numbers, then you'll get every row for that tuple.

Is that really what you want? Or do you only want to remove all but the the top pos_markers for every (id, date, flag = 'Q', dstop, batch) tuple? Or something else...?

Note also that this will potentially give you multiple copies of some rows from a.

> One idea I tried after searching the web was
> DELETE FROM ( <above query> ); but that gives
> ORA-01752 "Cannot delete from view without exactly
> one key preserved table"

Hint: A better select query might be...

  select a.*
    from trafficdata a
   where a.flag = 'Q'

     and exists
           (select 1
              from trafficdata b
             where b.id = a.id
               and b.date = a.date
               and b.flag = a.flag
               and b.dstop = a.dstop
               and (   b.batch != a.batch
                    or b.batch  = a.batch
                         and b.pos_marker > a.pos_marker));

--
Peter
Received on Mon Jun 29 2009 - 23:55:32 CDT

Original text of this message