Re: Duplicate Rows

From: Serge Rielau <srielau_at_ca.ibm.com>
Date: Sat, 26 Jul 2008 09:46:05 -0400
Message-ID: <6f0o51F99lkdU1@mid.individual.net>


David Portas wrote:

> <artmerar_at_yahoo.com> wrote in message 
> news:67fb7670-5142-440f-9cc3-33c9ee914ebc_at_c65g2000hsa.googlegroups.com...

>> Hi,
>>
>> I figured I'd ask this too. I know how to remove duplicate rows from
>> a table, but this example used an analytical function. I'm new to
>> these and do not get a huge chance to use them, so I'm always
>> forgetting how they work.
>>
>> Please see this:
>>
>> select * from customer
>> where rowid in
>> (select rowid from
>> (select
>> rowid,
>> row_number()
>> over
>> (partition by customer_id order by customer_id ) dup
>> from customer)
>> where dup > 1);
>>
>> Ok, that did show me the duplicates, but what if a duplicate was made
>> up of more than 1 column? Can I still use an analytical function?
>>
>> This is where I might learn about these things......
>>
>> Thank you.
> 
> The OVER clause accepts a list of columns:
> 
>   SELECT * FROM
>    (SELECT customer_id, col2, col3,
>      ROW_NUMBER()
>     OVER
>      (PARTITION BY customer_id, col2, col3
>       ORDER BY customer_id, col2, col3 ) dup
>     FROM customer) t
>   WHERE dup > 1;
> 

The ORDER BY does nothing here.
You use the ORDER BY if there is a specific row that you want to keep. (Rows are the same except that some are better... ) Let's say the "newest": ROW_NUMBER() OVER (PARTITION BY itemid ORDER BY updatestamp DESC) to give you the last version of a row.

Cheers
Serge

-- 
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Received on Sat Jul 26 2008 - 08:46:05 CDT

Original text of this message