Re: Retrieval of duplicates, possible??

From: Mark D Powell <markp7832_at_my-deja.com>
Date: 2000/05/25
Message-ID: <8gjc2d$8rj$1_at_nnrp1.deja.com>#1/1


In article <u7snv7yqsc.fsf_at_o2-3.ebi.ac.uk>,   Philip Lijnzaad <lijnzaad_at_ebi.ac.uk> wrote:
> On Wed, 24 May 2000 18:26:48 GMT,
> "Al" == nbirvine <nbirvine_at_my-deja.com> writes:
>
> Al> Hello,
> Al> I have two DEEP tables (>3 millions rows each) called POLICY and
> Al> COMMISSION_POLICY and in order to fix a data problem, I need to
> Al> identify all the duplicate policy numbers in both tables.
>
> Al> By doing SELECT POLICY NUMBERS from POLICY UNION SELECT POLICY
 NUMBERS
> Al> from COMMISSION_POLICY, I will create a 3rd table containing the
 result
> Al> set common to both tables, thereby EXCLUDING duplicates.
>
> Al> I want to do the opposite: the result of my query should contain
> Al> nothing but duplicate policy numbers found in both tables.
 Logically,
> Al> it seems that I would need to use a MINUS operator of some sort.
>
> No: both MINUS and UNION are proper set operations, so they will
 exclude any
> duplicates. So { a, a, b, c } - { a, b, c} = { b, c}, which is not
 what you
> want.
>
> Al> FYI, I am using MS SQLServer 7.0.
>
> Let's hope it can do this:
>
> SELECT *
> FROM Policy
> WHERE policy_number IN
> (SELECT policy_number FROM Policy GROUP BY policy_number HAVING
 COUNT(*) > 1);
>
> This selects all the duplicate rows of table Policy; similar trick
 goes for
> the other one. To get rid of duplicates, you'll have to fall back on
 vendor
> specific stuff; usally, rows have an 'invisible' row identifier that
 you can
> use to distinguish between the two. If two rows are really identical,
> following will do the trick:
>
> DELETE From Policy as P
> WHERE P.rowid > (SELECT MIN(Q.rowid)
> FROM Policy as Q
> WHERE Q.policy_number = P.policy_number)
> -- AND further criteria.
>
> 'rowid' is the vendor(Oracle) specific 'invisible' row identifier.
 Don't
> forget to set up the appropriate primary key or uniqueness
 constraints when
> you're done.
>
> Cheers,
>
>
 Philip
>

I think this is a good solution and would probably work as well or better than what I want to mention. The original post mentioned union and minus. Isn't it asking for what the third set operator is designed to do, namely, intersect, which shows members common to both sets.

--
Mark D. Powell  -- The only advice that counts is the advice that
 you follow so follow your own advice --


Sent via Deja.com http://www.deja.com/
Before you buy.
Received on Thu May 25 2000 - 00:00:00 CEST

Original text of this message