Home » SQL & PL/SQL » SQL & PL/SQL » De-Dupe using Select
icon9.gif  De-Dupe using Select [message #203827] Thu, 16 November 2006 08:27 Go to next message
Kev7481
Messages: 5
Registered: November 2006
Junior Member
Hi,

I am trying to to write some SQL to select duplicate records. I am running the follwoing SQL but I eblieve my problem is caused by the fact that I am de-duping the table against itself. I keep getting records returned twice just in a different order - see below. How can I change the select to only return 1 row rather than 2.

Unfortunately I only have select access, I do not access to create tables or delete data through SQL.

SQL STatement so far:-

select a.add_num, a.con_num, m.mem_type as Mem_Con1,
b.add_num, b.con_num, m1.mem_type as Mem_Con2
from people a left outer join memberships m
on a.con_num = m.con_num
and m.cancelled_on is null,
people b left outer join mem m1
on b.con_num = m1.con_num
and m1.cancelled_on is null
where a.con_num != b.con_num
and a.title = b.title
and a.forenames = b.forenames
and a.surname = b.surname
and a.add_num = b.add_num

Results:-

A.ADD_NUM A.CON_NUM M.MEM_CON1 B.ADD_NUM B.CON_NUM M1.MEM_CON2
94 2685823 94 2685829 A
94 2685829 A 94 2685823


Desired Result:-
A.ADD_NUM A.CON_NUM M.MEM_CON1 B.ADD_NUM B.CON_NUM M1.MEM_CON1
94 2685823 94 2685829 A


Any help would be greatly appreciated.

Thanks,
Kev
Re: De-Dupe using Select [message #203828 is a reply to message #203827] Thu, 16 November 2006 08:36 Go to previous messageGo to next message
Cthulhu
Messages: 381
Registered: September 2006
Location: UK
Senior Member
Try:

select a.add_num, a.con_num, m.mem_type as Mem_Con1,
b.add_num, b.con_num, m1.mem_type as Mem_Con2
from people a left outer join memberships m
on a.con_num = m.con_num
and m.cancelled_on is null,
people b left outer join mem m1
on b.con_num = m1.con_num
and m1.cancelled_on is null
where a.con_num != b.con_num
and a.title = b.title
and a.forenames = b.forenames
and a.surname = b.surname
and a.add_num = b.add_num
and a.rowid <> b.rowid
Re: De-Dupe using Select [message #203830 is a reply to message #203828] Thu, 16 November 2006 08:45 Go to previous messageGo to next message
Kev7481
Messages: 5
Registered: November 2006
Junior Member
Hi,

Thanks for your reply, unfortunately it returns the same result.

I did some playing with Rowid's (a.rowid||b.rowid <> b.rowid||a.rowid) but could not get it to work. Not sure if it is becuase I am returning 2 rowid's??

Kev
Re: De-Dupe using Select [message #203834 is a reply to message #203827] Thu, 16 November 2006 09:06 Go to previous messageGo to next message
Cthulhu
Messages: 381
Registered: September 2006
Location: UK
Senior Member
Looking at your output, the two records are different, so it is probably a valid result. Are you saying you only want to see one lot of membership info for each person ? If so, how do you want to select it ?
Re: De-Dupe using Select [message #203837 is a reply to message #203827] Thu, 16 November 2006 09:17 Go to previous messageGo to next message
Cthulhu
Messages: 381
Registered: September 2006
Location: UK
Senior Member
I'm a bit confused by the whole mission here. Where are the duplicates that you are trying to find ? Are they in People or memberships ?

The problem is that you have effectively an expanded table people_memberships (a join of people and members) joined to another expanded table people_memberships2 (a join of people and mem2). The key of both these tables will clearly be people_key + membership key. However, you are attempting to join the two tables on only the people key fields (surname etc). This is bound to lead to cartesian products and hence duplicates.

If the duplicates are in the people table, you should do a query which de-duplicates that before joining it to memberships.
Re: De-Dupe using Select [message #204002 is a reply to message #203827] Fri, 17 November 2006 05:15 Go to previous messageGo to next message
nachiketa.iyengar
Messages: 5
Registered: November 2006
Location: INDIA
Junior Member

Hey is there any probs in using distinct????? I was not able to go thru ur query completely but i didn find the keyword distinct....excuse me if it has been irrelevant Smile

Re: De-Dupe using Select [message #204061 is a reply to message #203827] Fri, 17 November 2006 09:27 Go to previous messageGo to next message
xxxyyyxxx
Messages: 2
Registered: November 2006
Junior Member
Why don't you simply try to change the condition

a.con_num != b.con_num

to

a.con_num < b.con_num

to avoid having each result "duplicated"
Re: De-Dupe using Select [message #204626 is a reply to message #204002] Tue, 21 November 2006 07:43 Go to previous messageGo to next message
Kev7481
Messages: 5
Registered: November 2006
Junior Member
Unfortunately distinct does not work - however xxxyyyxxx response to say change a.con_num != b.con_num to a.con_num < b.con_num has worked.

Thanks
Re: De-Dupe using Select [message #204629 is a reply to message #204061] Tue, 21 November 2006 07:44 Go to previous messageGo to next message
Kev7481
Messages: 5
Registered: November 2006
Junior Member
That worked perfectly - thanks for your help.

Kev
Re: De-Dupe using Select [message #204632 is a reply to message #203837] Tue, 21 November 2006 07:49 Go to previous message
Kev7481
Messages: 5
Registered: November 2006
Junior Member
apologies for taking so long to reply.

Changing a.con_num != b.con_num to a.con_num < b.con_num worked perfectly as suggested by xxxyyyxxx.

I have duplicate records in my DB and I need to select all the records so I can organise a merge process - to do this I had to link the table to itself and this is where I kept getting duplciate rows just the other way round e.g.

Line1
Con1 = 12345
con2 = 4567

Line2
con1 = 4567
Con2 = 12345

They are not duplicates but I only wanted to get line 1 or 2

Thansk for al lyour comment - much appreaciated.
Previous Topic: Order By Issue....
Next Topic: how to create dynamic table
Goto Forum:
  


Current Time: Thu Dec 08 18:28:19 CST 2016

Total time taken to generate the page: 0.14091 seconds