Home » SQL & PL/SQL » SQL & PL/SQL » Selecting Partial Duplicate Records
Selecting Partial Duplicate Records [message #348731] Wed, 17 September 2008 16:15 Go to next message
aadwight
Messages: 9
Registered: September 2008
Junior Member
Hi.

I have a table which contains columns W, X, Y, Z. W is the primary key. If 2 records have the same Y and Z, those 2 records are duplicates. I need to select W from all duplicate records.

This query selects the correct records:

SELECT Y, Z, COUNT(*)
FROM my_table
GROUP BY (Y, Z)
HAVING COUNT(*) > 1

However, because the results are grouped, I can't see W. Any suggestions? Thanks in advance.

-A
Re: Selecting Partial Duplicate Records [message #348733 is a reply to message #348731] Wed, 17 September 2008 16:20 Go to previous messageGo to next message
BlackSwan
Messages: 25046
Registered: January 2009
Location: SoCal
Senior Member
select a.w, b.w
from mytable a, mytable b
where ...
Re: Selecting Partial Duplicate Records [message #348736 is a reply to message #348733] Wed, 17 September 2008 16:27 Go to previous messageGo to next message
aadwight
Messages: 9
Registered: September 2008
Junior Member
anacedent wrote on Wed, 17 September 2008 17:20
select a.w, b.w
from mytable a, mytable b
where ...


Thanks for the reply, but I'm afraid I don't understand how this answers my question. What are 'a' and 'b' in your example, and why does that query return the results I want?
Re: Selecting Partial Duplicate Records [message #348738 is a reply to message #348731] Wed, 17 September 2008 16:39 Go to previous messageGo to next message
BlackSwan
Messages: 25046
Registered: January 2009
Location: SoCal
Senior Member
http://www.orafaq.com/forum/t/88153/0/
Please read & follow posting guidelines as stated in URL above

> What are 'a' and 'b' in your example,
They are called table aliases.

>why does that query return the results I want?
Because it can & does with a correct WHERE clause
Re: Selecting Partial Duplicate Records [message #348743 is a reply to message #348738] Wed, 17 September 2008 17:22 Go to previous messageGo to next message
aadwight
Messages: 9
Registered: September 2008
Junior Member
anacedent wrote on Wed, 17 September 2008 17:39
http://www.orafaq.com/forum/t/88153/0/
Please read & follow posting guidelines as stated in URL above



Unless you're talking about using code tags (which you didn't do yourself) I don't understand what posting guidelines you think I'm violating.

Quote:

> What are 'a' and 'b' in your example,
They are called table aliases.



It's my understanding that the only purpose of an alias is to provide more readable code and results. I still have no idea what your code is actually supposed to accomplish.

Quote:

>why does that query return the results I want?
Because it can & does with a correct WHERE clause


This makes me think that you either don't understand my question, or you just don't know what you're talking about. I'm not trying to insult you, because I appreciate that you've spent time trying to help me, but I'm currently no closer to solving my problem than I was before I made my first post.
Re: Selecting Partial Duplicate Records [message #348744 is a reply to message #348731] Wed, 17 September 2008 17:30 Go to previous messageGo to next message
BlackSwan
Messages: 25046
Registered: January 2009
Location: SoCal
Senior Member
> If 2 records have the same Y and Z, those 2 records are duplicates

WHERE A.Y = B.Y AND A.Z = B.Z
AND .....
Re: Selecting Partial Duplicate Records [message #348745 is a reply to message #348731] Wed, 17 September 2008 18:14 Go to previous messageGo to next message
aadwight
Messages: 9
Registered: September 2008
Junior Member
It looks to me like you're checking to see if a record is present in each of two references to the same table. This seems like it would result in every record being flagged as a duplicate. I'm not sure what I'm missing.
Re: Selecting Partial Duplicate Records [message #348746 is a reply to message #348731] Wed, 17 September 2008 18:35 Go to previous messageGo to next message
BlackSwan
Messages: 25046
Registered: January 2009
Location: SoCal
Senior Member
>I'm not sure what I'm missing.
you are missing the criteria to compare X & Y on distinct/different rows.
Re: Selecting Partial Duplicate Records [message #348747 is a reply to message #348731] Wed, 17 September 2008 18:41 Go to previous messageGo to next message
aadwight
Messages: 9
Registered: September 2008
Junior Member
I'll check back tomorrow morning and see if this makes any more sense then. Hopefully it will.
Re: Selecting Partial Duplicate Records [message #348778 is a reply to message #348747] Wed, 17 September 2008 23:10 Go to previous messageGo to next message
navneet_sharma
Messages: 70
Registered: September 2008
Location: New Delhi, India
Member
Try this one, it should work in your case
select W from my_table a where rowid
>(select min(rowid) from my_table b where a.y=b.y and a.z=b.z
);
Re: Selecting Partial Duplicate Records [message #348783 is a reply to message #348731] Wed, 17 September 2008 23:24 Go to previous messageGo to next message
BlackSwan
Messages: 25046
Registered: January 2009
Location: SoCal
Senior Member
alternatively

SELECT A.W, B.W
FROM MY_TABLE A, MY_TABLE B
WHERE A.Y = B.Y AND A.Z = B.Z
 AND  A.W <> B.W AND A.W > B.W


This will actually give the pair of PK values which are duplicates Y & Z values
Re: Selecting Partial Duplicate Records [message #348820 is a reply to message #348783] Thu, 18 September 2008 00:37 Go to previous messageGo to next message
harrysmall3
Messages: 109
Registered: April 2008
Location: Massachusetts
Senior Member
Anacedent, that's a great approach I had overlooked. Embarrasingly I stared at it for about 10 minutes wondering how it was going to get two rows; then drew out some table data side by side in excel, a couple minutes later still puzzled - and then registered the b.W next to the a.W! Embarassed

Showing the two dups in a one row pair is excellent readability.
Something tells me, given that W is a primary key, that this self join will out perform the WHERE EXISTS against the
GROUP BY sub-select approach.

Last confession as my eyes grow weary and I will enlighten myself in the morning by writing and executing the test -
I'm seeing the A.W <> B.W as redundant and A.W > B.W sufficient.
Again, my eyes fail me,...mine to test for the morning at the terminal!

Thanx,
Harry
Re: Selecting Partial Duplicate Records [message #348823 is a reply to message #348820] Thu, 18 September 2008 00:42 Go to previous messageGo to next message
BlackSwan
Messages: 25046
Registered: January 2009
Location: SoCal
Senior Member
harrysmall3 wrote on Wed, 17 September 2008 22:37
Anacedent, that's a great approach I had overlooked. Embarrasingly I stared at it for about 10 minutes wondering how it was going to get two rows; then drew out some table data side by side in excel, a couple minutes later still puzzled - and then registered the b.W next to the a.W! Embarassed

Showing the two dups in a one row pair is excellent readability.
Something tells me, given that W is a primary key, that this self join will out perform the WHERE EXISTS against the
GROUP BY sub-select approach.

Last confession as my eyes grow weary and I will enlighten myself in the morning by writing and executing the test -
I'm seeing the A.W <> B.W as redundant and A.W > B.W sufficient.
Again, my eyes fail me,...mine to test for the morning at the terminal!

Thanx,
Harry

YES & thanks.
Re: Selecting Partial Duplicate Records [message #348923 is a reply to message #348731] Thu, 18 September 2008 05:39 Go to previous messageGo to next message
Alessandro Rossi
Messages: 166
Registered: September 2008
Location: Rome
Senior Member
just use analytics...

Not tested
select *
from (
		SELECT a.*,COUNT(*) over (partition by Y, Z) as cnt
		FROM my_table a
	)
where cnt > 1


Bye Alessandro
Re: Selecting Partial Duplicate Records [message #349012 is a reply to message #348783] Thu, 18 September 2008 10:00 Go to previous messageGo to next message
aadwight
Messages: 9
Registered: September 2008
Junior Member
anacedent wrote on Thu, 18 September 2008 00:24
alternatively

SELECT A.W, B.W
FROM MY_TABLE A, MY_TABLE B
WHERE A.Y = B.Y AND A.Z = B.Z
 AND  A.W <> B.W AND A.W > B.W


This will actually give the pair of PK values which are duplicates Y & Z values


This returns exactly what I need, and now I even understand why. Thanks for your patience.

Alessandro Rossi wrote on Thu, 18 September 2008 06:39
just use analytics...

Not tested
select *
from (
		SELECT a.*,COUNT(*) over (partition by Y, Z) as cnt
		FROM my_table a
	)
where cnt > 1


Bye Alessandro


I think this works as well, but I haven't figured out why yet. I'll have to see which one runs more quickly. Thanks again for your help, everyone.
Re: Selecting Partial Duplicate Records [message #349023 is a reply to message #348743] Thu, 18 September 2008 10:31 Go to previous messageGo to next message
BlackSwan
Messages: 25046
Registered: January 2009
Location: SoCal
Senior Member
aadwight wrote on Wed, 17 September 2008 15:22

This makes me think that you either don't understand my question, or you just don't know what you're talking about.



neither of the above is/was true

you don't know, what you don't know.

[Updated on: Thu, 18 September 2008 10:33] by Moderator

Report message to a moderator

Re: Selecting Partial Duplicate Records [message #349028 is a reply to message #349023] Thu, 18 September 2008 10:53 Go to previous messageGo to next message
aadwight
Messages: 9
Registered: September 2008
Junior Member
anacedent wrote on Thu, 18 September 2008 11:31


neither of the above is/was true

you don't know, what you don't know.


I agree with both of these statements. I don't have much practice with SQL, and probably know twice as much today as I knew yesterday, thanks to this forum. I was unsure if you understood my question because I did not understand your answer. I appreciate your help, and I apologize if I offended.
Re: Selecting Partial Duplicate Records [message #349150 is a reply to message #349028] Fri, 19 September 2008 02:13 Go to previous message
Alessandro Rossi
Messages: 166
Registered: September 2008
Location: Rome
Senior Member
The technique I used takes advantage of a special Oracle feature well explained here.

I don't know your understandings of SQL so I preferred to point you to a manual where things are explained in the right order.

If you need further documentation you may find here what you need.

The alternative presented by anacedent uses a standard join documented here and deeper here.

Bye Alessandro
Previous Topic: Escaping single inverted comma (') in oracle ststement
Next Topic: sql in sql
Goto Forum:
  


Current Time: Wed Dec 07 09:08:26 CST 2016

Total time taken to generate the page: 0.09002 seconds