Home » SQL & PL/SQL » SQL & PL/SQL » remove duplicates in a query
remove duplicates in a query [message #218269] Wed, 07 February 2007 10:03 Go to next message
deahayes3
Messages: 203
Registered: May 2006
Senior Member

I have a query I have created, how ever is has too many duplicates, like 50k, I added a DISTINCT to one of the variables and now its like 25k, what else can I use to remove all these duplicates it should have less than 1000 records. Heres my code. Thanks for any help.

select distinct(p.project_id), p.segment1, p.name, I.INVENTORY_ITEM_ID, I.SEGMENT1, M.PROJECT_ID, M.INVENTORY_ITEM_ID, M.PROJECT_NUMBER ,
a.task_id, a.project_id, A.TASK_NAME, A.TASK_NUMBER, D.PROJECT_ID, D.PROJECT_NUMBER, D.PROJECT_NAME
from SYSTEM_ITEMS I, SCHEDULE_ITEM_DATES_V M, pa.pa_projects_all p, pa.pa_tasks a , DISCRETE_JOBS_V D
where p.project_id = a.project_id and P.PROJECT_ID = D.PROJECT_ID
AND m.project_id = p.project_id and I.INVENTORY_ITEM_ID = M.INVENTORY_ITEM_ID and
p.project_id = '165003' AND P.SEGMENT1 = '1018101'
Re: remove duplicates in a query [message #218275 is a reply to message #218269] Wed, 07 February 2007 10:16 Go to previous messageGo to next message
BlackSwan
Messages: 25046
Registered: January 2009
Location: SoCal
Senior Member
Learn to use GOOGLE!
Re: remove duplicates in a query [message #218292 is a reply to message #218275] Wed, 07 February 2007 11:18 Go to previous messageGo to next message
deahayes3
Messages: 203
Registered: May 2006
Senior Member

Wow, thanks for your help. I didnt find anything useful on Google thats why I posted.
Re: remove duplicates in a query [message #218294 is a reply to message #218292] Wed, 07 February 2007 11:29 Go to previous messageGo to next message
ebrian
Messages: 2794
Registered: April 2006
Senior Member
Remove the parentheses around p.project_id.
Re: remove duplicates in a query [message #218298 is a reply to message #218294] Wed, 07 February 2007 11:48 Go to previous messageGo to next message
deahayes3
Messages: 203
Registered: May 2006
Senior Member

Thanks for responding, but nothing changed,still have same results.
Re: remove duplicates in a query [message #218304 is a reply to message #218269] Wed, 07 February 2007 12:20 Go to previous messageGo to next message
BlackSwan
Messages: 25046
Registered: January 2009
Location: SoCal
Senior Member
http://asktom.oracle.com/pls/asktom/f?p=100:1:998126295531190::NO:RP::
Re: remove duplicates in a query [message #218315 is a reply to message #218298] Wed, 07 February 2007 13:16 Go to previous messageGo to next message
ebrian
Messages: 2794
Registered: April 2006
Senior Member
What are you considering dup? Are you indicating that entire rows are duplicated? My comment about removing the parentheses was tongue-in-cheek as you notice the appliction of distinct with or without should work for you.

SQL> select * from temp_dup;

        ID COL2                 COL3
---------- -------------------- --------------------
         1 A                    B
         1 A                    C
         1 A                    B
         2 A                    B

SQL> select distinct id, col2, col3 from temp_dup;

        ID COL2                 COL3
---------- -------------------- --------------------
         1 A                    B
         1 A                    C
         2 A                    B

SQL> select distinct(id), col2, col3 from temp_dup;

        ID COL2                 COL3
---------- -------------------- --------------------
         1 A                    B
         1 A                    C
         2 A                    B


Re: remove duplicates in a query [message #218327 is a reply to message #218315] Wed, 07 February 2007 14:57 Go to previous messageGo to next message
deahayes3
Messages: 203
Registered: May 2006
Senior Member

yes entire rows are being duplicated but I rearranged my query so now I have about 517 recs which looks right, my only question now is how do I display all variables from the select statements the results only show variables from first select statement.

SELECT I.INVENTORY_ITEM_ID, M.PROJECT_ID, I.SEGMENT1, M.PROJECT_NUMBER
FROM SYSTEM_ITEMS I, SCHEDULE_ITEM_DATES M
WHERE I.INVENTORY_ITEM_ID = M.INVENTORY_ITEM_ID AND M.PROJECT_ID = 165003 --112 recs total union 516
UNION
select D.TASK_ID , D.PROJECT_ID, D.PROJECT_NAME,D.PROJECT_NUMBER
from pa.pa_projects_all p, DISCRETE_JOBS_V D
where P.PROJECT_ID = D.PROJECT_ID and p.project_id = 165003 --1 rec
UNION
select a.task_id, a.project_id, A.TASK_NAME, A.TASK_NUMBER
from pa.pa_projects_all p, pa.pa_tasks a
where p.project_id = a.project_id and p.project_id = 165003--459recs
ORDER BY 2
Re: remove duplicates in a query [message #218342 is a reply to message #218327] Wed, 07 February 2007 16:17 Go to previous messageGo to next message
ebrian
Messages: 2794
Registered: April 2006
Senior Member
I don't see how ENTIRE rows are being duplicated if you have the DISTINCT in your SELECT list. Try to add a GROUP BY to your query:

select p.project_id, p.segment1, p.name, I.INVENTORY_ITEM_ID, I.SEGMENT1, M.PROJECT_ID, M.INVENTORY_ITEM_ID, M.PROJECT_NUMBER ,
a.task_id, a.project_id, A.TASK_NAME, A.TASK_NUMBER, D.PROJECT_ID, D.PROJECT_NUMBER, D.PROJECT_NAME
from SYSTEM_ITEMS I, SCHEDULE_ITEM_DATES_V M, pa.pa_projects_all p, pa.pa_tasks a , DISCRETE_JOBS_V D
where p.project_id = a.project_id and P.PROJECT_ID = D.PROJECT_ID
AND m.project_id = p.project_id and I.INVENTORY_ITEM_ID = M.INVENTORY_ITEM_ID and
p.project_id = '165003' AND P.SEGMENT1 = '1018101'
GROUP BY p.project_id, p.segment1, p.name, I.INVENTORY_ITEM_ID, I.SEGMENT1, M.PROJECT_ID, M.INVENTORY_ITEM_ID, M.PROJECT_NUMBER ,
a.task_id, a.project_id, A.TASK_NAME, A.TASK_NUMBER, D.PROJECT_ID, D.PROJECT_NUMBER, D.PROJECT_NAME;
Re: remove duplicates in a query [message #218375 is a reply to message #218327] Wed, 07 February 2007 22:47 Go to previous messageGo to next message
flyboy
Messages: 1832
Registered: November 2006
Senior Member
Quote:

I have about 517 recs which looks right, my only question now is how do I display all variables from the select statements the results only show variables from first select statement.
--112 recs total union 516
--1 rec
--459recs


Do you mean the column description? You return it as a single resultset so how do you suppose it should be displayed?
If you want different column descriptions for each query, use them separately.
If you want to get all records WITH DUPLICATES among all result sets, use UNION ALL instead of UNION.
Re: remove duplicates in a query [message #218509 is a reply to message #218375] Thu, 08 February 2007 08:54 Go to previous messageGo to next message
deahayes3
Messages: 203
Registered: May 2006
Senior Member

If I query each select statement separately that is the count I get
--112 recs total of all 3 is recs 516
--1 rec
--459recs

Now if I query my previous select statement before I inserted union I get results like this (see attachment)
and if I change it with the unions the info does change (ALSO SEE ATTACHMENT) I would like to know how to display each column name from each select statement instead of just from the first statement

  • Attachment: txtthurs.txt
    (Size: 1.43KB, Downloaded 106 times)
Re: remove duplicates in a query [message #218511 is a reply to message #218342] Thu, 08 February 2007 08:57 Go to previous messageGo to next message
deahayes3
Messages: 203
Registered: May 2006
Senior Member

Still receive the same results when add group by see attachment for kind of results that I am getting
Re: remove duplicates in a query [message #218517 is a reply to message #218511] Thu, 08 February 2007 09:17 Go to previous messageGo to next message
ebrian
Messages: 2794
Registered: April 2006
Senior Member
OK...that's a horse of a different color and what my expectations of your output were. If you look across each row, they are in fact DISTINCT.

The problem in fact is that the output is just not the group of DISTINCT columns that you expect.

The easiest way to resolve this problem is by using analytics and ROW_NUMBER.
Re: remove duplicates in a query [message #218523 is a reply to message #218517] Thu, 08 February 2007 09:36 Go to previous message
deahayes3
Messages: 203
Registered: May 2006
Senior Member

Thanks everyone for your help but I think I figured out my problem, my where clause did not have info data and I was calling some of the columns (such as project_id) too many times, see below now my results are only 112 recs.

SELECT I.INVENTORY_ITEM_ID, M.PROJECT_ID, I.SEGMENT1,M.PROJECT_NUMBER,m.CONCATENATED_SEGMENTS, M.SCHEDULE_DATE, D.SCHEDULED_START_DATE,
D.WIP_ENTITY_NAME, D.SCHEDULED_COMPLETION_DATE, D.TASK_ID , D.PROJECT_NAME, A.TASK_NAME, A.TASK_NUMBER, A.START_DATE, A.COMPLETION_DATE
FROM SYSTEM_ITEMS I, SCHEDULE_ITEM_DATES M, pa.pa_projects_all p, DISCRETE_JOBS_V D, pa.pa_tasks a
WHERE I.INVENTORY_ITEM_ID = M.INVENTORY_ITEM_ID AND M.PROJECT_ID = p.project_id and P.PROJECT_ID = D.PROJECT_ID and p.project_id = a.project_id and p.project_id = 165003
and a.task_id = d.task_id --112 recs
Previous Topic: Query to build cursor in particular order
Next Topic: Preference: unique index or unique constraint?
Goto Forum:
  


Current Time: Thu Dec 08 06:12:27 CST 2016

Total time taken to generate the page: 0.12350 seconds