Home » SQL & PL/SQL » SQL & PL/SQL » Removing Duplicate Rows when condition is matched (Oracle 9i)
Removing Duplicate Rows when condition is matched [message #447676] Wed, 17 March 2010 05:31 Go to next message
ygsunilkumar
Messages: 100
Registered: August 2008
Senior Member
Hi, My requirement if id, join_date, join_time, result of table1 is matched with table2 at least one time then if repeating rows associated with the id should not come.

Here is the test case.

create table table1
( id number , join_date varchar2(8), join_time varchar2(6), status varchar2(10));


create table table2
( id number , join_date varchar2(8), join_time varchar2(6), status varchar2(10));

insert into table1 values (01, '20010101', '0500', 'PASS');
insert into table1 values (01, '20010102', '0501', 'FAIL');
insert into table1 values (02, '20010103', '0502', 'PASS');
insert into table1 values (03, '20010104', '0503', 'FAIL');
insert into table1 values (04, '20010105', '0504', 'PASS');
insert into table1 values (05, '20010106', '0505', 'FAIL');


insert into table2 values (01, '20010101', '0500', 'PASS');
insert into table2 values (01, '20010102', '0501', 'PASS');
insert into table2 values (02, '20010102', '0501', 'PASS');


I have tried the below mentioned query, please let me know whether any better query is there 
than this because in real-time data have 2 millions of record in table 1 and 60 thousand in table2. 
Thanks in Advance.

select distinct a.id, a.join_date, a.join_time, a.status
from table1 a,  table2 b
where a.id = b.id
and  (a.id, a.join_date, a.join_time, a.status) not in (select b.id, b.join_date, b.join_time, b.status
from table2 b)
and  a.id = (
select distinct  a.id
from table1 a,  table2 b
where a.id = b.id
and  (a.id, a.join_date, a.join_time, a.status) not in (select b.id, b.join_date, b.join_time, b.status
from table2 b)
minus
select distinct  a.id
from table1 a,  table2 b
where a.id = b.id
and  (a.id, a.join_date, a.join_time, a.status) in (select b.id, b.join_date, b.join_time, b.status
from table2 b));

OUTPUT

ID	JOIN_DATE	JOIN_TIME	STATUS
--       --------        -------        -----
2	20010103	  0502	        PASS

[Updated on: Wed, 17 March 2010 05:37] by Moderator

Report message to a moderator

Re: Removing Duplicate Rows when condition is matched [message #447677 is a reply to message #447676] Wed, 17 March 2010 05:39 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Quote:
My requirement if id, join_date, join_time, result of table1 is matched with table2 at least one time then if repeating rows associated with the id should not come.

And which one do you want to keep?

Regards
Michel
Re: Removing Duplicate Rows when condition is matched [message #447679 is a reply to message #447677] Wed, 17 March 2010 05:42 Go to previous messageGo to next message
ygsunilkumar
Messages: 100
Registered: August 2008
Senior Member
I want to keep table1 data.
Re: Removing Duplicate Rows when condition is matched [message #447680 is a reply to message #447679] Wed, 17 March 2010 05:44 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
In which condition you tell them as duplicates, and which one you want to keep if they are duplicates (and duplicate condition is not on all columns).

Regards
Michel
Re: Removing Duplicate Rows when condition is matched [message #447681 is a reply to message #447676] Wed, 17 March 2010 05:53 Go to previous messageGo to next message
ygsunilkumar
Messages: 100
Registered: August 2008
Senior Member
--Table 1 Data

ID	JOIN_DATE	JOIN_TIME	STATUS	  
---------------------------------------------- 
1	20010101	0500	       PASS	   
1	20010102	0501	       FAIL	   
2	20010103	0502	       PASS	   
3	20010104	0503	       FAIL	   
4	20010105	0504	       PASS	   
5	20010106	0505	       FAIL	

--Table 2 Data

 
ID	JOIN_DATE	JOIN_TIME	STATUS	   
-------------------------------------------------
1	20010101	0500	       PASS	   
1	20010102	0501	       PASS	   
2	20010102	0501	       PASS	

--DATA MATCHING IS WITH TABLE 2

ID	JOIN_DATE	JOIN_TIME	STATUS	   
-------------------------------------------------
1	20010101	0500	       PASS 

--BUT ALSO ID 1 HAS THIS DUPLICATE DATA 

ID	JOIN_DATE	JOIN_TIME	STATUS	   
-------------------------------------------------
1	20010102	0501	       FAIL

-- I don't want this duplicate data, here duplicate means ID.

--Final output expected is 

ID	JOIN_DATE	JOIN_TIME	STATUS
--       --------        -------        -----
2	20010103	  0502	        PASS


Re: Removing Duplicate Rows when condition is matched [message #447686 is a reply to message #447681] Wed, 17 March 2010 06:30 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
How this
ID	JOIN_DATE	JOIN_TIME	STATUS	   
-------------------------------------------------
1	20010101	0500	       PASS 

is a duplicate of that
ID	JOIN_DATE	JOIN_TIME	STATUS	   
-------------------------------------------------
1	20010102	0501	       FAIL


And why do you want only this
ID	JOIN_DATE	JOIN_TIME	STATUS
--       --------        -------        -----
2	20010103	  0502	        PASS

Who has no relation with the above?
And has no relation with your requirement "if id, join_date, join_time, result of table1 is matched with table2" as (id,join_date,join_time) of
2	20010102	0501	       PASS	

in table2 does not match any of those in table1.

Regards
Michel
Re: Removing Duplicate Rows when condition is matched [message #447689 is a reply to message #447676] Wed, 17 March 2010 06:52 Go to previous messageGo to next message
ygsunilkumar
Messages: 100
Registered: August 2008
Senior Member
What the query I have written is satisfying my requirement, only thing I would like to know that is there any better query or alternate way where the performance will be good? Thanks

select distinct a.id, a.join_date, a.join_time, a.status
from table1 a,  table2 b
where a.id = b.id
and  (a.id, a.join_date, a.join_time, a.status) not in (select b.id, b.join_date, b.join_time, b.status
from table2 b)
and  a.id = (
select distinct  a.id
from table1 a,  table2 b
where a.id = b.id
and  (a.id, a.join_date, a.join_time, a.status) not in (select b.id, b.join_date, b.join_time, b.status
from table2 b)
minus
select distinct  a.id
from table1 a,  table2 b
where a.id = b.id
and  (a.id, a.join_date, a.join_time, a.status) in (select b.id, b.join_date, b.join_time, b.status
from table2 b));

Re: Removing Duplicate Rows when condition is matched [message #447690 is a reply to message #447689] Wed, 17 March 2010 07:07 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
I don't care of your query, I want the requirements.
Just answer my questions there are simple ones.

Regards
Michel
Re: Removing Duplicate Rows when condition is matched [message #447695 is a reply to message #447676] Wed, 17 March 2010 07:28 Go to previous messageGo to next message
ygsunilkumar
Messages: 100
Registered: August 2008
Senior Member
--Relation is ID


--And why do you want only this? because condition is satisfied at least one time it should match


--DATA MATCHING IS WITH TABLE 2

ID JOIN_DATE JOIN_TIME STATUS
-------------------------------------------------
1 20010101 0500 PASS

----DATA NOT MATCHING IS WITH TABLE 2
ID JOIN_DATE JOIN_TIME STATUS
-- -------- ------- -----
2 20010103 0502 PASS
Re: Removing Duplicate Rows when condition is matched [message #447700 is a reply to message #447676] Wed, 17 March 2010 07:39 Go to previous messageGo to next message
Yasir Hashmi
Messages: 304
Registered: April 2006
Senior Member
misinterpreted

Your requiremnets are still not clear.

Why in the result there is only ID 3? Why not 4,5?

[Updated on: Wed, 17 March 2010 07:56]

Report message to a moderator

Re: Removing Duplicate Rows when condition is matched [message #447701 is a reply to message #447695] Wed, 17 March 2010 07:43 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Quote:
--Final output expected is 

ID	JOIN_DATE	JOIN_TIME	STATUS
--       --------        -------        -----
2	20010103	  0502	        PASS

But this one does not match to any rows in table2 against the condition (id,join_date,join_time).
Or are you requirements wrong and it is only on ID?
And why any row with ID=1 does not come when there is "at least one" match?

Regards
Michel
Re: Removing Duplicate Rows when condition is matched [message #447704 is a reply to message #447676] Wed, 17 March 2010 07:47 Go to previous messageGo to next message
ygsunilkumar
Messages: 100
Registered: August 2008
Senior Member
My requirement is if id, join_date, join_time, result of table1 is matched with table2 at least one time then if ID is again repeating more than once then those rows to be excluded. Here relation is ID. If you see my test case from the beginning i think you will understand

Suppose ID = 1 with join_date, join_time, result is matched at least for one time with the table2, then rest of the rows of ID = 1 should not come in the output because as per the requirement it has matched the values at least once in the Table2. Hope you now understand my requirement. Thanks
Re: Removing Duplicate Rows when condition is matched [message #447705 is a reply to message #447704] Wed, 17 March 2010 07:52 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Quote:
If you see my test case from the beginning i think you will understand

An example is not a clear requirement and may be wrong.

Your (new) requirements explain why there is not ID=1 in your result but still do not answer why it contains the one ID= 2 it has no matching in table2 and so it is not "at least one".

Regards
Michel
Re: Removing Duplicate Rows when condition is matched [message #447710 is a reply to message #447676] Wed, 17 March 2010 08:00 Go to previous messageGo to next message
ygsunilkumar
Messages: 100
Registered: August 2008
Senior Member
--Values which are matching with Table1 query and output

select distinct a.id, a.join_date, a.join_time, a.status
from table1 a,  table2 b
where a.id = b.id
and  (a.id, a.join_date, a.join_time, a.status)  in (select b.id, b.join_date, b.join_time, b.status
from table2);

ID	JOIN_DATE	JOIN_TIME	STATUS
-----------------------------------------------
1	20010101	500	PASS

--Values which are not matching with Table1 query and output

select distinct a.id, a.join_date, a.join_time, a.status
from table1 a,  table2 b
where a.id = b.id
and  (a.id, a.join_date, a.join_time, a.status) not in (select b.id, b.join_date, b.join_time, b.status
from table2 b);

ID	JOIN_DATE	JOIN_TIME	STATUS
----------------------------------------------
1	20010102	501	FAIL
2	20010103	502	PASS

--Here ID is repeating, even after matching with at least one 
time with Table1 because of two records of ID = 1			
ID	JOIN_DATE	JOIN_TIME	STATUS
----------------------------------------------
1	20010102	501	FAIL

--But actually required is ID = 2 as this is not matching with 
Table1 and doesnot contains multiple records of ID = 2.			
ID	JOIN_DATE	JOIN_TIME	STATUS
-----------------------------------------------
2	20010102	0501	PASS

[Updated on: Wed, 17 March 2010 09:08] by Moderator

Report message to a moderator

Re: Removing Duplicate Rows when condition is matched [message #447712 is a reply to message #447676] Wed, 17 March 2010 08:11 Go to previous messageGo to next message
Yasir Hashmi
Messages: 304
Registered: April 2006
Senior Member
Based on whatever you have explained,this comes to my mind:

select * from (
    select * from table1 where (id,join_date,join_time) not in
      (select id,join_date,join_time from table2))
where id in(select id from table2)

Re: Removing Duplicate Rows when condition is matched [message #447717 is a reply to message #447676] Wed, 17 March 2010 09:11 Go to previous messageGo to next message
ygsunilkumar
Messages: 100
Registered: August 2008
Senior Member
But the query will not fetch the data if ID is present in Table1 and ID is missing in Table2.

Final requirement is if ID is missing in Table2 and also not matching the values id,join_date,join_time in Table2.
Re: Removing Duplicate Rows when condition is matched [message #447719 is a reply to message #447710] Wed, 17 March 2010 09:26 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
SQL> select t1.id, t1.join_date, t1.join_time, t1.status
  2  from table1 t1, table2 t2, table2 t3
  3  where t2.id = t1.id
  4    and t3.id (+) = t1.id
  5    and t3.join_date (+) = t1.join_date
  6    and t3.join_time (+) = t1.join_time
  7  having count(t3.id) = 0
  8  group by t1.id, t1.join_date, t1.join_time, t1.status
  9  /
        ID JOIN_DAT JOIN_T STATUS
---------- -------- ------ ----------
         2 20010103 0502   PASS

Regards
Michel
Re: Removing Duplicate Rows when condition is matched [message #447778 is a reply to message #447717] Wed, 17 March 2010 16:05 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Are you still there?
Is this one of the solutions you wanted? I'm still not sure I understand your requirements.

Regards
Michel
Re: Removing Duplicate Rows when condition is matched [message #447803 is a reply to message #447676] Thu, 18 March 2010 01:22 Go to previous messageGo to next message
ygsunilkumar
Messages: 100
Registered: August 2008
Senior Member
Sorry, the below mentioned query will not work because join_time is there in Table2. Now the table structure has been changed.

Table1 contains these columns:

id, join_date, status

Table2 contains these columns:
id, join_time

Table3 contains these columns:
id,join_date, join_time, status

because of the structure change the outer query restriction i got this error.
select t1.id, t1.join_date, t1.join_time, t1.status
   from table1 t1, table2 t2, table2 t3, table2 t4
   where t2.id = t1.id
   and t3.id = t4.id
   and t2.id = t3.id
     and t4.id (+) = t1.id
  and t4.join_date (+) = t1.join_date
    and t4.join_time (+) = t2.join_time
  having count(t4.id) = 0
  group by t1.id, t1.join_date, t1.join_time, t1.status

ORA-01417: a table may be outer joined to at most one other table

[Updated on: Thu, 18 March 2010 01:30]

Report message to a moderator

Re: Removing Duplicate Rows when condition is matched [message #447813 is a reply to message #447803] Thu, 18 March 2010 02:04 Go to previous messageGo to next message
Michel Cadot
Messages: 64151
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
I give up, tired of wasting my time.
Maybe someone else...

Regards
Michel
Re: Removing Duplicate Rows when condition is matched [message #447821 is a reply to message #447803] Thu, 18 March 2010 02:46 Go to previous message
_jum
Messages: 509
Registered: February 2008
Senior Member
1. How You can group by t1.join_time, which column is not in table1 ?!
Quote:

Table1 contains these columns:
id, join_date, status


  ...
 group by t1.id, t1.join_date, t1.join_time , t1.status

2. You can outer join only ONE TABLE, you try to join table1 AND table2 ?!
  ...
   and t4.join_date (+) = t1.join_date --> OUTJOIN 1
   and t4.join_time (+) = t2.join_time --> OUTJOIN 2
  ...

Seems You have all needed columns in table3, so this would be the right one for outer join.

[Updated on: Thu, 18 March 2010 02:47]

Report message to a moderator

Previous Topic: A flexible Constraint
Next Topic: Foreign key constraint
Goto Forum:
  


Current Time: Fri Dec 09 13:50:18 CST 2016

Total time taken to generate the page: 0.12360 seconds