Home » SQL & PL/SQL » SQL & PL/SQL » De-Duping the Table Data (Oracle 10g , Win Xp Professional)
De-Duping the Table Data [message #352617] Wed, 08 October 2008 10:46 Go to next message
Sunny_Vras
Messages: 3
Registered: October 2008
Junior Member
Hi Everyone,

I need your help with deduping the data for which I 'm providing an sample for you :

create table dedupe_example
( Cno Number(6),
Cname Varchar2(10),
Change_eff_DAte date,
end_Eff_Date date,
status varchar2(2)
);

insert into dedupe_example values
( 123456,'Venu','31-Jan-2008','31-DEC-9999','AC');

insert into dedupe_example values
(123456,'Venu','24-Jan-2008','30-JAN-2008','CL');

insert into dedupe_example values
(123456,'Venu','22-OCT-2007','23-JAN-2008','CL');

insert into dedupe_example values
(123456,'Venu','16-AUG-2006','21-OCT-2007','AC');

insert into dedupe_example values
(123456,'Venu','07-JUN-2005','15-AUG-2006','AC');

insert into dedupe_example values
(123456,'Venu','01-JAN-2001','06-JUN-2005','AC');


SELECT * FROM DEDUPE_EXAMPLE;

CNO CNAME CHANGE_EF END_EFF_D ST
---------------------------------
123456 Venu 31-JAN-08 31-DEC-99 AC
123456 Venu 24-JAN-08 30-JAN-08 CL
123456 Venu 22-OCT-07 23-JAN-08 CL
123456 Venu 16-AUG-06 21-OCT-07 AC
123456 Venu 07-JUN-05 15-AUG-06 AC
123456 Venu 01-JAN-01 06-JUN-05 AC


Here in the subset table, I have a duplicate records (the ones in italics ).So i need to eliminate them and adjust the change_Eff_Date and end_Eff_DAte for the previous record. I tried group by and tried using Analytics but was unable to get the desired result as you can see the status has been changed back to Active (AC) after being closed(CL) for a while.


My Desired Output is :

CNO CNAME CHANGE_EF END_EFF_D ST
---------------------------------
123456 Venu 31-JAN-08 31-DEC-99 AC
123456 Venu 22-OCT-07 30-JAN-08 CL
123456 Venu 01-JAN-01 21-OCT-07 AC


Thanks.

Venu
Re: De-Duping the Table Data [message #352620 is a reply to message #352617] Wed, 08 October 2008 11:04 Go to previous messageGo to next message
joicejohn
Messages: 327
Registered: March 2008
Location: India
Senior Member
@Sunny_Vras,
Sunny_Vras wrote on Wed, 08 October 2008 21:16


CNO CNAME CHANGE_EF END_EFF_D ST
---------------------------------
123456 Venu 31-JAN-08 31-DEC-99 AC
123456 Venu 24-JAN-08 30-JAN-08 CL
123456 Venu 22-OCT-07 23-JAN-08 CL
123456 Venu 16-AUG-06 21-OCT-07 AC
123456 Venu 07-JUN-05 15-AUG-06 AC
123456 Venu 01-JAN-01 06-JUN-05 AC

Here in the subset table, I have a duplicate records (the ones in italics ).


I didn't understand based on what criteria you have mentioned the italized records as duplicates. Can you please clarify?

Regards,
Jo
Re: De-Duping the Table Data [message #352621 is a reply to message #352620] Wed, 08 October 2008 11:07 Go to previous messageGo to next message
Sunny_Vras
Messages: 3
Registered: October 2008
Junior Member
Sorry Jo ,

They are considered Duplicates as there is no difference in the data apart from the change_eff_Date and end_eff_Date.(In the above typical example the same combination of cno,cname,status over continous time period )

Thanks.
Venu
Re: De-Duping the Table Data [message #352622 is a reply to message #352617] Wed, 08 October 2008 11:07 Go to previous messageGo to next message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Quote:
I have a duplicate records (the ones in italics ).

I don't see any duplicates. Explain why there are duplicates.

Quote:
adjust the change_Eff_Date and end_Eff_DAte for the previous record

Previous? what is previous? above all in case of duplicates?

Your requirements are clear as mud.

Also, please read OraFAQ Forum Guide, especially "How to format your post?" section.
Make sure that lines of code do not exceed 80 characters when you format.
Use the "Preview Message" button to verify.
Always post your Oracle version (4 decimals).

Regards
Michel
Re: De-Duping the Table Data [message #352624 is a reply to message #352617] Wed, 08 October 2008 11:11 Go to previous messageGo to next message
S.Rajaram
Messages: 1027
Registered: October 2006
Location: United Kingdom
Senior Member
Post what you have tried so far.

I will give you a clue how to do it.

Step 1 : Output a value (rownum) in a column when the previous value is not equal to current value partition by the id value (Use Lag function).
Step 2 : Repeat the same value across all the columns. Either use max in aggregate form (if you are working on oracle 9). If you are working on Oracle 10 use the last_value analytic functions.
Step 3 : Group the entries based on the id and the new id column which you have created and use the max and min function for the respective columns.

Hope this helps.

If you are stuck somewhere copy and paste your complete sql*plus session and come back to us.

Last but not least please follow the forum guidelines.

Regards

Raj
Re: De-Duping the Table Data [message #352629 is a reply to message #352621] Wed, 08 October 2008 11:49 Go to previous messageGo to next message
joicejohn
Messages: 327
Registered: March 2008
Location: India
Senior Member
No Message Body

[Updated on: Wed, 08 October 2008 11:55]

Report message to a moderator

Re: De-Duping the Table Data [message #352638 is a reply to message #352624] Wed, 08 October 2008 13:41 Go to previous messageGo to next message
Sunny_Vras
Messages: 3
Registered: October 2008
Junior Member
Hi Raj,

I am been struck in the 1st step itself. As you can see that in 4th Record , it shouldn't have a lag value
as it is not continous record when seen in date perspective.

SQL> ed
Wrote file C:\VENU\AFIEDT.BUF

1 select a.* ,
2 lag(rownum) over (partition by cno,cname,status
3 order by change_Eff_date desc) "rno"
4 from
5 dedupe_Example a
6* order by 3 desc
SQL> /

CNO CNAME CHANGE_EF END_EFF_D ST rno
---------- ---------- --------- --------- -- ----------
123456 Venu 31-JAN-08 31-DEC-99 AC
123456 Venu 24-JAN-08 30-JAN-08 CL
123456 Venu 22-OCT-07 23-JAN-08 CL 2
123456 Venu 16-AUG-06 21-OCT-07 AC 1
123456 Venu 07-JUN-05 15-AUG-06 AC 4
123456 Venu 01-JAN-01 06-JUN-05 AC 5

6 rows selected.

Thanks.

Venu
Re: De-Duping the Table Data [message #352639 is a reply to message #352638] Wed, 08 October 2008 13:59 Go to previous messageGo to next message
Michel Cadot
Messages: 64139
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Michel Cadot wrote on Wed, 08 October 2008 18:07
Also, please read OraFAQ Forum Guide, especially "How to format your post?" section.
Make sure that lines of code do not exceed 80 characters when you format.
Use the "Preview Message" button to verify.
Always post your Oracle version (4 decimals).

Regards
Michel


Indent the code (See SQL Formatter) and align the columns in result.
Re: De-Duping the Table Data [message #352723 is a reply to message #352638] Thu, 09 October 2008 03:30 Go to previous message
S.Rajaram
Messages: 1027
Registered: October 2006
Location: United Kingdom
Senior Member
O.K I will help you out how to do the first part. Replace your lag function in this form
case when status != lag(status) over(partition by cno order by change_eff_date asc) then rownum end

I will suggest you to spend some time reading the forum guidelines on how to format your post.

Regards

Raj
Previous Topic: Simple proc throwing error ORA-00604, ORA-01401
Next Topic: Query when we have a method in table definition
Goto Forum:
  


Current Time: Thu Dec 08 12:11:33 CST 2016

Total time taken to generate the page: 0.07005 seconds