Home » SQL & PL/SQL » SQL & PL/SQL » Eliminate duplicate emails in String
Eliminate duplicate emails in String [message #295597] Tue, 22 January 2008 19:22 Go to next message
Mahesh P
Messages: 69
Registered: September 2004
Member
Hi Techies,

Is there any oracle function to eliminate the duplicates in a string:

For Example:

In my procedure i am having a Varchar2 variable which will be assigned a value like "abc@yahoo.com;bca@yahoo.com;abc@yahoo.com;bca@yahoo.com".

If you observe this string, this has duplicate email id's. how can i eliminate the duplicates from this string to result only "abc@yahoo.com;bca@yahoo.com".

Thanks alot
Mahesh
Re: Eliminate duplicate emails in String [message #295606 is a reply to message #295597] Tue, 22 January 2008 22:01 Go to previous messageGo to next message
Barbara Boehmer
Messages: 8631
Registered: November 2002
Location: California, USA
Senior Member
SCOTT@orcl_11g> CREATE OR REPLACE TYPE varchar2_tt AS TABLE OF VARCHAR2 (4000);
  2  /

Type created.

SCOTT@orcl_11g> CREATE OR REPLACE FUNCTION elim_dups
  2    (p_list	 IN  VARCHAR2,
  3  	p_delim  IN  VARCHAR2 DEFAULT ';')
  4    RETURN	     VARCHAR2
  5  AS
  6    v_list_in     VARCHAR2 (32767) := p_list || p_delim;
  7    v_items_in    varchar2_tt      := varchar2_tt();
  8    v_items_out   varchar2_tt      := varchar2_tt();
  9    v_list_out    VARCHAR2 (32767);
 10  BEGIN
 11    WHILE v_list_in IS NOT NULL LOOP
 12  	 v_items_in.EXTEND;
 13  	 v_items_in (v_items_in.COUNT) := SUBSTR (v_list_in, 1, INSTR (v_list_in, p_delim) - 1);
 14  	 v_list_in := SUBSTR (v_list_in, INSTR (v_list_in, p_delim) + LENGTH (p_delim));
 15    END LOOP;
 16    --
 17    SELECT DISTINCT column_value
 18    BULK COLLECT INTO v_items_out
 19    FROM   TABLE (CAST (v_items_in AS varchar2_tt))
 20    ORDER  BY column_value;
 21    --
 22    FOR i IN 1 .. v_items_out.COUNT LOOP
 23  	 v_list_out := v_list_out || p_delim || v_items_out(i);
 24    END LOOP;
 25    --
 26    RETURN LTRIM (v_list_out, p_delim);
 27  END elim_dups;
 28  /

Function created.

SCOTT@orcl_11g> SHOW ERRORS
No errors.
SCOTT@orcl_11g> SET SERVEROUTPUT ON
SCOTT@orcl_11g> DECLARE
  2    your_variable VARCHAR2 (32767);
  3  BEGIN
  4    your_variable := 'abc@yahoo.com;bca@yahoo.com;abc@yahoo.com;bca@yahoo.com';
  5    your_variable := elim_dups (your_variable);
  6    DBMS_OUTPUT.PUT_LINE (your_variable);
  7  END;
  8  /
abc@yahoo.com;bca@yahoo.com

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> 

Re: Eliminate duplicate emails in String [message #295652 is a reply to message #295597] Wed, 23 January 2008 00:30 Go to previous messageGo to next message
Michel Cadot
Messages: 64109
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Here's an example in SQL, adapt it for your needs:
SQL> select * from t order by id;
        ID VALS
---------- ------------------------------
         1 123; 345; 123; 567;
         2 abc; cde; abc; efg;

2 rows selected.

SQL> def max_vals=10
SQL> col vals format a30
SQL> with 
  2    step1 as (
  3      select id, vals, rn
  4      from t, (select rownum rn from dual connect by level <= &max_vals)
  5    ),
  6    step2 as (
  7      select distinct
  8             id,
  9             trim(substr(vals, 
 10                         decode(rn,1,0,instr(vals,';',1,rn-1))+1,
 11                         instr(vals,';',1,rn)
 12                         -decode(rn,1,0,instr(vals,';',1,rn-1))-1)) val
 13      from step1
 14      where instr(vals,';',1,rn) > 0 
 15    ),
 16    step3 as (
 17      select id, val, 
 18             row_number () over (partition by id order by val) curr,
 19             row_number () over (partition by id order by val)-1 prev
 20      from step2
 21    )
 22  select id, max(substr(sys_connect_by_path(val,';'),2))||';' vals
 23  from step3
 24  connect by prior id = id and prior curr = prev
 25  start with prev = 0
 26  group by id
 27  /
        ID VALS
---------- ------------------------------
         1 123;345;567;
         2 abc;cde;efg;

2 rows selected.

Regards
Michel
Re: Eliminate duplicate emails in String [message #295655 is a reply to message #295652] Wed, 23 January 2008 00:40 Go to previous messageGo to next message
Michel Cadot
Messages: 64109
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Or more modern form:
SQL> select * from t order by id;
        ID VALS
---------- ------------------------------
         1 123; 345; 123; 567
         2 abc; cde; abc; efg

2 rows selected.

SQL> with
  2    lines as ( select level line from dual connect by level <= &max_vals ),
  3    data as (
  4      select distinct
  5             id,
  6             trim(substr(vals, 
  7                         instr(';'||vals||';',';',1,line),
  8                         instr(';'||vals||';',';',1,line+1)
  9                         -instr(';'||vals||';',';',1,line)-1)) val
 10      from t, lines
 11      where instr(vals||';',';',1,line) > 0 
 12    ),
 13    nums as (
 14      select id, val,
 15             row_number() over (partition by id order by val) rn,
 16             count(*) over (partition by id) nb
 17      from data
 18    )
 19  select id, substr(sys_connect_by_path(val,';'),2) vals
 20  from nums
 21  where rn = nb
 22  connect by prior id = id and prior rn = rn-1
 23  start with rn = 1
 24  order by id
 25  /
        ID VALS
---------- ------------------------------
         1 123;345;567
         2 abc;cde;efg

2 rows selected.

Regards
Michel

[Edit: fixed following Rajuvan's comments below.]

[Updated on: Wed, 23 January 2008 02:03]

Report message to a moderator

Re: Eliminate duplicate emails in String [message #295680 is a reply to message #295597] Wed, 23 January 2008 01:36 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member


These queries holds fine as long as OP doesn't bother about the change in the display order of emai-ld from Actual One.

SQL> select ID ,VAL from T;

        ID VAL
---------- --------------------------------------------------
         1 xyz;cde;xyz;efg;abc
         2 abc;cde;abc;efg;hij

SQL> with
  2       lines as ( select level line from dual connect by level <=10 ),
  3      data as (
  4         select distinct
  5               id,
  6                trim(substr(val,
  7                            instr(';'||val||';',';',1,line),
  8                            instr(';'||val||';',';',1,line+1)
  9                            -instr(';'||val||';',';',1,line)-1)) val
 10         from t, lines
 11         where instr(val,';',1,line) > 0
 12       ),
 13       nums as (
 14         select id, val,
 15                row_number() over (partition by id order by val) rn,
 16                count(*) over (partition by id) nb
 17         from data
 18       )
 19     select id, substr(sys_connect_by_path(val,';'),2) val
 20     from nums
 21    where rn = nb
 22     connect by prior id = id and prior rn = rn-1
 23     start with rn = 1
 24    order by id;

        ID VAL
---------- --------------------------------------------------
         1 cde;efg;xyz
         2 abc;cde;efg

SQL>


Thumbs Up
Rajuvan.
Re: Eliminate duplicate emails in String [message #295683 is a reply to message #295680] Wed, 23 January 2008 01:43 Go to previous messageGo to next message
Michel Cadot
Messages: 64109
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Yes, this is the effect of "distinct".
I didn't precise it, I should do, it is not obvious for everyone.

Regards
Michel
Re: Eliminate duplicate emails in String [message #295684 is a reply to message #295597] Wed, 23 January 2008 01:46 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

But dont you feel Some issue with the output itself?
Actual value is having 4 distinct values but query returns only 3 .

Thumbs Up
Rajuvan.
Re: Eliminate duplicate emails in String [message #295687 is a reply to message #295597] Wed, 23 January 2008 01:51 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

I think

where instr(val,';',1,line) > 0

should be changed to

where instr(val||';',';',1,line) > 0 


Thumbs Up
Rajuvan.
Re: Eliminate duplicate emails in String [message #295691 is a reply to message #295680] Wed, 23 January 2008 01:58 Go to previous messageGo to next message
Michel Cadot
Messages: 64109
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
If you want to keep the order, you have to insert one more step to number the items inside the string and change distinct to group.
Assuming the order is defined with the first item for each value:
SQL> select * from t order by id;
        ID VALS
---------- ------------------------------
         1 987;123;345;123;567
         2 xyz;abc;cde;abc;efg

2 rows selected.

SQL> def max_vals=10
SQL> col vals format a30
SQL> with
  2    lines as ( select level line from dual connect by level <= &max_vals ),
  3    data as (
  4      select id,
  5             trim(substr(vals, 
  6                         instr(';'||vals||';',';',1,line),
  7                         instr(';'||vals||';',';',1,line+1)
  8                         -instr(';'||vals||';',';',1,line)-1)) val,
  9             row_number() over (partition by id order by line) rn
 10      from t, lines
 11      where instr(vals||';',';',1,line) > 0 
 12    ),
 13    data2 as (
 14      select id, val, min(rn) rn
 15      from data
 16      group by id, val
 17      order by id, rn
 18    ),
 19    nums as (
 20      select id, val,
 21             row_number() over (partition by id order by rn) rn,
 22             count(*) over (partition by id) nb
 23      from data2
 24    )
 25  select id, substr(sys_connect_by_path(val,';'),2) vals
 26  from nums
 27  where rn = nb
 28  connect by prior id = id and prior rn = rn-1
 29  start with rn = 1
 30  order by id
 31  /
        ID VALS
---------- ------------------------------
         1 987;123;345;567
         2 xyz;abc;cde;efg

2 rows selected.

Regards
Michel
Re: Eliminate duplicate emails in String [message #295692 is a reply to message #295687] Wed, 23 January 2008 01:59 Go to previous messageGo to next message
Michel Cadot
Messages: 64109
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Yes you are right I fixed it in my previous post.
It worked because the requirements for the problem I solved a couple of years ago were different: the string always ended with a ';'.
I will fix it in the previous one. [Done]

Regards
Michel

[Updated on: Wed, 23 January 2008 02:04]

Report message to a moderator

Re: Eliminate duplicate emails in String [message #295855 is a reply to message #295597] Wed, 23 January 2008 14:20 Go to previous messageGo to next message
Kevin Meade
Messages: 2101
Registered: December 1999
Location: Connecticut USA
Senior Member


well, since everyone else if offing their solutions, here is another one. I shortened your emails for easier reading.

Assuming your email list started out as 'abc;bca;abc;bca', the following query:

select substr(sys_connect_by_path(email_string, ';'),2) email_list
from (
       select rownum i,email_string
       from (
              select distinct
                     substr(      ';'||'abc;bca;abc;bca'||';'
                           ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
                           ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
                           ) email_string
              from dual
              connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
           )
     )
where connect_by_isleaf = 1
connect by prior i = i-1
start with i = 1
/

returns this:

EMAIL_LIST
---------------------------------------------------------------------
abc;bca


Kind of a neat abuse of features I think. No, I didn't figure this out for myself, I am not that smart. But I can google and a few years back I stumbled across these techniques on ASKTOMHOME and other sites.

For those looking for the quick points of interest:
Quote:


we use a FAT delimited string (eg. delimiters on both sides) to make the string manipulation easier. Hence 'abc;bca;abc;bca' become ';'||'abc;bca;abc;bca'||';'

we use the EASY COUNTER OF DELIMITERS method to count the number of delimiters in the string and thus know how many elements there are. This is the length math.

we use the CONNECT BY LEVEL trick to duplicate rows

do string operations to pluck out of our starter string, that email address which corresponds to the row-number of the duplicated rows we are looking at. Thus row 1 yeilds string 1, row 2 yields string 2, etc.

distinct results to remove duplicates

add a rownum (we labeled I) to the d-dupeed set or emails. We are going to use this to fake a hierarchy in a minute (ah the beauty of abusing Oracle, so satisfying).

fake a hierarchy using the bogus I=I-1

use sys_connect_by_path on this faked out hierarchy to produce a delimlited string on some column, in this case or list of de-duped emails.

start with the first row in our de-duped emails. This is the start with i = 1.

take only leaf nodes using the 10g function connect_by_isleaf which in this case since we started with the first row in our fake hierarchy, means we will have only one leaf node. This leaf node is our list of delimited emails.



If someone wanted to keep emails in their original order, then you could use a max analytic function in the inner query and rework the sql. I leave this as an exercise for the OP if there is interest enough, it would be a real cool piece of code.

Good luck, Kevin
Re: Eliminate duplicate emails in String [message #295857 is a reply to message #295855] Wed, 23 January 2008 14:48 Go to previous messageGo to next message
Mahesh P
Messages: 69
Registered: September 2004
Member
This is excellent guys. i can extract the code and use it as i needed.

thanks to all of you guys for your contribution.

this is a wonderful forum i have ever seen..

bye take care
Mahesh
Re: Eliminate duplicate emails in String [message #295922 is a reply to message #295597] Wed, 23 January 2008 22:56 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

More compace code ,

But seems to have one issue with the Output.

SQL> select substr(sys_connect_by_path(email_string, ';'),2) email_list
  2     from (
  3            select rownum i,email_string
  4            from (
  5                   select distinct
  6                          substr(      ';'||'abc;bca;abc;bca'||';'
  7                                ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
  8                                ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
  9                                ) email_string
 10                   from dual
 11                   connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
 12                )
 13          )
 14     where connect_by_isleaf = 1
 15     connect by prior i = i-1
 16     start with i = 1;

EMAIL_LIST
------------------------------
bca;bca

SQL>


Thumbs Up
Rajuvan

[Updated on: Wed, 23 January 2008 23:06]

Report message to a moderator

Re: Eliminate duplicate emails in String [message #296068 is a reply to message #295597] Thu, 24 January 2008 08:06 Go to previous messageGo to next message
Kevin Meade
Messages: 2101
Registered: December 1999
Location: Connecticut USA
Senior Member
Not for me, what did you do wrong? Also, what did you do with the statement when you cut/pasted it around. The indentation is not the same as what I posted earlier(not that this is a big deal), but it suggests someone may have changed something along the way. In any event, I run it, I get a good answer.

SQL> select substr(sys_connect_by_path(email_string, ';'),2) email_list
  2     from (
  3            select rownum i,email_string
  4            from (
  5                   select distinct
  6                          substr(      ';'||'abc;bca;abc;bca'||';'
  7                                ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
  8                                ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
  9                                ) email_string
 10                   from dual
 11                   connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
 12                )
 13          )
 14     where connect_by_isleaf = 1
 15     connect by prior i = i-1
 16     start with i = 1;

EMAIL_LIST
-----------------------------------------------------------------------------------------------------------------------------------------
abc;bca


1 row selected.

Kevin
Re: Eliminate duplicate emails in String [message #296071 is a reply to message #296068] Thu, 24 January 2008 08:48 Go to previous messageGo to next message
Michel Cadot
Messages: 64109
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
I got the following:
SQL> select substr(sys_connect_by_path(email_string, ';'),2) email_list
  2  from (
  3        select rownum i,email_string
  4        from (
  5              select distinct
  6                     substr(      ';'||'abc;bca;abc;bca'||';'
  7                           ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
  8                           ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1

  9                           ) email_string
 10              from dual
 11              connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
 12             )
 13      )
 14  where connect_by_isleaf = 1
 15  connect by prior i = i-1
 16  start with i = 1;
            from dual
                 *
ERROR at line 10:
ORA-01428: argument '0' is out of range

Regards
Michel
Re: Eliminate duplicate emails in String [message #296077 is a reply to message #295597] Thu, 24 January 2008 09:05 Go to previous messageGo to next message
Kevin Meade
Messages: 2101
Registered: December 1999
Location: Connecticut USA
Senior Member
needs a 10g database. Again, here is my original paste, not the one offered by rajavu1.

select substr(sys_connect_by_path(email_string, ';'),2) email_list
from (
       select rownum i,email_string
       from (
              select distinct
                     substr(      ';'||'abc;bca;abc;bca'||';'
                           ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
                           ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
                           ) email_string
              from dual
              connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
           )
     )
where connect_by_isleaf = 1
connect by prior i = i-1
start with i = 1
/

Kevin
Re: Eliminate duplicate emails in String [message #296157 is a reply to message #295597] Thu, 24 January 2008 22:03 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

Result remains same for me in 10g . Nothing is changed from you code.

SQL> select banner from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

SQL> select substr(sys_connect_by_path(email_string, ';'),2) email_list
  2  from (
  3         select rownum i,email_string
  4         from (
  5                select distinct
  6                       substr(      ';'||'abc;bca;abc;bca'||';'
  7                             ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)
+1
  8                             ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+
1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
  9                             ) email_string
 10                from dual
 11                connect by level <= length('abc;bca;abc;bca')-length(replace(
'abc;bca;abc;bca',';'))+1
 12             )
 13       )
 14  where connect_by_isleaf = 1
 15  connect by prior i = i-1
 16  start with i = 1
 17  /

EMAIL_LIST
--------------------------------------------------------------------------------

bca;bca

SQL>


Thumbs Up
Rajuvan
Re: Eliminate duplicate emails in String [message #296162 is a reply to message #295597] Thu, 24 January 2008 22:45 Go to previous messageGo to next message
Kevin Meade
Messages: 2101
Registered: December 1999
Location: Connecticut USA
Senior Member
Well, something must be different, cause it works just dandy for me. It could be something simple like line wraps in your editor or something (I note that your post once again does not match mine if only because of the nasty line wrapping going on), or it could be a true bug in your Oracle version. Again, here is my post:

SQL*Plus: Release 9.2.0.1.0 - Production on Thu Jan 24 23:26:29 2008

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.1.0.5.0 - 64bit Production
With the Partitioning and Data Mining options

SQL> select substr(sys_connect_by_path(email_string, ';'),2) email_list
  2  from (
  3         select rownum i,email_string
  4         from (
  5                select distinct
  6                       substr(      ';'||'abc;bca;abc;bca'||';'
  7                             ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
  8                             ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
  9                             ) email_string
 10                from dual
 11                connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
 12             )
 13       )
 14  where connect_by_isleaf = 1
 15  connect by prior i = i-1
 16  start with i = 1
 17  /

EMAIL_LIST
------------------------------------------------------------------------------------------------------------------------------------
abc;bca

1 row selected.

SQL> 

I might suggest the following approaches:

1) open a TAR with Oracle Corp.
2) write the query from scratch yourself instead of doing cut/paste of code. I have seen cases where editors hack things from time to time.
3) review the code for correctness. By this I mean, do the tried and true DESK CHECK that we all learned in programing 101. It might take you 20 minutes, but you should be able to satisfy yourself as to the answer you should get and thus see their either a) the query you are executing is hacked somehow, or b) the version of Oracle you are using has a bug.

Let me ask you this: are you posting your execution example back in the thread here because

1) you think the code I supplied is in error?

or

2) you know the code I supplied is correct and the answer shown in my posts is the right one, but you can't explain why when it runs on your system, you get a wrong answer?

The code I offered is good and provides the answer shown by me so you should be answering with #2. I am not trying to be rude here rajavu1. I am honestly interested in helping figure out what is happening. But, the problem is not that the code is wrong, it is that when you execute it on your system, for some reason you are getting the wrong answer. If you DESK CHECK the code you will confirm the answer you should be getting.

Good luck, Kevin
Re: Eliminate duplicate emails in String [message #296166 is a reply to message #295597] Thu, 24 January 2008 23:27 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

Actually I am not pretty sure how it is working . I got the desired result by changing the Connect by clause .

Here with connect by prior i = i-1

SQL> select banner from V$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

SQL> select substr(sys_connect_by_path(email_string, ';'),2) email_list
  2  from (
  3         select rownum i,email_string
  4         from (
  5                select distinct
  6                       substr(      ';'||'abc;bca;abc;bca'||';'
  7                             ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
  8                             ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
  9                             ) email_string
 10                from dual
 11                connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
 12             )
 13       )
 14  where connect_by_isleaf = 1
 15  connect by prior i = i-1
 16  start with i = 1;

EMAIL_LIST
------------------------------
bca;bca



Now with connect by i = prior i + 1

SQL> select substr(sys_connect_by_path(email_string, ';'),2) email_list
  2   from (
  3        select rownum i,email_string
  4         from (
  5                select distinct
  6                       substr(      ';'||'abc;bca;abc;bca'||';'
  7                             ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
  8                             ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
  9                             ) email_string
 10                from dual
 11                connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
 12             ))
 13     where   connect_by_isleaf = 1
 14     connect by  i = prior i + 1
 15  start with i = 1;

EMAIL_LIST
------------------------------
bca;abc

SQL>


Hmm.. Now I am interested to know whether this is working differently for you Smile

Thumbs Up
Rajuvan.

[Updated on: Thu, 24 January 2008 23:30]

Report message to a moderator

Re: Eliminate duplicate emails in String [message #296175 is a reply to message #296166] Fri, 25 January 2008 00:07 Go to previous messageGo to next message
Barbara Boehmer
Messages: 8631
Registered: November 2002
Location: California, USA
Senior Member
SCOTT@orcl_11g> select banner from v$version
  2  /

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE	11.1.0.6.0	Production
TNS for 32-bit Windows: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production

SCOTT@orcl_11g> -- original copied and pasted from Kevin's post:
SCOTT@orcl_11g> select substr(sys_connect_by_path(email_string, ';'),2) email_list
  2  from (
  3  	    select rownum i,email_string
  4  	    from (
  5  		   select distinct
  6  			  substr(      ';'||'abc;bca;abc;bca'||';'
  7  				,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
  8  				,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
  9  				) email_string
 10  		   from dual
 11  		   connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
 12  		)
 13  	  )
 14  where connect_by_isleaf = 1
 15  connect by prior i = i-1
 16  start with i = 1
 17  /
select substr(sys_connect_by_path(email_string, ';'),2) email_list
*
ERROR at line 1:
ORA-01428: argument '0' is out of range


SCOTT@orcl_11g> -- with modified connect by clause:
SCOTT@orcl_11g> select substr(sys_connect_by_path(email_string, ';'),2) email_list
  2  from (
  3  	    select rownum i,email_string
  4  	    from (
  5  		   select distinct
  6  			  substr(      ';'||'abc;bca;abc;bca'||';'
  7  				,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
  8  				,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
  9  				) email_string
 10  		   from dual
 11  		   connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
 12  		)
 13  	  )
 14  where connect_by_isleaf = 1
 15  connect by i = prior i+1
 16  start with i = 1
 17  /

EMAIL_LIST
--------------------------------------------------------------------------------
bca;abc

SCOTT@orcl_11g>

Re: Eliminate duplicate emails in String [message #296270 is a reply to message #295597] Fri, 25 January 2008 08:12 Go to previous messageGo to next message
Kevin Meade
Messages: 2101
Registered: December 1999
Location: Connecticut USA
Senior Member
way to go Rajuvan and Barbara. Who ever said I was not a dope once in a while. Such a stupid mistake on my part.

So, I guess we can say that in fact the code I supplied only worked on my box by accident because indeed it did have an error.

My thanks to all who corrected me. I would hate to have the OP move forward with a bad piece of work. I think we can put this one to bed.

Again thanks, Kevin (eating a bit of crow) Meade.
Re: Eliminate duplicate emails in String [message #296304 is a reply to message #296270] Fri, 25 January 2008 11:42 Go to previous messageGo to next message
Barbara Boehmer
Messages: 8631
Registered: November 2002
Location: California, USA
Senior Member
Kevin Meade wrote on Fri, 25 January 2008 06:12

So, I guess we can say that in fact the code I supplied only worked on my box by accident because indeed it did have an error.



Kevin,

There should not be any difference between

connect by prior i = i-1

and

connect by i = prior i+1

Your inner sub-query returns proper results and, if you substitute those results for the inner query, your outer query returns the proper results with either connect by clause. The problem is that when you combine the methods in your inner query and your outer query, the results become unstable.

Given different data and statistics, either one might produce an error or wrong results. It becomes dependent upon the execution plan.

I have provided a brief demonstration below, demonstrating the correct results of the inner query, the correct results with either connect by clause substituting the results of the inner query. I have also provided an additional query, using the sub-query factoring (with) clause, that returns the correct results with your original connect by clause. However, the point is that it may not always do so.

What is debatable is whether this constitutes a bug or an error or what. I believe many would probably say that this is one of the pitfalls warned about with these row generation techniques or tricks and such. I experimented a little with other row generation techniques and the undocumented materialize hint and a few other things but could not find a way to stabilize the plan for consistent results.



SCOTT@orcl_11g> select rownum i,email_string
  2  	    from (
  3  		   select distinct
  4  			  substr(      ';'||'abc;bca;abc;bca'||';'
  5  				,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
  6  				,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
  7  				) email_string
  8  		   from dual
  9  		   connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
 10  		)
 11  /

         I EMAIL_STRING
---------- -----------------
         1 bca
         2 abc

SCOTT@orcl_11g> select substr(sys_connect_by_path(email_string, ';'),2) email_list
  2  from (
  3  	    select 1 i, 'bca' email_string from dual
  4  	    union all
  5  	    select 2 i, 'abc' email_string from dual
  6  	   )
  7  where connect_by_isleaf = 1
  8  connect by prior i = i-1
  9  start with i = 1
 10  /

EMAIL_LIST
--------------------------------------------------------------------------------
bca;abc

SCOTT@orcl_11g> select substr(sys_connect_by_path(email_string, ';'),2) email_list
  2  from (
  3  	    select 1 i, 'bca' email_string from dual
  4  	    union all
  5  	    select 2 i, 'abc' email_string from dual
  6  	   )
  7  where connect_by_isleaf = 1
  8  connect by i = prior i+1
  9  start with i = 1
 10  /

EMAIL_LIST
--------------------------------------------------------------------------------
bca;abc

SCOTT@orcl_11g> with subquery as
  2  	  (
  3  	    select rownum i,email_string
  4  	    from (
  5  		   select distinct
  6  			  substr(      ';'||'abc;bca;abc;bca'||';'
  7  				,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
  8  				,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
  9  				) email_string
 10  		   from dual
 11  		   connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
 12  		)
 13  	  )
 14  select substr(sys_connect_by_path(email_string, ';'),2) email_list
 15  from  subquery
 16  where connect_by_isleaf = 1
 17  connect by prior i = i-1
 18  start with i = 1
 19  /

EMAIL_LIST
--------------------------------------------------------------------------------
abc;bca

SCOTT@orcl_11g> 
 







Re: Eliminate duplicate emails in String [message #296306 is a reply to message #295597] Fri, 25 January 2008 12:01 Go to previous messageGo to next message
Kevin Meade
Messages: 2101
Registered: December 1999
Location: Connecticut USA
Senior Member
hmm... OK I am confused now. I thought I had not placed the prior keyword in my code and that was what the issue had been.

However, as I review all the posts it seems this is not so. So, I will say the following:

1) the code I posted works
2) it works every time I run it
3) I have seen no evidence that the connect by level trick is faulty in any way.

So, I am sticking with my guns and saying, Nope, I don't buy it.

All the "reposts" of my code by people had something wrong with them in that they were not the original post of code I supplied. It may be the only thing they did was a series of cut/paste operations but in the end even this may have cracked their version of the code. Indeed, when I try to cut/paste some of the other versions of my code posted back by people, I can't do it because there seem to be hidden characters at the end of lines somewhere.

So, I am chalking all this up to "someone's editor put hidden special characters in the code somewhere".

I don't believe there is any instability in the connect by level technique. If you want to really be safe then wrap the connect by level in another layer of select as in:

select * from (select rownum from dual connect by level <= 3)

If you guys think this code don't work on your box for whatever reason, then you need to place a TAR in with METALINK cause this would be a bug, not an invalid use of an undocumented feature.

Still, I appreciate all the attention everyone has paid here, thanks, Kevin.
Re: Eliminate duplicate emails in String [message #296309 is a reply to message #296306] Fri, 25 January 2008 12:24 Go to previous messageGo to next message
Barbara Boehmer
Messages: 8631
Registered: November 2002
Location: California, USA
Senior Member
Kevin Meade wrote on Fri, 25 January 2008 10:01



All the "reposts" of my code by people had something wrong with them in that they were not the original post of code I supplied. It may be the only thing they did was a series of cut/paste operations but in the end even this may have cracked their version of the code. Indeed, when I try to cut/paste some of the other versions of my code posted back by people, I can't do it because there seem to be hidden characters at the end of lines somewhere.




Let's try to confirm that. Here is a copy and paste of your original code without any line numbers or anything. All I did was copy and paste it into a .sql file in notepad, run it to confirm that I got the error, then copy and paste it back. It produced an error whether I pasted it directly into SQL*Plus or into a .sql file and started it or spooled the results to a .txt file. Please run it and post the results. If you get the error that we are getting, then we know to look for those hidden characters. If you do not get an error and the code produces the expected results, then we know that it is version dependent. I am convinced that it is version dependent, because when I got the correct results, all I did was change the connect by clause, without altering any of the rest of your code.


select substr(sys_connect_by_path(email_string, ';'),2) email_list
from (
       select rownum i,email_string
       from (
              select distinct
                     substr(      ';'||'abc;bca;abc;bca'||';'
                           ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)+1
                           ,instr(';'||'abc;bca;abc;bca'||';',';',1,rownum+1)-instr(';'||'abc;bca;abc;bca'||';',';',1,rownum)-1
                           ) email_string
              from dual
              connect by level <= length('abc;bca;abc;bca')-length(replace('abc;bca;abc;bca',';'))+1
           )
     )
where connect_by_isleaf = 1
connect by prior i = i-1
start with i = 1
/


Re: Eliminate duplicate emails in String [message #296310 is a reply to message #296306] Fri, 25 January 2008 12:31 Go to previous messageGo to next message
Barbara Boehmer
Messages: 8631
Registered: November 2002
Location: California, USA
Senior Member
Kevin,

What operating system are you using? I am using Windows Vista. When you see differences in characters at ends of line, it is usually due to carriage returns used by different operating systems. They are usually some combination of chr(10), chr(12), chr(13).



Re: Eliminate duplicate emails in String [message #296311 is a reply to message #295597] Fri, 25 January 2008 12:33 Go to previous messageGo to next message
Kevin Meade
Messages: 2101
Registered: December 1999
Location: Connecticut USA
Senior Member
OK, I cut/paste it from your post, it works find and gives the correct answer. So which version works for you? I ran both versions of connect by and it works the same for me.

Do you think you will open a TAR? You have an easily reproducable test case so they should take it pretty quick.

Kevin

Re: Eliminate duplicate emails in String [message #296312 is a reply to message #295597] Fri, 25 January 2008 12:40 Go to previous messageGo to next message
Kevin Meade
Messages: 2101
Registered: December 1999
Location: Connecticut USA
Senior Member
database is on some flavor os sun/os

sqlplus is on windows xp

Kevin
Re: Eliminate duplicate emails in String [message #296330 is a reply to message #296311] Fri, 25 January 2008 16:33 Go to previous messageGo to next message
Barbara Boehmer
Messages: 8631
Registered: November 2002
Location: California, USA
Senior Member
Kevin Meade wrote on Fri, 25 January 2008 10:33

So which version works for you?



I already demonstrated that in my previous posts.

Kevin Meade wrote on Fri, 25 January 2008 10:33

Do you think you will open a TAR?




No. I am retired, so I do not have support or access to Metalink. I see that Michel Cadot also got the same error as I did. Perhaps he will.

Re: Eliminate duplicate emails in String [message #296332 is a reply to message #296311] Fri, 25 January 2008 17:47 Go to previous message
Barbara Boehmer
Messages: 8631
Registered: November 2002
Location: California, USA
Senior Member
I was trying to narrow down where the problem is and produce the simplest possible test case. It seems to only happen with the combination of sys_connect_by_path, row generation using connect by level in an inner sub-query, and "connect by prior ... = ..." instead of "connect by ... = prior ...". The first two queries below worked fine. The third seemed to hang and I ended up killing the session.


SCOTT@orcl_11g> select	sys_connect_by_path (rn, ',') scbp
  2  from    (select rownum rn
  3  	      from   user_objects
  4  	      where  rownum <= 2)
  5  start   with rn = 1
  6  connect by prior rn = rn - 1
  7  /

SCBP
--------------------------------------------------------------------------------
,1
,1,2

SCOTT@orcl_11g> select	sys_connect_by_path (rn, ',') scbp
  2  from    (select rownum rn
  3  	      from   dual
  4  	      connect by level <= 2)
  5  start   with rn = 1
  6  connect by rn = prior rn + 1
  7  /

SCBP
--------------------------------------------------------------------------------
,1
,1,2

SCOTT@orcl_11g> select	sys_connect_by_path (rn, ',') scbp
  2  from    (select rownum rn
  3  	      from   dual
  4  	      connect by level <= 2)
  5  start   with rn = 1
  6  connect by prior rn = rn - 1
  7  /
         from   dual
                *
ERROR at line 3:
ORA-00028: your session has been killed


ERROR:
ORA-03114: not connected to ORACLE


SCOTT@orcl_11g> 


Previous Topic: comments on approach
Next Topic: How to incorporatet file that will be written to AS using DS package
Goto Forum:
  


Current Time: Mon Dec 05 06:41:23 CST 2016

Total time taken to generate the page: 0.08974 seconds