Home » SQL & PL/SQL » SQL & PL/SQL » Special character issues
Special character issues [message #631987] Thu, 22 January 2015 16:03 Go to next message
raopatwariyahoocom
Messages: 39
Registered: October 2011
Location: GA
Member
Hi guys I am bit caught in between, I have some requirement like

Basically, my requirement is to take a string say, 'abc', and remove all but 4 identified special characters. Those characters would be '_', '-', '~', '.', and '^'. These four we want to leave in. The rest, we want to replace with a dash (-).

So, 'abc*' would be 'abc-'. And 'abc~1@' would become 'abc~1-'.

Select 'abc*' c1, 'abc~1@' c2 from dual;

Result will be

SQL> 'abc-' 'abc~1-'

Thanks in advance
Re: Special character issues [message #631988 is a reply to message #631987] Thu, 22 January 2015 17:39 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9092
Registered: November 2002
Location: California, USA
Senior Member
SCOTT@orcl12c> SELECT REGEXP_REPLACE ('abc*', '[^0-9A-Za-z _-~.^]', '-'),
  2  	    REGEXP_REPLACE ('abc~1@', '[^0-9A-Za-z _-~.^]', '-')
  3  FROM   DUAL
  4  /

REGE REGEXP
---- ------
abc- abc~1-

1 row selected.

icon9.gif  Re: Special character issues [message #631990 is a reply to message #631988] Thu, 22 January 2015 18:16 Go to previous messageGo to next message
raopatwariyahoocom
Messages: 39
Registered: October 2011
Location: GA
Member
Thank you very much Barbara. That worked
Re: Special character issues [message #631998 is a reply to message #631987] Fri, 23 January 2015 01:36 Go to previous messageGo to next message
Lalit Kumar B
Messages: 3174
Registered: May 2013
Location: World Wide on the Web
Senior Member
raopatwariyahoocom wrote on Fri, 23 January 2015 03:33
Those characters would be '_', '-', '~', '.', and '^'. These four we want to leave in. The rest, we want to replace with a dash (-).


Your requirement is not clear. Should it replace one or more occurrences together or each at a time?

Barbara's solution would replace each character with '-'. If you want one or more occurrences to be replaced together then you need to add '+'.

SQL> WITH DATA AS
  2    ( SELECT 'abc*!@#$' c1 FROM dual
  3    UNION ALL
  4    SELECT 'abc~1@#$*' FROM dual
  5    )
  6  SELECT c1,
  7    regexp_replace(c1, '[^[:alnum:]_-~.^]+','-') c1_ALL
  8  FROM DATA
  9  /

C1         C1_ALL
---------- ----------
abc*!@#$   abc-
abc~1@#$*  abc~1-

SQL>
Re: Special character issues [message #631999 is a reply to message #631987] Fri, 23 January 2015 02:38 Go to previous messageGo to next message
Michel Cadot
Messages: 68648
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Or with TRANSLATE:
SQL> with 
  2    data as (
  3      Select 'abc*' v from dual union all
  4      select 'abc~1@' from dual
  5    )
  6  select v, 
  7         translate (v, 
  8           'A'||translate (v, 
  9                  '#ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-~.^',
 10                  '#'),
 11           'A'||rpad('-',255,'-')) new_v
 12  from data
 13  /
V      NEW_V
------ ------
abc*   abc-
abc~1@ abc~1-

Harder to read but faster:
SQL> create table t as 
  2   Select 'abc*' v from dual union all
  3   select 'abc~1@' from dual
  4  /

Table created.

SQL> declare
  2    startt pls_integer;
  3  begin
  4    startt := dbms_utility.get_time();
  5    for i in 1..100000 loop
  6      for rec in (
  7        select regexp_replace (v, '[^0-9A-Za-z _-~.^]', '-') from t
  8      ) loop
  9        null;
 10      end loop;
 11    end loop;
 12    dbms_output.put_line('regexp_replace -> '||((dbms_utility.get_time()-startt)/100)||'s');
 13    startt := dbms_utility.get_time();
 14    for i in 1..100000 loop
 15      for rec in (
 16        select translate (v, 
 17                 'A'||translate (v, 
 18                        '#ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-~.^',
 19                        '#'),
 20                 'A'||rpad('-',255,'-')) new_v
 21        from t
 22      ) loop
 23        null;
 24      end loop;
 25    end loop;
 26    dbms_output.put_line('translate      -> '||((dbms_utility.get_time()-startt)/100)||'s');
 27  end;
 28  /
regexp_replace -> 7.36s
translate      -> 5.73s

PL/SQL procedure successfully completed.

[Updated on: Fri, 23 January 2015 03:01]

Report message to a moderator

Re: Special character issues [message #632000 is a reply to message #631999] Fri, 23 January 2015 02:57 Go to previous messageGo to next message
Lalit Kumar B
Messages: 3174
Registered: May 2013
Location: World Wide on the Web
Senior Member
You seem to have missed a comma at the end of line 18.

Anyway, on 12.1.0.1 with 8 GB RAM, the difference is tiny.

SQL> select banner from v$version where rownum = 1;

BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production

SQL> DECLARE
  2    startt pls_integer;
  3  BEGIN
  4    startt := dbms_utility.get_time();
  5    FOR i IN 1..100000
  6    LOOP
  7      FOR rec IN
  8      ( SELECT regexp_replace (v, '[^0-9A-Za-z _-~.^]', '-') FROM t
  9      )
 10      LOOP
 11        NULL;
 12      END LOOP;
 13    END LOOP;
 14    dbms_output.put_line('regexp_replace -> '||((dbms_utility.get_time()-startt)/100)||'s');
 15    startt := dbms_utility.get_time();
 16    FOR i IN 1..100000
 17    LOOP
 18      FOR rec IN
 19      (SELECT TRANSLATE (v,'A'
 20        ||TRANSLATE (v, '#ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-~.^','#'),'A'
 21        ||rpad('-',255,'-')) new_v
 22      FROM t
 23      )
 24      LOOP
 25        NULL;
 26      END LOOP;
 27    END LOOP;
 28    dbms_output.put_line('translate      -> '||((dbms_utility.get_time()-startt)/100)||'s');
 29  END;
 30  /
regexp_replace -> 2.8s
translate      -> 2.18s

PL/SQL procedure successfully completed.

SQL>


And using the ALNUM character class instead of [0-9A-Za-z], the difference is tinier.

SQL> DECLARE
  2    startt pls_integer;
  3  BEGIN
  4    startt := dbms_utility.get_time();
  5    FOR i IN 1..100000
  6    LOOP
  7      FOR rec IN
  8      ( SELECT regexp_replace (v, '[^[:alnum:]_-~.^]', '-') FROM t
  9      )
 10      LOOP
 11        NULL;
 12      END LOOP;
 13    END LOOP;
 14    dbms_output.put_line('regexp_replace -> '||((dbms_utility.get_time()-startt)/100)||'s');
 15    startt := dbms_utility.get_time();
 16    FOR i IN 1..100000
 17    LOOP
 18      FOR rec IN
 19      (SELECT TRANSLATE (v,'A'
 20        ||TRANSLATE (v, '#ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-~.^','#'),'A'
 21        ||rpad('-',255,'-')) new_v
 22      FROM t
 23      )
 24      LOOP
 25        NULL;
 26      END LOOP;
 27    END LOOP;
 28    dbms_output.put_line('translate      -> '||((dbms_utility.get_time()-startt)/100)||'s');
 29  END;
 30  /
regexp_replace -> 2.63s
translate      -> 2.15s

PL/SQL procedure successfully completed.

SQL>


But yes, if we scale up further, the difference might just increase.
Re: Special character issues [message #632001 is a reply to message #632000] Fri, 23 January 2015 03:00 Go to previous message
Michel Cadot
Messages: 68648
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Quote:
You seem to have missed a comma at the end of line 18.


Just a copy and paste issue (otherwise there'd be an error). [fixed]

So, it depends on OP's version.

[Updated on: Fri, 23 January 2015 03:01]

Report message to a moderator

Previous Topic: Help with REGEXP
Next Topic: Error While Fetching Data Through Database Link In Oracle
Goto Forum:
  


Current Time: Thu May 09 07:01:57 CDT 2024