Home » SQL & PL/SQL » SQL & PL/SQL » REGEXP_REPLACE to remove non-native english characters (Oracle 11g)
REGEXP_REPLACE to remove non-native english characters [message #632436] Fri, 30 January 2015 15:59 Go to next message
wtolentino
Messages: 398
Registered: March 2005
Senior Member
we are receiving data feeds from other source. this data feed has some characters that we our system cannot recognize. so we built a function to remove them.

replace(REGEXP_REPLACE (pStr,'[:alpha:][:alnum:][:graph:][:punct:][:upper:]'),chr(191),'')


the code above does not work when there is characters in non-native english. for example é and Ž. how do i include it on the regexp_replace? please advise. thank you.

warren
Re: REGEXP_REPLACE to remove non-native english characters [message #632441 is a reply to message #632436] Sat, 31 January 2015 00:43 Go to previous messageGo to next message
Michel Cadot
Messages: 68645
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Don't use REGEXP_REPLACE, use TRANSLATE.

Re: REGEXP_REPLACE to remove non-native english characters [message #632455 is a reply to message #632436] Sat, 31 January 2015 12:31 Go to previous message
Solomon Yakobson
Messages: 3273
Registered: January 2010
Location: Connecticut, USA
Senior Member
Oracle regular expressions are multilingual. [:alpha:] is multilingual and will cover é and Ž ([A-Za-z] will not):

SQL> select '[' || regexp_replace('žolė','[[:alpha:]]') || ']','[' || regexp_replace('žolė','[a-z]') || ']' from dual;

'['||REGEXP_REPLACE('ŽOLE','[[:ALPHA:]]')||']'       '['||REGEXP_REPLACE('ŽOLE','[A-Z]')||']'       
---------------------------------------------------- ---------------------------------------------- 
[]                                                   [žė]                                            
 
SQL>


I believe your issue in bad pattern. Pattern should be [[:alpha]], not [:alpha]:

SQL> select '[' || regexp_replace('žolė','[:alpha:]') || ']','[' || regexp_replace('žolė','[a-z]') || ']' from dual;

'['||REGEXP_REPLACE('ŽOLE','[[:ALPHA:]]')||']'       '['||REGEXP_REPLACE('ŽOLE','[A-Z]')||']'       
---------------------------------------------------- ---------------------------------------------- 
[žoė]                                                   [žė]                                            
 
SQL>


SY.
Previous Topic: SQL Loader ; SQL
Next Topic: Quicksort - problem with null values
Goto Forum:
  


Current Time: Thu Apr 25 18:31:31 CDT 2024