Home » SQL & PL/SQL » SQL & PL/SQL » Handling Special Characters in oracle
Handling Special Characters in oracle [message #219478] Wed, 14 February 2007 10:46 Go to next message
vikasvirgo169
Messages: 1
Registered: February 2007
Junior Member
Hi,
I am facing problems in replacing some special characters which got inserted in the my oracle tables.These typically got inserted for symbols like copyright, registered etc. and the new ones include ¿ etc or ascii equivalents like /00BF, /00AE and /FFFD. But i am unable to replace them back to original character(copyright ones) by using scripts.Is character set encoding coming into picture? Can anybody help? I can provide more details if required.
Re: Handling Special Characters in oracle [message #220462 is a reply to message #219478] Tue, 20 February 2007 14:59 Go to previous message
andrew again
Messages: 2577
Registered: March 2000
Senior Member
Be aware that what you see on the screen is often not accurate compared to what's really stored in the DB becuase you client NLS characterset and/or OS codepage aren't necessarily the same as your database. Use "select dump(column_name, 1010)..." or dump(column_name, 1016) to see qhat's really stored in the database.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.6.0 - Production

SQL> -- make sure we can store UTF8
SQL> select parameter, value, banner
  2    from v$nls_parameters a, v$version
  3    where parameter like 'NLS_CHARACTERSET%'
  4    and banner like '%Enterprise%';

PARAMETER
----------------------------------------------------------------
VALUE
----------------------------------------------------------------
BANNER
----------------------------------------------------------------
NLS_CHARACTERSET
AL32UTF8
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production

SQL> CREATE OR REPLACE FUNCTION ascii_only (p_txt IN VARCHAR2)
  2     RETURN VARCHAR2
  3  IS
  4     v_tmp     VARCHAR2 (32767);
  5     v_clean   VARCHAR2 (32767);
  6     v_char    VARCHAR2 (3 BYTE);
  7  BEGIN
  8     FOR i IN 1 .. LENGTH (p_txt) LOOP
  9        v_char := SUBSTR (p_txt, i, 1);
 10
 11        -- choose your range of valid characters
 12        -- Ascii only looks at first BYTE of character
 13        -- (note sure if this is a reliable way to detect UTF8 multi-byte characters)
 14        -- (but it seems to work ok on my 10000 test records)
 15        IF    (ASCII (v_char) BETWEEN 32 AND 127)
 16           OR (ASCII (v_char) IN (9, 10, 13)) THEN
 17           v_clean := v_clean || v_char;
 18        END IF;
 19     END LOOP;
 20
 21     IF LENGTH (v_clean) != LENGTH (p_txt) THEN
 22        DBMS_OUTPUT.put_line ('removed '||TO_CHAR(LENGTH(p_txt) - LENGTH(v_clean))||' characters');
 23     END IF;
 24
 25     RETURN v_clean;
 26  END;
 27  /

Function created.


SQL> create table utf8_data(col1 varchar2(20));

Table created.

SQL>
SQL> -- http://www.unicode.org/Public/3.0-Update/NamesList-3.0.0.txt
SQL> -- Euro is U+20AC, Small Greek Gamma U+03B3
SQL> insert into utf8_data values ('hello '||unistr('\20AC')||unistr('\03B3')||chr(10)||'world');

1 row created.

SQL>
SQL> set serveroutput on size 1000000
SQL>
SQL> select ascii_only(col1) from utf8_data;

ASCII_ONLY(COL1)
--------------------------------------------------------------------------------

hello
world


SQL>
SQL> begin
  2   DBMS_OUTPUT.put_line ( 'just flushing buffer...');
  3  end;
  4  /
removed 2 characters
just flushing buffer...

PL/SQL procedure successfully completed.

Previous Topic: Reach the highest PLSQL performances....
Next Topic: How should I re-write my procedure to make the update work?
Goto Forum:
  


Current Time: Mon Dec 09 19:47:29 CST 2024