Handling Special Characters in oracle [message #219478] |
Wed, 14 February 2007 10:46 |
vikasvirgo169
Messages: 1 Registered: February 2007
|
Junior Member |
|
|
Hi,
I am facing problems in replacing some special characters which got inserted in the my oracle tables.These typically got inserted for symbols like copyright, registered etc. and the new ones include ¿ etc or ascii equivalents like /00BF, /00AE and /FFFD. But i am unable to replace them back to original character(copyright ones) by using scripts.Is character set encoding coming into picture? Can anybody help? I can provide more details if required.
|
|
|
Re: Handling Special Characters in oracle [message #220462 is a reply to message #219478] |
Tue, 20 February 2007 14:59 |
andrew again
Messages: 2577 Registered: March 2000
|
Senior Member |
|
|
Be aware that what you see on the screen is often not accurate compared to what's really stored in the DB becuase you client NLS characterset and/or OS codepage aren't necessarily the same as your database. Use "select dump(column_name, 1010)..." or dump(column_name, 1016) to see qhat's really stored in the database.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.6.0 - Production
SQL> -- make sure we can store UTF8
SQL> select parameter, value, banner
2 from v$nls_parameters a, v$version
3 where parameter like 'NLS_CHARACTERSET%'
4 and banner like '%Enterprise%';
PARAMETER
----------------------------------------------------------------
VALUE
----------------------------------------------------------------
BANNER
----------------------------------------------------------------
NLS_CHARACTERSET
AL32UTF8
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
SQL> CREATE OR REPLACE FUNCTION ascii_only (p_txt IN VARCHAR2)
2 RETURN VARCHAR2
3 IS
4 v_tmp VARCHAR2 (32767);
5 v_clean VARCHAR2 (32767);
6 v_char VARCHAR2 (3 BYTE);
7 BEGIN
8 FOR i IN 1 .. LENGTH (p_txt) LOOP
9 v_char := SUBSTR (p_txt, i, 1);
10
11 -- choose your range of valid characters
12 -- Ascii only looks at first BYTE of character
13 -- (note sure if this is a reliable way to detect UTF8 multi-byte characters)
14 -- (but it seems to work ok on my 10000 test records)
15 IF (ASCII (v_char) BETWEEN 32 AND 127)
16 OR (ASCII (v_char) IN (9, 10, 13)) THEN
17 v_clean := v_clean || v_char;
18 END IF;
19 END LOOP;
20
21 IF LENGTH (v_clean) != LENGTH (p_txt) THEN
22 DBMS_OUTPUT.put_line ('removed '||TO_CHAR(LENGTH(p_txt) - LENGTH(v_clean))||' characters');
23 END IF;
24
25 RETURN v_clean;
26 END;
27 /
Function created.
SQL> create table utf8_data(col1 varchar2(20));
Table created.
SQL>
SQL> -- http://www.unicode.org/Public/3.0-Update/NamesList-3.0.0.txt
SQL> -- Euro is U+20AC, Small Greek Gamma U+03B3
SQL> insert into utf8_data values ('hello '||unistr('\20AC')||unistr('\03B3')||chr(10)||'world');
1 row created.
SQL>
SQL> set serveroutput on size 1000000
SQL>
SQL> select ascii_only(col1) from utf8_data;
ASCII_ONLY(COL1)
--------------------------------------------------------------------------------
hello
world
SQL>
SQL> begin
2 DBMS_OUTPUT.put_line ( 'just flushing buffer...');
3 end;
4 /
removed 2 characters
just flushing buffer...
PL/SQL procedure successfully completed.
|
|
|