Home » SQL & PL/SQL » SQL & PL/SQL » Unicode support in the where clause of DELETE statement
Unicode support in the where clause of DELETE statement [message #268158] Mon, 17 September 2007 17:33 Go to next message
sharan_it
Messages: 140
Registered: July 2005
Location: Chennai
Senior Member
In my application, they are inserting special characters (German or French lettters) as tag name and they are added into the table using addUnicodeParamURL in which it translates
special character to their unicode values and insert into the database (Am not aware of how they are doing this).

But, for deletion prospective we are having the statement like
delete from table_name where tag_name = parameter_value;

Here if the parameter comes like Umlaut '', the database didnt understand and gives the error,

SQL*Plus: Release 9.0.1.0.1 - Production on Tue Sep 18 02:53:27 2007

(c) Copyright 2001 Oracle Corporation. All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.2.0.7.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.7.0 - Production

SQL> Delete from table_name where tag_name = 'Umlaut ''';
ERROR:
ORA-01756: quoted string not properly terminated.

Now its needed to support unicode so that it can delete the records if German or French character
comes in the where clause.

In the NLS_INSTANCE_PARAMETERS and NLS_sESSION_PARAMETERS, we are having NLS_LANGUAGE = AMERICAN,
NLS_TERRITORY = AMERICA AND NLS_NCHAR_CONV_EXCP = FALSE


In the V$NLS_PARAMETERS, we are having NLS_NCHAR_CHARACTERSET = AL16UTF16,NLS_CHARACTERSET = WE8ISO8859P1,
NLS_LANGUAGE = AMERICAN,NLS_TERRITORY = AMERICA AND NLS_NCHAR_CONV_EXCP = FALSE

SQL>@.[%NLS_LANG%].

Gives something like:

Unable to open file.[AMERICAN_AMERICA.WE8ISO8859P1].


Does it mean the parameter NLS_LANG is also not set in the registry??

I added a environment variable as NLS_LANG with the variable value AMERICAN_AMERICA.UTF8 and restarted
the machine to overwrite the NLS_LANG in the client to support Unicode. But it didnt works.

Pls expert help on this...
Re: Unicode support in the where clause of DELETE statement [message #268160 is a reply to message #268158] Mon, 17 September 2007 18:14 Go to previous messageGo to next message
BlackSwan
Messages: 25050
Registered: January 2009
Location: SoCal
Senior Member
If you REALLY need answers, they can be found here if you are willing to RTFM.
http://download.oracle.com/docs/cd/B10501_01/server.920/a96529/toc.htm
Re: Unicode support in the where clause of DELETE statement [message #268166 is a reply to message #268160] Mon, 17 September 2007 18:31 Go to previous messageGo to next message
sharan_it
Messages: 140
Registered: July 2005
Location: Chennai
Senior Member
Hi Ana,

I had already read this document and i didnt get the clear idea to proceed.

In the document the solutions given like:
To create a database with the AL32UTF8 character set, use the CREATE DATABASE statement and include the CHARACTER SET AL32UTF8 clause.
or
Creating a Database with a National Character Set

An alternative to storing Unicode data in the database is to use the SQL NCHAR datatypes (NCHAR, NVARCHAR, NCLOB).

But, database exists there already, then what is the need to create a database with characterset?? Setting NLS_LANG
is enough right??


Also its given as, "Oracle9i provides two solutions to store Unicode characters in the database:
a Unicode database solution and a Unicode datatype solution.
After you select the Unicode database solution, the Unicode datatype solution or a combination of both,
determine the character set to be used in the Unicode database or the Unicode datatype."
--I cant understand this..Can u explain??

Hope am looking for the second solution..

what are multiexers??

how to execute these..

ctx_ddl.create_preference('english_lexer', 'basic_lexer');
ctx_ddl.set_attribute('english_lexer','index_themes','yes');
ctx_ddl.create_preference('german_lexer', 'basic_lexer');
ctx_ddl.set_attribute('german_lexer','composite','german');
ctx_ddl.set_attribute('german_lexer','alternate_spelling','german');
ctx_ddl.set_attribute('german_lexer','mixed_case','yes');
ctx_ddl.create_preference('japanese_lexer', 'JAPANESE_VGRAM_LEXER');
ctx_ddl.create_preference('global_lexer','multi_lexer');

Will Oracle Intermedia Text support multilingual?? How it supports and what we have do for tis..we have to install
Intermedia Text seperately?? Am able to understand and proceed with the document..

Re: Unicode support in the where clause of DELETE statement [message #268168 is a reply to message #268160] Mon, 17 September 2007 18:44 Go to previous messageGo to next message
sharan_it
Messages: 140
Registered: July 2005
Location: Chennai
Senior Member
In the document the solution given as below

Unicode Solution with Unicode Datatypes
A European company that runs its applications mainly on Windows platforms wants to add new Windows applications written in Visual C/C++. The new applications will use the existing database to support Japanese and Chinese customer names. The company currently has the following system configuration:
• The existing database has a database character set of WE8ISO8859P1.
• All character data in the existing database is composed of Western European characters.
• The database is around 50 GB.
A typical solution is take the following actions:
• Use NCHAR and NVARCHAR2 datatypes to store Unicode characters
• Keep WE8ISO8859P1 as the database character set
• Use AL16UTF16 as the national character set
The reasons for this solution are:
• Migrating the existing database to a Unicode database required data conversion because the database character set is WE8ISO8859P1 (a Latin-1 character set), which is not a subset of UTF8. As a result, there would be some overhead in converting the data to UTF8.
• The additional languages are supported in new applications only. They do not depend on the existing applications or schemas. It is simpler to use the Unicode datatype in the new schema and keep the existing schemas unchanged.
• Only customer name columns require Unicode support. Using a single NCHAR column meets the customer's requirements without migrating the entire database.
• Because the languages to be supported are mostly Asian languages, AL16UTF16 should be used as the national character set so that disk space is used more efficiently.
• The lengths of the SQL NCHAR datatypes are defined as number of characters. This is the same as the way they are treated when using wchar_t strings in Windows C/C++ programs. This reduces programming complexity.
• Existing applications using the existing schemas are unaffected.

This example matches perfectly with our application and as like the solution given we have WE8ISO8859P1 as the database character set and AL16UTF16 as the national character set in the nls_database_parameters then why its not working?? any idea???

what about "NCHAR and NVARCHAR2 datatypes to store Unicode characters"???
Re: Unicode support in the where clause of DELETE statement [message #268169 is a reply to message #268158] Mon, 17 September 2007 18:47 Go to previous messageGo to next message
sharan_it
Messages: 140
Registered: July 2005
Location: Chennai
Senior Member
AS NLS_SAVED_NCHAR_CS is WE8ISO8859P1 ITS TAKING THAT VALUE INSTEAD OF NLS_NCHAR_NCHAR_CS??
Re: Unicode support in the where clause of DELETE statement [message #268170 is a reply to message #268158] Mon, 17 September 2007 18:51 Go to previous messageGo to next message
sharan_it
Messages: 140
Registered: July 2005
Location: Chennai
Senior Member
But our application is java on Oracle 9i.
Re: Unicode support in the where clause of DELETE statement [message #268171 is a reply to message #268158] Mon, 17 September 2007 18:53 Go to previous messageGo to next message
BlackSwan
Messages: 25050
Registered: January 2009
Location: SoCal
Senior Member
I am NOT an expert on "GLOBALIZATION".
I created a new DB with UTF-8 & used DBLINK to transfer data into it.
Re: Unicode support in the where clause of DELETE statement [message #268172 is a reply to message #268166] Mon, 17 September 2007 18:54 Go to previous messageGo to next message
andrew again
Messages: 2577
Registered: March 2000
Senior Member
Your database characterset is WE8ISO8859P1 (stands for Western European ISO 8859 point 1) and it supports the '' character. Unicode is a much larger characterset that is supposed to be a superset of almost everything else. It supports more characters by expanding above the use of 1 byte per character. UTF8 is a common encoding of Unicode used in Oracle. Unicode assigns a code point (ID number) to each character, but doesn't define what bytes represent that character. The UTF8 is one of several Unicode encoding schemes and defines what bytes are used to store each character.

http://en.wikipedia.org/wiki/ISO_8859-1

Your error is caused by trying to use embedded quotes.

Try something like this:
select tag_name, dump(tag_name, 1010), dump(tag_name, 1016) from table_name where tag_name like '%%';

[Updated on: Mon, 17 September 2007 18:58]

Report message to a moderator

Re: Unicode support in the where clause of DELETE statement [message #268173 is a reply to message #268172] Mon, 17 September 2007 19:17 Go to previous messageGo to next message
andrew again
Messages: 2577
Registered: March 2000
Senior Member
I've just seen the subsequent replies above.
ISO-1 (8859-1) IS a subset of UTF8.

Regular 7 bit ASCII characters will still be stored in 1 byte. Extended characters will be stored in mutilple bytes. UTF-8 doesn't use the range 80-FF.

Storing all your data in UTF8 makes the environment setup trivial - I'm not so sure about NCHAR support. Make sure your drivers support both similtaneously.

The only way to calculate how much more space would be used by converting your database to UTF-8 would be to profile your data to see how often non-7bit-ascii characters occur. An easier way is to just try the conversion on a subset of data. UTF-8 (AL32UTF8) is the default database characterset on all recent Unix database I've installed.

AL32UTF8 includes fixes for errors found in the original UTF8 used in 8i.

Re: Unicode support in the where clause of DELETE statement [message #268174 is a reply to message #268158] Mon, 17 September 2007 19:17 Go to previous messageGo to next message
sharan_it
Messages: 140
Registered: July 2005
Location: Chennai
Senior Member
select tag_name, dump(tag_name, 1010), dump(tag_name, 1016) from table_name where tag_name like '%%';

returns null values..

Can u tell me the idea behind dump and these numbers 1010,1016.. i didnt understand

Re: Unicode support in the where clause of DELETE statement [message #268175 is a reply to message #268158] Mon, 17 September 2007 19:23 Go to previous messageGo to next message
sharan_it
Messages: 140
Registered: July 2005
Location: Chennai
Senior Member
i created a temp table..

CREATE TABLE occ_tag_Group_temp(Tag_name NVARCHAR2(20));

and tried to insert the same value

INSERT INTO occ_Tag_group_temp VALUES('aut ''');

i got the same error...
ORA-01756: quoted string not properly terminated
Re: Unicode support in the where clause of DELETE statement [message #268176 is a reply to message #268158] Mon, 17 September 2007 19:27 Go to previous messageGo to next message
BlackSwan
Messages: 25050
Registered: January 2009
Location: SoCal
Senior Member
SELECT ASCIISTR('ABCDE') FROM DUAL;

ASCIISTR('
----------
AB\00C4CDE


Honestly I am NOT sure what the actual problem is, how to reproduce it or know when it has been solved.
Re: Unicode support in the where clause of DELETE statement [message #268178 is a reply to message #268176] Mon, 17 September 2007 19:39 Go to previous messageGo to next message
andrew again
Messages: 2577
Registered: March 2000
Senior Member
You often can't rely on what you see on your screen to determine what the real stored character is becuase your client codepage setting, font, client characterset capability and ORACLE NLS setting all have an effect of what you see. dump allows you to see the decimal (1010) or hex (1016) representation of the bytes stored. You chech those against your characterset encoding to make sure the byte(s) stored correctly represent your character.

My 9i database uses AL16UTF16 as the NCHAR characterset.
SQL*Plus: Release 9.2.0.6.0 - Production on Mon Sep 17 17:32:37 2007

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.8.0 - Production


Table dropped.

SQL> 
SQL> CREATE TABLE occ_tag_Group_temp(Tag_name NVARCHAR2(20));

Table created.

SQL> 
SQL> INSERT INTO occ_Tag_group_temp VALUES('aut ''');
INSERT INTO occ_Tag_group_temp VALUES('aut ''')
                                            *
ERROR at line 1:
ORA-00917: missing comma


SQL> 
SQL> INSERT INTO occ_Tag_group_temp VALUES('aut ''''');

1 row created.

SQL> 
SQL> select tag_name, dump(tag_name, 1010) dec_dump, dump(tag_name, 1016) hex_dump from occ_Tag_grou

TAG_NAME
--------------------
DEC_DUMP
--------------------------------------------------------------------------------
HEX_DUMP
--------------------------------------------------------------------------------
aut ''
Typ=1 Len=14 CharacterSet=AL16UTF16: 0,97,0,117,0,116,0,32,0,39,0,214,0,39
Typ=1 Len=14 CharacterSet=AL16UTF16: 0,61,0,75,0,74,0,20,0,27,0,d6,0,27


SQL> 
SQL> 


Now using database characterset (using VARCHAR2 datatype not NVARCHAR2 dataype):
SQL> CREATE TABLE occ_tag_Group_temp(Tag_name VARCHAR2(20));

Table created.

SQL> 
SQL> INSERT INTO occ_Tag_group_temp VALUES('aut ''');
INSERT INTO occ_Tag_group_temp VALUES('aut ''')
                                            *
ERROR at line 1:
ORA-00917: missing comma


SQL> 
SQL> INSERT INTO occ_Tag_group_temp VALUES('aut ''''');

1 row created.

SQL> 
SQL> select tag_name, dump(tag_name, 1010) dec_dump, dump(tag_name, 1016) hex_dump from occ_Tag_grou

TAG_NAME
--------------------
DEC_DUMP
--------------------------------------------------------------------------------
HEX_DUMP
--------------------------------------------------------------------------------
aut ''
Typ=1 Len=8 CharacterSet=AL32UTF8: 97,117,116,32,39,195,150,39
Typ=1 Len=8 CharacterSet=AL32UTF8: 61,75,74,20,27,c3,96,27


Re: Unicode support in the where clause of DELETE statement [message #268179 is a reply to message #268178] Mon, 17 September 2007 19:46 Go to previous messageGo to next message
andrew again
Messages: 2577
Registered: March 2000
Senior Member
As our friend "ana" shows, using asciistr and unistr is a way to specify characters by their Unicode code-points.

Using a database characterset of AL32UTF8, you can see how a euro is stored.

--Euro is codepoint U+20AC
-- the old way to specify a Euro i. CHR(14844588) assuming you don't specify it litterally.
SQL> insert into x values (unistr('\20AC'));

1 row created.

SQL> select dump(col1) from x;

DUMP(COL1)
---------------------------------------------

Typ=1 Len=3: 226,130,172


-- asciistr() is the reverse of unistr()
SQL> select asciistr(col1) from x;

ASCIISTR(COL1)
-----------------------------------
\20AC
Re: Unicode support in the where clause of DELETE statement [message #268180 is a reply to message #268158] Mon, 17 September 2007 20:04 Go to previous messageGo to next message
sharan_it
Messages: 140
Registered: July 2005
Location: Chennai
Senior Member
Thanku Ana..

While i execute this query to retrieve taht particular record...

SELECT tag_group_code, DUMP(tag_group_code, 1010) dec_dump, DUMP(tag_group_code, 1016) hex_dump FROM occ_Tag_group
WHERE total_count = 23;

i have noticed its using still WE8ISO8859P1.
Re: Unicode support in the where clause of DELETE statement [message #268181 is a reply to message #268158] Mon, 17 September 2007 21:01 Go to previous message
BlackSwan
Messages: 25050
Registered: January 2009
Location: SoCal
Senior Member
>While i execute this query to retrieve taht particular record...
>SELECT tag_group_code, DUMP(tag_group_code, 1010) dec_dump, DUMP(tag_group_code, 1016) hex_dump FROM occ_Tag_group WHERE total_count = 23;

The sun rises in the East

>i have noticed its using still WE8ISO8859P1.

Objects was fall in the direction of down.


>>Honestly I am NOT sure what the actual problem is, how to reproduce it or know when it has been solved.

I still don't know what EXACT problem needs to be solved.
The problem seems to involve non-ASCII characters.
It remains to be seen where the problem is technical or other.

I do NOT know how to reproduce this illusive problem.

I would not recognize a solution unless or until somebody declares whatever the fix may be in a flashing font.

[Updated on: Mon, 17 September 2007 21:01] by Moderator

Report message to a moderator

Previous Topic: capturing milliseconds in oracle8
Next Topic: Request Size
Goto Forum:
  


Current Time: Sat Dec 10 16:34:25 CST 2016

Total time taken to generate the page: 0.12394 seconds