Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Database with different character set problems

Re: Database with different character set problems

From: Frank van Bortel <fvanbortel_at_netscape.net>
Date: Sat, 14 Feb 2004 21:11:03 +0100
Message-ID: <c0luom$rki$1@news3.tilbu1.nb.home.nl>


David wrote:
> yes, i can only retrieve part of the string. For example, assume in
> testing2, the whole record is ABCDE. I can only retrieve ABC with the
> "select * from testing_chinese_at_testing2" in database testing1. I think this
> is due to the character set conversion. This is because in testing2,
> character set is ZHT16BIG5, and each character is represented by 2 bytes (so
> 5 characters need 10 bytes storage). In testing1, character set is UTF8, and
> each chinese character is represented by 3 bytes. (so if 10 bytes, only
> store 3 characters) I don't know whether the above explanation is correct.
> And also want to know is it possible to retrieve all characters in testing1
> with the same SQL statement?
>
> THanks,
> David
>
>
> "Frank van Bortel" <fvanbortel_at_netscape.net> wrote in message
> news:c0je4r$tgm$1_at_news1.tilbu1.nb.home.nl...
>

>>David wrote:
>>
>>
>>>Hi all,
>>>
>>>  I have two database (Oracle 8.1.7 in Solaris platform), database

>
> testing1
>
>>>is UTF8 character set, database testing2 is ZHT16BIG5 character set.
>>>
>>>  When I create a database link from testing1 to testing2, and execute
>>>(testing_chinese is a table with varchar2(10)):
>>>
>>>select * from testing_chinese_at_testing1;
>>>
>>>   I found that only first three characters is selected in database
>>>testing1. I know that this is because in ZHT16BIG5, each character is
>>>represented by 2 byte. However, in UTF8, each character (chinese) is
>>>represented by 3 bytes. So only the first characters is selected. The

>
> above
>
>>>select statement is issued in SQL*Plus on Windows platform.
>>>
>>>  How can I retrieve all chinese characters in database testing1?
>>>
>>>  On the other hand, I found that if database testing1 is 9i (character

>
> set
>
>>>is UTF8), I can retrieve all characters from database testing2 (testing2

>
> is
>
>>>still 8.1.7, ZHT16BIG5). I don't understand why 9i can retrieve all
>>>characters. Is the UTF8 in 9i is different from UTF in 8i?
>>>
>>>
>>>Thanks,
>>>David
>>>
>>>
>>>
>>
>>AL32UTF8 is the preferred 9i one - did not exist in 8i.
>>So - yes they are different.
>>
>>You say the columns are varchar2(10) - I take it you mean
>>varchar2(10 BYTE), as you leave out the specification.
>>Use varchar2(10 CHAR) instead - it will ensure you can store
>>10 Characters, whether they are stored as double or triple
>>byte characters.
>>Within the limit of a varchar column, that is, i.e. 4000 bytes
>>max.
>>
>>BTW, what do you mean by the first 3 characters - I don't
>>quite understand the problem, other that that character sets
>>seems to be an issue of some sort.
>>Are you saying the client testing1 only retrieves a part
>>of the strings?
>>--
>>
>>Regards,
>>Frank van Bortel
>>

>
>
>

Sounds too daft to be true, but could it be:

- testing1 queries testing2 table: varcahr2(10 BYTE).
- testing1 retrieves all from the column: all 10 bytes
- as testing1 is in UTF8 mode, 3 characters (and a bit)

   fill up the 10 BYTES, as they get retrieved as 3-byte    characters.

Should not happen. Sounds like a major bug.

Test it with a different character set setting on the testing1 side. Remeber, testing1 is client as well as server: if you just use sql on the testing1 sode, you should see the same behaviour (connect to testing2, as do a select * from testing_chinese; )

I cannot reproduce this behaviour with a single-byte client, an AL32UTF8 (9.2) database and double byte characters.
But you already said so. Your problem was with 8i.

I think your only solution is to upgrade the 8i, as 8i is desupported by the end of this year. I do not think Oracle will solve such a problem in 8i when it does not occur in 9i

Having said all that, and rereading, I begin to doubt the cause and action here: you talk alot about database versions. Are you just retrieving them, or storing as well?

If you store them in another characterset database, it all makes sens - and I'll go back to the first reaction: define your table with: varchar2(10 CHAR), and it will work

-- 

Regards,
Frank van Bortel
Received on Sat Feb 14 2004 - 14:11:03 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US