From: jocave@my-deja.com
Newsgroups: comp.databases.oracle.server
Subject: Re: Oracle ODBC / UNICODE Non Conformance
Date: Tue, 10 Oct 2000 22:29:46 GMT
Organization: Deja.com - Before you buy.
Lines: 68
Message-ID: <8s058i$tqp$1@nnrp1.deja.com>
References: <8rt64a$fih$1@nnrp1.deja.com>
NNTP-Posting-Host: 205.227.43.12
X-Article-Creation-Date: Tue Oct 10 22:29:46 2000 GMT
X-Http-User-Agent: Mozilla/4.7 [en] (WinNT; U)
X-Http-Proxy: 1.0 x62.deja.com:80 (Squid/1.1.22) for client 205.227.43.12
X-MyDeja-Info: XMYDJUIDjocave


In article <8rt64a$fih$1@nnrp1.deja.com>,
  David Murphy <DavidMurphy@j-netdirect.com> wrote:
> Hi all - I have come across what appears to be different behaviour
> between the Oracle ODBC drivers and the MS ODBC drivers for MS SQL and
> Acccess.
>
> We now bind all input parameter and output column character data with
> SQL_C_WCHAR since we handle all char data as double byte.
>
> With SQL Server if the underlying column is CHAR (not NCHAR) the ODBC
> driver (or manager?) converts the data to single byte (since CHAR is
> single byte characters) and stores the data correctly.
> When reading the data back ODBC converts the data back to double byte
> and everything works correctly.
>
> Same with Access although Access has only one char type which I assume
> is alwasy double byte. IE it has no CHAR/NCHAR distinction.
>
> According to the ODBC doc the above behaviour is correct and managed
> automatically by ODBC (Driver + manager)
>
> Now with Oracle none of this works. If the underlying column is char
 we
> get every 2nd byte  0 and loose half the remaining string.
>
> When reading the data back the same problem.
>
> If the underlying column is NCHAR then we just get the error
 'character
> set mismatch' after SQLExecute.
>
> If anyone has any insight into these issues I would appreciate it..

I'll take a whack at it...

First, a column declared 'CHAR' or 'VARCHAR2' in Oracle may hold either
Unicode or ANSI data (or both if you were so inclined).  Thus, when you
do all the binding as SQL_C_WCHAR, our driver assumes you want to insert
the data in it's Unicode form.  The UCS-2 encoding we use for Unicode
data will use two-byte characters with the second byte 0x00 if the data
is ANSI.  If the database were set up as a UTF-8 database, the Unicode
data would be inserted as UTF-8 data (which is a different Unicode
encoding scheme that uses variable-length characters and should tend to
save on space).

One thing to be careful of here that declaring a column CHAR(1000)
indicates that you want to reserve 1000 *bytes* for the column.  If
you're inserting two-byte characters, you can only insert 500.  If
you're inserting variable-byte characters, you'll be able to insert a
variable number (somewhere between 333 and 1000 for UTF-8 data).

I believe that when Oracle 9i is released, it will allow you to declare
a column explicitly Unicode.  I haven't been following the developments
on this front terribly closely, so I don't know exactly what is coming
nor exactly how it will affect the ODBC driver.

I'd suggest that you change your program to bind the columns as
SQL_C_CHAR if you want ANSI data to be inserted.

Justin Cave
Oracle ODBC Development

Opinions expressed herein are my own and may not reflect those of Oracle
Corporation.


Sent via Deja.com http://www.deja.com/
Before you buy.