Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Oracle ODBC / UNICODE Non Conformance

Re: Oracle ODBC / UNICODE Non Conformance

From: David Murphy <DavidMurphy_at_j-netdirect.com>
Date: Wed, 11 Oct 2000 00:08:32 GMT
Message-ID: <8s0b1s$2kt$1@nnrp1.deja.com>

Joe - thanks for the explanation - this is good info.

For simplicity, let's assume that all character data handled by the app is Unicode double byte. The app always needs to deal with non-ANSI data, therefore all binding must be done with SQL_C_WCHAR.

We need to insert into columns (either CHAR or NCHAR) using SQLBindParameter. We need to read data (CHAR & NCHAR) using SQLBindCol. Trouble is, we don't know the column type before the insert, since the underlying db column types are unknown to us due to the nature of the app.

  1. My first question is: what is the difference between NCHAR and CHAR in Oracle? I assumed NCHAR was the only one allowing double-byte characters such as Unicode.
  2. I assume the ODBC driver hides the encoding of character data in the database from our app - is that correct? So what is the format (encoding?) of character data that our app must provide when we bind (params or columns) with SQL_C_WCHAR? I assume the UCS-2 format you described - correct?
  3. I guess, in summary, what we want is to write char data bound with SQL_C_WCHAR and read it back bound with SQL_C_WCHAR and get the same data back, no matter whether the underlying column is CHAR or NCHAR, and no matter the database encoding scheme. Is this possible?
  4. What should the SQL column type be in the SQLBindParameter call for CHAR and NCHAR? SQL_CHAR or SQL_UNICODE_CHAR?
  5. What does 'character set mismatch' mean when trying to update NCHAR data?
  6. On an unrelated matter: can we tell clients to use the latest ODBC driver for 8.1.5 on an 8.0.x database, since it fixed a problem we needed fixed for them?

Thanks again
David M
NetDirect

In article <8s058i$tqp$1_at_nnrp1.deja.com>, jocave_at_my-deja.com wrote:
> In article <8rt64a$fih$1_at_nnrp1.deja.com>,
> David Murphy <DavidMurphy_at_j-netdirect.com> wrote:
> > Hi all - I have come across what appears to be different behaviour
> > between the Oracle ODBC drivers and the MS ODBC drivers for MS SQL
> > and Access.
> >
> > We now bind all input parameter and output column character data
> > with SQL_C_WCHAR since we handle all char data as double byte.
> >
> > With SQL Server, if the underlying column is CHAR (not NCHAR), the
> > ODBC driver (or manager?) converts the data to single byte (since
> > CHAR is single-byte characters) and stores the data correctly.
> > When reading the data back, ODBC converts the data back to double
> > byte and everything works correctly.
> >
> > Same with Access, although Access has only one char type, which I
> > assume is always double byte, i.e. it has no CHAR/NCHAR distinction.
> >
> > According to the ODBC doc the above behaviour is correct and managed
> > automatically by ODBC (Driver + manager)
> >
> > Now with Oracle none of this works. If the underlying column is CHAR,
> > we get every 2nd byte 0 and lose half the remaining string.
> >
> > When reading the data back the same problem.
> >
> > If the underlying column is NCHAR then we just get the error
> > 'character set mismatch' after SQLExecute.
> >
> > If anyone has any insight into these issues I would appreciate it..
>
> I'll take a whack at it...
>
> First, a column declared 'CHAR' or 'VARCHAR2' in Oracle may hold
> either Unicode or ANSI data (or both, if you were so inclined). Thus,
> when you do all the binding as SQL_C_WCHAR, our driver assumes you
> want to insert the data in its Unicode form. The UCS-2 encoding we
> use for Unicode data will use two-byte characters with the second
> byte 0x00 if the data is ANSI. If the database were set up as a UTF-8
> database, the Unicode data would be inserted as UTF-8 data (which is
> a different Unicode encoding scheme that uses variable-length
> characters and should tend to save on space).
>
> One thing to be careful of here is that declaring a column CHAR(1000)
> indicates that you want to reserve 1000 *bytes* for the column. If
> you're inserting two-byte characters, you can only insert 500. If
> you're inserting variable-byte characters, you'll be able to insert a
> variable number (somewhere between 333 and 1000 for UTF-8 data).
>
> I believe that when Oracle 9i is released, it will allow you to
> declare a column explicitly Unicode. I haven't been following the
> developments on this front terribly closely, so I don't know exactly
> what is coming nor exactly how it will affect the ODBC driver.
>
> I'd suggest that you change your program to bind the columns as
> SQL_C_CHAR if you want ANSI data to be inserted.
>
> Justin Cave
> Oracle ODBC Development
>
> Opinions expressed herein are my own and may not reflect those of
> Oracle Corporation.
>
> Sent via Deja.com http://www.deja.com/
> Before you buy.
>

Received on Tue Oct 10 2000 - 19:08:32 CDT

