Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.misc -> Re: NLS_LANG setting for 10G client v. 8.1.7 client?

Re: NLS_LANG setting for 10G client v. 8.1.7 client?

From: Andy Hassall <andy_at_andyh.co.uk>
Date: Wed, 06 Apr 2005 22:50:55 +0100
Message-ID: <ftl8511a9lucl7hmph5fuqh9cjsdmnbu69@4ax.com>


On 6 Apr 2005 11:33:09 -0700, "Script Me" <script_me_at_yahoo.com> wrote:

>Help. [cross posted to www.servers.unix.*]
>
>NOTE: the oracle client on my new webserver appears to be the 10G
>client. on the old, it is 8.1.7. the actual db server is 10G.
>
>Problem: My dba is away and we are having a problem with user entered
>text that was cut and pasted from a word doc into a web form then

               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

>submitted to webserver, and finally inserted into oracle table. All
>fine and well so far.

 Trouble brewing!

>On display some special extended characterset
>characters appear as ? like the emdash and single quotes. On apache
>1.3.x we set the NLS_LANG like this:
>
>SetEnv NLS_LANG american_america.WE8ISO8859P1

 ISO-8859-1 contains neither emdash nor "left" and "right" single quotes.

>Recently we moved to apache 2 and set the same NLS_SETTING in Apache
>2's httpd.conf like this:
>
>SetEnv NLS_LANG american_america.WE8ISO8859P1
>
>But the problem persists under Apache 2.
>
>Is there some issue with SetEnv in apache 2 or am I missing a place
>where this variable needs to be set?
>
>When i say print $_SERVER["NLS_LANG"] in php i get:
>
>american_america.WE8ISO8859P1
>
>So it appears to be set...
>
>Any help would be MUCH appreciated!!

 That's too late in the chain. The purpose of NLS_LANG is to inform Oracle of the encoding that the client is using; Oracle trusts the client that all data passed in is in this encoding, and in return encodes any data received from the database in this encoding.

 Since ISO-8859-1 doesn't contain emdash, you cannot possibly be using the correct code for this character, so it's corruption the moment it hits the Oracle client.

 Since your database is in ISO-8859-1 it cannot store emdash either (unless you use the nvarchar2 types instead of varchar2 and so end up in Unicode), so the option of using UTF-8 or similar for your NLS_LANG is out.

 You need to:

(a) Be absolutely sure what encoding you're receiving from the web form. If you're receiving what you think is emdash from copied Word documents you're probably getting Windows Codepage 1252 - it's not ISO-8859-1. Did you send a Content-type header with a charset attribute? (Did the browser take any notice?)

(b) Only actually send ISO-8859-1 to Oracle, since that's what you've said you're going to send. This may involve you converting unrepresented characters, such as emdash into normal dashes, left+right quotes into single quotes etc.

 If you change NLS_LANG to the appropriate setting for Windows codepage 1252 (WE8MSWIN something) to try and get Oracle to do that for you, then Oracle will probably still (entirely correctly) use the substitution character for emdash, since it doesn't exist in ISO-8859-1.

-- 
Andy Hassall / <andy@andyh.co.uk> / <http://www.andyh.co.uk>
<http://www.andyhsoftware.co.uk/space> Space: disk usage analysis tool
Received on Wed Apr 06 2005 - 16:50:55 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US