Subject: RE: character set confusion
Date: Wed, 18 Jul 2007 08:43:08 +1000
From: "Peter McLarty" <p.mclarty@cqu.edu.au>
Cc: "oracle-l"

Be very careful and definitely run csscan.

I worked with a customer in Thailand whose application had a COBOL front end. The database character set was US7ASCII, and for whatever reason the COBOL had allowed Thai characters to be entered, which then ended up stored in the database. In effect Oracle was storing garbage that only the COBOL application could read back, and it was going to take some work to extract that data. It became apparent during testing, when people ran a couple of queries in sqlplus and got garbage in product descriptions and detail lines. This data couldn't go into a data warehouse; in fact it was useless in that database to anything except the COBOL application.

The only solution I could see in this case would have been to extract the data using the COBOL application, create a new database with a suitable character set, and load the data back into it.

I only hope it's since been fixed, but I doubt it.

Cheers

Peter

________________________________
From: Robyn [mailto:robyn.sands@gmail.com]
Sent: Wednesday, 18 July 2007 07:07 AM
To: Mark.Bobak@il.proquest.com
Cc: mark.powell@eds.com; oracle-l
Subject: Re: character set confusion

Hello Marks ...

I appreciate both your answers. UTF8 is being forced upon us by several third-party applications, and I doubt we would have the option to change the character set unless we were willing to accept 'unsupported' status. So we will have at least three different systems on UTF8 in the very near future, obsolete or not.

My thinking is very similar to Mark B's thoughts: there are certain characters that eventually will not store in our data warehouse if we don't convert the character set to something more internationally friendly. I plan on running csscan but ... we have global manufacturing entering data in our systems, so even if there are no issues now, it seems logical that we would eventually hit something that couldn't be stored in the warehouse. Which is why I don't understand Oracle's answer to my question.

And that brings me back to Mark P's input. Since the warehouse is the one place I can choose my character set, it seems that the best approach would be to convert the warehouse to AL32UTF8. Then we'd be prepared for the current requirements and for the first vendor we introduce that uses the newer standard.

thank you both ...

Robyn

On 7/17/07, Bobak, Mark <Mark.Bobak@il.proquest.com> wrote:

Hmm...well, as I said, "I'm no expert on this subject..." so I'll be quiet now and see if more knowledgeable people have anything to say...;-)

--
Mark J. Bobak
Senior Database Administrator, System & Product Technologies
ProQuest
789 E. Eisenhower Parkway, P.O. Box 1346
Ann Arbor MI 48106-1346
734.997.4059 or 800.521.0600 x 4059
mark.bobak@il.proquest.com
www.proquest.com
www.csa.com

ProQuest...Start here.

From: oracle-l-bounce@freelists.org [mailto:oracle-l-bounce@freelists.org] On Behalf Of Powell, Mark D
Sent: Tuesday, July 17, 2007 2:11 PM
To: oracle-l
Subject: RE: character set confusion

I thought UTF8 should be considered obsolete, as it is not guaranteed to match the emerging standard, and that AL32UTF8 was its replacement.

-- Mark D Powell --
Phone (313) 592-5148
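
The pass-through problem Peter describes at the top of this message can be sketched in a few lines of Python. This is only an illustration of the general mechanism, not Oracle code: it assumes the COBOL front end wrote single-byte TIS-620 Thai (the actual encoding is not stated in the thread), and the function names are hypothetical stand-ins for "the application that wrote the data" and "any other client that trusts the declared 7-bit character set".

```python
# Sketch only: simulates bytes stored without conversion, then read two ways.
# Assumption: the front end used TIS-620; the thread does not say which
# Thai encoding was really involved.
THAI_CODEC = "tis_620"


def store_without_conversion(text: str) -> bytes:
    """Mimic a client that writes its native bytes straight into a column."""
    return text.encode(THAI_CODEC)


def read_as_7bit_ascii(raw: bytes) -> str:
    """Mimic a client that believes the data really is 7-bit ASCII."""
    # Every byte above 0x7F is invalid ASCII and becomes U+FFFD.
    return raw.decode("ascii", errors="replace")


def read_like_original_app(raw: bytes) -> str:
    """Mimic the one application that knows how the bytes were written."""
    return raw.decode(THAI_CODEC)


# Six Thai code points (a short Thai word), written with escapes so the
# example does not depend on the encoding of this message.
original = "\u0e2a\u0e34\u0e19\u0e04\u0e49\u0e32"
stored = store_without_conversion(original)

high_bytes = sum(1 for b in stored if b > 0x7F)   # bytes outside 7-bit ASCII
as_ascii = read_as_7bit_ascii(stored)             # what other clients see
round_trip = read_like_original_app(stored)       # what the writer sees

print(len(stored), high_bytes)
print(as_ascii == "\ufffd" * len(stored))
print(round_trip == original)
```

The point of the sketch is that nothing is lost at the byte level, which is why the writing application can still read its own data, while every client that honours the declared character set sees garbage. It also suggests why the extract-and-reload route needs something that knows the real encoding of the stored bytes.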
--
http://www.freelists.org/webpage/oracle-l