RE: character set confusion

From: Peter McLarty <p.mclarty_at_cqu.edu.au>
Date: Wed, 18 Jul 2007 08:43:08 +1000
Message-ID: <27AA2E9CA7A0C44283BC1E9B00086AA907427E53@UNIMAIL.staff.ad.cqu.edu.au>

Be very careful, and definitely run csscan. I worked with a customer in Thailand that had an application whose front end was COBOL. The database had US7ASCII as its character set and, for whatever reason, the COBOL had allowed the use of Thai characters, which had become stored in the database. Oracle was in effect storing garbage that only the COBOL application could extract, and it was going to take some work to extract that data. It became apparent when the people were testing, ran a couple of queries in sqlplus, and got garbage in product descriptions or detail lines. This data couldn't go into a data warehouse, and in fact it was useless in that database to anything except the COBOL application.
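To make that failure mode concrete, here is a minimal Python sketch. It assumes the front end wrote Thai text as TIS-620 bytes; the post does not say which Thai encoding was actually in use, and the sample string and function names are made up for illustration. It shows why a client that trusts the declared 7-bit character set sees garbage, and how values containing bytes outside the 7-bit range might be flagged. It is not a substitute for csscan.

```python
from typing import List

# Assumption: the application wrote Thai text as TIS-620 bytes into a
# database declared as 7-bit ASCII. The real encoding is not known.
SAMPLE_DESCRIPTION = "สินค้า"  # hypothetical product description


def stored_bytes(text: str) -> bytes:
    """Bytes the front end would hand to the database, unconverted."""
    return text.encode("tis_620")


def read_trusting_declared_charset(raw: bytes) -> str:
    """What a client sees if it believes the data really is 7-bit ASCII."""
    return raw.decode("ascii", errors="replace")


def non_ascii_offsets(raw: bytes) -> List[int]:
    """Offsets of bytes that cannot be valid in a 7-bit character set."""
    return [i for i, b in enumerate(raw) if b > 0x7F]


if __name__ == "__main__":
    raw = stored_bytes(SAMPLE_DESCRIPTION)
    print("raw bytes:        ", raw.hex(" "))
    print("read as ASCII:    ", read_trusting_declared_charset(raw))
    print("read as TIS-620:  ", raw.decode("tis_620"))
    print("suspect offsets:  ", non_ascii_offsets(raw))
```

Only a reader that knows the real encoding, as the original application did, gets the text back; everything else sees replacement characters.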

The only solution I could see in this case would have been to extract that data using the COBOL application, create a new database with a suitable character set, and load the data back into it.
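The conversion step of such an extract-and-reload could be sketched roughly as follows. This assumes the raw bytes can be pulled out unconverted and that they are really TIS-620, which is a guess; the target encoding, constants, and function names are likewise illustrative only. Decoding is strict so that rows which do not fit the assumed encoding are set aside for inspection rather than silently mangled.

```python
from typing import Iterable, List, Tuple

SOURCE_ENCODING = "tis_620"  # assumption: what the application actually wrote
TARGET_ENCODING = "utf-8"    # assumption: encoding used for the reload


def recode(raw: bytes) -> bytes:
    """Decode with the assumed real encoding, then re-encode for the new database."""
    return raw.decode(SOURCE_ENCODING).encode(TARGET_ENCODING)


def recode_rows(rows: Iterable[bytes]) -> Tuple[List[bytes], List[bytes]]:
    """Split rows into (converted, rejected); rejected rows keep their raw bytes."""
    converted: List[bytes] = []
    rejected: List[bytes] = []
    for raw in rows:
        try:
            converted.append(recode(raw))
        except UnicodeDecodeError:
            rejected.append(raw)
    return converted, rejected


if __name__ == "__main__":
    extracted = ["สินค้า".encode(SOURCE_ENCODING), b"WIDGET 10MM"]
    good, bad = recode_rows(extracted)
    for row in good:
        print(row.decode(TARGET_ENCODING))
    print("rows needing manual review:", len(bad))
```

Plain 7-bit values pass through unchanged, so only the rows that actually contain Thai text grow in byte length after the reload.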

I only hope it's since been fixed, but I doubt it.

Cheers

Peter

From: Robyn [mailto:robyn.sands_at_gmail.com]
Sent: Wednesday, 18 July 2007 07:07 AM
To: Mark.Bobak_at_il.proquest.com
Cc: mark.powell_at_eds.com; oracle-l
Subject: Re: character set confusion

Hello Marks ...

I appreciate both your answers. UTF8 is being forced upon us by several third-party applications, and I doubt we would have the option to change the character set unless we were willing to accept 'unsupported' status. So we will have at least three different systems on UTF8 in the very near future, obsolete or not.

My thinking is very similar to Mark B's thoughts - there are certain characters that are not going to store in our data warehouse eventually if we don't convert the character set to something more internationally friendly. I plan on running csscan but ... we have global manufacturing entering data in our systems, so even if there are no issues now it seems logical that we would eventually hit something that couldn't be stored in the warehouse. Which is why I don't understand Oracle's answer to my question.

And that brings me back to Mark P's input. Since the warehouse is the one place I can choose my character set, it seems that the best approach would be to convert the warehouse to AL32UTF8. Then we'd be prepared for the current requirements and for the first vendor we introduce that uses the newer standard.

thank you both ... Robyn

On 7/17/07, Bobak, Mark <Mark.Bobak_at_il.proquest.com> wrote:

Hmm...well, as I said, "I'm no expert on this subject..." so I'll be quiet now and see if more knowledgeable people have anything to say...;-)

	--
	Mark J. Bobak
	Senior Database Administrator, System & Product Technologies
	ProQuest
	789 E. Eisenhower Parkway, P.O. Box 1346
	Ann Arbor MI 48106-1346
	734.997.4059  or 800.521.0600 x 4059
	mark.bobak_at_il.proquest.com <mailto:mark.bobak_at_il.proquest.com> 
	www.proquest.com <http://www.proquest.com> 
	www.csa.com <http://www.csa.com> 
	
	ProQuest...Start here. 

	 

	From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Powell, Mark D
	Sent: Tuesday, July 17, 2007 2:11 PM
	To: oracle-l
	Subject: RE: character set confusion

	I thought UTF8 should be considered obsolete as it is not guaranteed to match the emerging standard and that AL32UTF8 was its replacement.
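The practical difference between the two shows up with supplementary characters (code points above U+FFFF). Oracle's documentation describes the older UTF8 character set as CESU-8 compliant, meaning such a character is stored as a surrogate pair of two 3-byte sequences, while AL32UTF8 uses the standard 4-byte UTF-8 form. The Python sketch below imitates both encodings to show the byte-level difference. `cesu8_encode` is a hypothetical helper written for this illustration, not an Oracle or Python API, and the sample character is arbitrary.

```python
def cesu8_encode(text: str) -> bytes:
    """CESU-8 style encoding: supplementary characters become two
    3-byte sequences, one per UTF-16 surrogate."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Split into a UTF-16 surrogate pair, then encode each half.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            for surrogate in (high, low):
                out += chr(surrogate).encode("utf-8", errors="surrogatepass")
        else:
            out += ch.encode("utf-8")
    return bytes(out)


def standard_utf8_encode(text: str) -> bytes:
    """Standard UTF-8, the form AL32UTF8 is described as using."""
    return text.encode("utf-8")


if __name__ == "__main__":
    for sample in ("A", "ส", "\U0001F600"):
        print(
            f"U+{ord(sample):04X}",
            "standard:", standard_utf8_encode(sample).hex(" "),
            "| CESU-8 style:", cesu8_encode(sample).hex(" "),
        )
```

For characters in the Basic Multilingual Plane the two forms are byte-for-byte identical, which is why the difference tends to go unnoticed until supplementary characters appear in the data.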

Mark D Powell -- Phone (313) 592-5148

--
http://www.freelists.org/webpage/oracle-l

Received on Tue Jul 17 2007 - 17:43:08 CDT