Migration from WE8ISO8859P1 to AL32UTF8: Switch NLS_LENGTH_SEMANTICS from BYTE to CHAR?
Date: Mon, 18 Aug 2008 11:48:32 +0200
Should we switch NLS_LENGTH_SEMANTICS from BYTE to CHAR when migrating the database character set from the single-byte WE8ISO8859P1 to the multi-byte AL32UTF8?
We have 40 customers running the same 5M line PL/SQL application on Oracle 10.2.0.4. Their DBs range in size from 200 GB to 2 TB. We want to support Chinese, Arabic and Russian in the next release. We plan to switch the national character set from WE8ISO8859P1 to AL32UTF8 rather than use the National Character set (NVARCHAR/NCLOB) because of the large PL/SQL code base (not using %TYPE/%ROWTYPE in generic libraries) and because usage of the additional characters is not limited to a few columns only.
We are now wondering whether we should also migrate the NLS_LENGTH_SEMANTICS from BYTE to CHAR to avoid the problems with a varchar2(n) not being able to store n characters.
At first sight, NLS_LENGTH_SEMANTICS=CHAR makes a lot more sense
together with a multi-byte character set like AL32UTF8. However,
* As pointed out by Metalink Note 144808.1, there are many
limitations (e.g., maximum length of varchar2 columns remains 4,000/32,767 bytes rather than characters, SYS schema objects must have BYTE semantics) and bugs.
* The migration is painful. Whereas Oracle helps with the change
of the character set through csscan/csalter (still a lot of manual work exporting and reimporting tables with non-convertible columns), the migration of NLS_LENGTH_SEMANTICS requires a lot of manual work (scriptable, but not provided by Oracle) as described in Metalink Notes 313175.1 and 330964.1.
* It is not clear, whether Oracle really recommends
NLS_LENGTH_SEMANTICS=CHAR as the way to go. For example, http://www.oracle.com/technology/tech/globalization/pdf/TWP_Character_Se t_Migration_Best_Practices_10gR2.pdf does not talk about NLS_LENGTH_SEMANTICS and Metalink Note 144808.1 states that Oracle Text and E-Business Suite don't even support NLS_LENGTH_SEMANTICS. (http://www.oracle.com/technology/tech/globalization/pdf/TWP_NCHAR_MIGRA TION_10gR2.pdf talks briefly about the advantages of character length semantics, but only in the context of National Character set. The Database Globalization Support Guide
(http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14225/toc .htm) doesn't contain much information either.)
Hence my questions to the oracle-l community:
* Do you recommend switching the NLS_LENGTH_SEMANTICS from BYTE to
* Are there any tools/scripts, that make the switch easier than
the few fragments referenced above (we use CLOBs, partitioning, function-based indexes and content in source$ that would suffer from trunctation)?
* Any indications that Oracle sees NLS_LENGTH_SEMANTICS=CHAR as
the way to go and will provide better support (e.g., automatic conversion in 11gR2 and support for 4,000/32,767 characters :)?
* Anybody running SAP, Siebel, Oracle Applications or another
large application with NLS_LENGTH_SEMANTICS=CHAR?
MartinReceived on Mon Aug 18 2008 - 04:48:32 CDT