From oracle-l-bounce@freelists.org Tue Jan 18 21:50:53 2005 Return-Path: Received: from air891.startdedicated.com (root@localhost) by orafaq.com (8.12.10/8.12.10) with ESMTP id j0J3orCO016502 for ; Tue, 18 Jan 2005 21:50:53 -0600 X-ClientAddr: 206.53.239.180 Received: from turing.freelists.org (freelists-180.iquest.net [206.53.239.180]) by air891.startdedicated.com (8.12.10/8.12.10) with ESMTP id j0J3oraD016498 for ; Tue, 18 Jan 2005 21:50:53 -0600 Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 769DE72C4F4; Tue, 18 Jan 2005 14:06:33 -0500 (EST) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 02185-62; Tue, 18 Jan 2005 14:06:33 -0500 (EST) Received: from turing (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id E72BE72CF3D; Tue, 18 Jan 2005 14:02:25 -0500 (EST) X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1 content-class: urn:content-classes:message MIME-Version: 1.0 Content-type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit Subject: RE: Of Character Sets, Performance and Storage... Date: Tue, 18 Jan 2005 11:55:03 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Of Character Sets, Performance and Storage... Thread-Index: AcT9ePxdxXhsZekdSL2xWg/4rdveaAAABFygAASLwXAAAFwywAAAjnfg From: "Orr, Steve" To: "Orr, Steve" , "Justin Cave (DDBC)" , "oracle-l" X-archive-position: 14891 X-ecartis-version: Ecartis v1.0.0 Sender: oracle-l-bounce@freelists.org Errors-To: oracle-l-bounce@freelists.org X-original-sender: sorr@rightnow.com Precedence: normal Reply-To: sorr@rightnow.com X-list: oracle-l X-Virus-Scanned: by amavisd-new at freelists.org X-Spam-Checker-Version: SpamAssassin 2.60 (1.212-2003-09-23-exp) on air891.startdedicated.com X-Spam-Status: No, hits=0.0 required=5.0 tests=none autolearn=ham version=2.60 X-Spam-Level: Nevermind... In a state of post send cognition... -----Original Message----- From: Orr, Steve=20 Sent: Tuesday, January 18, 2005 11:44 AM To: 'Justin Cave (DDBC)'; oracle-l Subject: RE: Of Character Sets, Performance and Storage... What about tables (or table partitions) and indexes created with the compress option? Any known effects? I'm wondering if anyone has already looked into this but I'll probably test this myself in a personal attempt to discover truth.=20 Steve Orr -----Original Message----- From: Justin Cave (DDBC) [mailto:jcave@ddbcinc.com]=20 Sent: Tuesday, January 18, 2005 11:32 AM To: Orr, Steve; oracle-l Subject: RE: Of Character Sets, Performance and Storage... You will not see any difference in performance or storage using US7ASCII rather than WE8ISO8859P1. Both are single-byte character sets. If the debate was between storing data in something like AL32UTF8 or WE8ISO8859P1, there would be some differences. Since UTF-8 is a variable length character sets (English characters take 1 byte, European characters generally take 2 bytes, Asian characters generally take 3 bytes), it is relatively space efficient when you have mostly English and European data. On the other hand, searching the data is a little less efficient, because Oracle has to spend cycles figuring out where character boundaries are as it processes each row. If you do a lot of substring searches in data, this can be a noticeable performance impact. Justin Cave Distributed Database Consulting, Inc. http://www.ddbcinc.com/askDDBC -----Original Message----- From: oracle-l-bounce@freelists.org [mailto:oracle-l-bounce@freelists.org] On Behalf Of Orr, Steve Sent: Tuesday, January 18, 2005 12:01 PM To: oracle-l Subject: Of Character Sets, Performance and Storage... I've got this data warehouse with an nls_characterset of WE8ISO8859P1 and an nls_nchar_characterset of AL16UTF16 but all the stored data is only US-ASCII. (The data is from Apache logs capturing tons of web application activity much of which comes from URI's which by W3C definition is US-ASCII.) The Oracle9i Globalization manual states: "Single-byte character sets result in better performance than multi-byte character sets, and they are also more efficient in terms of space requirements." So... The obvious question is, How much more efficient and better performing are they in actual practice? Also, I'm thinking US7ASCII and WE8ISO8859P1 are both single-byte and possess equivalent storage requirements and performance characteristics even though one is 7 bit and the other is 8 bit. Is this true? The current DW is not too big to recreate in the downsized US7ASCII character set but is it even worth it? The DW will be getting quite large so it's important now to address the bits and bytes storage and performance concerns. Of Mice and DBA's, Steve Orr, Bozeman, Montana -- http://www.freelists.org/webpage/oracle-l -- http://www.freelists.org/webpage/oracle-l