From oracle-l-bounce@freelists.org Fri May 6 10:23:20 2005 Return-Path: Received: from air891.startdedicated.com (root@localhost) by orafaq.com (8.12.10/8.12.10) with ESMTP id j46FNH33018202 for ; Fri, 6 May 2005 10:23:17 -0500 X-ClientAddr: 206.53.239.180 Received: from turing.freelists.org (freelists-180.iquest.net [206.53.239.180]) by air891.startdedicated.com (8.12.10/8.12.10) with ESMTP id j46FNH4Z018198 for ; Fri, 6 May 2005 10:23:17 -0500 Received: from localhost (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id AE3E4190E7C; Fri, 6 May 2005 09:20:40 -0500 (EST) Received: from turing.freelists.org ([127.0.0.1]) by localhost (turing [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 29276-01; Fri, 6 May 2005 09:20:40 -0500 (EST) Received: from turing (localhost [127.0.0.1]) by turing.freelists.org (Avenir Technologies Mail Multiplex) with ESMTP id 25509190E47; Fri, 6 May 2005 09:20:40 -0500 (EST) Message-ID: <4127F81F6CAFC245A18BC49054EFB06301933137@ssslexchusr6.sssl.bskyb.com> From: "Barr, Stephen" To: oracle-l@freelists.org Subject: Statistics madness Date: Fri, 6 May 2005 15:18:10 +0100 MIME-Version: 1.0 Content-type: text/plain Content-Transfer-Encoding: 8bit X-archive-position: 19398 X-ecartis-version: Ecartis v1.0.0 Sender: oracle-l-bounce@freelists.org Errors-To: oracle-l-bounce@freelists.org X-original-sender: Stephen.Barr@BSkyB.com Precedence: normal Reply-To: Stephen.Barr@BSkyB.com X-list: oracle-l X-Virus-Scanned: by amavisd-new-20030616-p9 (Debian) at avenirtech.net X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on air891.startdedicated.com X-Spam-Status: No, hits=0.2 required=5.0 tests=AWL,LINES_OF_YELLING, LINES_OF_YELLING_2,UPPERCASE_25_50 autolearn=no version=2.63 Oracle 10.1.0.2.0 64bit Solaris 8 E10K I'm having trouble getting the formula's in Wolfgang's document to match up to the actual values I'm seeing returned by explain plans...specifically the formula's involving the high and low values. What am I doing wrong here? PERF_TEST@>desc skew Name Null? Type ----------------------------------------------- -------- -------------- COL_DATE NOT NULL DATE X VARCHAR2(20) PERF_TEST@>select num_rows 2 from user_tables 3 where table_name = 'SKEW' 4 / NUM_ROWS ---------- 4000743 TABLE_NAME : SKEW COLUMN_NAME : COL_DATE DATA_TYPE : DATE DATA_TYPE_MOD : DATA_TYPE_OWNER : DATA_LENGTH : 7 DATA_PRECISION : DATA_SCALE : NULLABLE : N COLUMN_ID : 1 DEFAULT_LENGTH : DATA_DEFAULT : NUM_DISTINCT : 3966 LOW_VALUE : 77C20519010101 HIGH_VALUE : 78690505010101 DENSITY : .000252143217347453 NUM_NULLS : 0 NUM_BUCKETS : 1 LAST_ANALYZED : 06-may-2005 15:06:23 SAMPLE_SIZE : 5111 CHARACTER_SET_NAME : CHAR_COL_DECL_LENGTH : GLOBAL_STATS : YES USER_STATS : NO AVG_COL_LEN : 8 CHAR_LENGTH : 0 CHAR_USED : V80_FMT_IMAGE : NO DATA_UPGRADED : YES HISTOGRAM : NONE Based on Wolfgang's document oracle will use the density of the column when deciding the cardinality of an equality predicate - which works fine... 1 explain plan for 2 select count(*) 3 from skew 4* where col_date = to_date('08-SEP-9999','DD-MON-YYYY') PERF_TEST@>/ Explained. Elapsed: 00:00:00.06 PERF_TEST@>select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------- ----- Plan hash value: 2835825757 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 8 | 1371 (22)| 00:00:20 | | 1 | SORT AGGREGATE | | 1 | 8 | | | |* 2 | TABLE ACCESS FULL| SKEW | 1024 | 8192 | 1371 (22)| 00:00:20 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("COL_DATE"=TO_DATE('9999-09-08 00:00:00', 'yyyy-mm-dd hh24:mi:ss')) 15 rows selected. So here it just uses the density of the column to calculate the cardinality - num_rows * density. But if we change the query to a ">" predicate it should use the high & low value on the table right? i.e. (Hi - value) / (Hi - Lo) PERF_TEST@>select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------- Plan hash value: 2835825757 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 8 | 1371 (22)| 00:00:20 | | 1 | SORT AGGREGATE | | 1 | 8 | | | |* 2 | TABLE ACCESS FULL| SKEW | 1024 | 8192 | 1371 (22)| 00:00:20 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("COL_DATE">TO_DATE('9999-09-08 00:00:00', 'yyyy-mm-dd hh24:mi:ss')) 15 rows selected. But it doesn't seem to be doing this - it seems to be just applying the density again as in the equality predicate.....what am I missing? Thanks in advance! Steve. ----------------------------------------- Information in this email may be privileged, confidential and is intended exclusively for the addressee. The views expressed may not be official policy, but the personal views of the originator. If you have received it in error, please notify the sender by return e-mail and delete it from your system. You should not reproduce, distribute, store, retransmit, use or disclose its contents to anyone. Please note we reserve the right to monitor all e-mail communication through our internal and external networks. -- http://www.freelists.org/webpage/oracle-l