Relations and correlations
Date: 14 May 2008 09:44:46 GMT
DBMS_STATS package in version 11g includes "CREATE_EXTENDED_STATS" function. This package creates an "extension", consisting of group of columns, for which cardinality and distribution are calculated separately. This is probably the 1st and the last place to note that I absolutely abhor the fact that it is a function and not a procedure. Other then that, it is a great tool for handling correlated columns, like the famous Jonathan's example of months and zodiac signs. Below is a little script to test the new feature:
set trimout on
set trimspool on
drop table test_stats purge;
CREATE TABLE TEST_STATS (
A NUMBER(12,0), B NUMBER(12,0), C VARCHAR2(32)
select to_number(to_char(sysdate,'J')) into sd from dual; dbms_random.seed(sd);
for sd in 1..100000
insert into test_stats values(i,i,t);
select dbms_stats.create_extended_stats('SCOTT','TEST_STATS','(A,B)') into ext
dbms_output.put_line('Created extension is:'||ext); end;
prompt 'Analyzing table...'
analyze table test_stats compute statistics for all columns size 16; column column_name format a32
select column_name,count(*) endpoints from user_histograms where table_name='TEST_STATS'
group by column_name
order by column_name;
After the script is finished, you should see the following result:
Created extension is:SYS_STUNA$6DVXJXTP05EH56DTIR0X
PL/SQL procedure successfully completed.
COLUMN_NAME ENDPOINTS -------------------------------- ---------- A 17 B 17 C 17 SYS_STUNA$6DVXJXTP05EH56DTIR0X 17
Disconnected from Oracle Database 11g Enterprise Edition Release 126.96.36.199.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Not only that, but a closer examination detects that DBMS_METADATA package now returns the following as the DDL for the table:
CREATE TABLE "SCOTT"."TEST_STATS"
( "SYS_STUNA$6DVXJXTP05EH56DTIR0X" NUMBER GENERATED ALWAYS AS (SYS_OP_COMBINED_HASH("A","B")) VIRTUAL HIDDEN ,
) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "USERS" ; There is a virtual and hidden column added to our table, as hash of columns A and B. Mere "DESCRIBE" will not show the virtual column and neither will USER_TAB_COLUMNS.
Virtual columns, with "GENERATED ALWAYS" option are the new feature of Oracle11g. Virtual columns are described here:http://tinyurl.com/44wm46
I found out that generating extensions for the columns that are frequently used together in joins can help with getting a better plan. I cannot give an example because of the NDA. The database, after all, belongs to my employer and not to me.
-- Mladen Gogala http://mgogala.freehostia.comReceived on Wed May 14 2008 - 04:44:46 CDT