Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Data Skew

Re: Data Skew

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Sun, 17 Apr 2005 10:22:39 +0000 (UTC)
Message-ID: <d3tddf$4pn$1@hercules.btinternet.com>


"tq" <tomm_quinn_at_yahoo.com> wrote in message news:1113596932.125097.27120_at_l41g2000cwc.googlegroups.com...
>I have the following situation -
>
> Table A has 10 rows and the primary key is an sequence generated id.
> The only other column is a 'description' field.
>
> Table B has 10 million records and has an indexed foreign key pointing
> to the table A pk.
>
> The foreign key values in the table B are massively skewed. 95% of the
> records have a single value. And any queries against these tables can
> only use the 'description' column in Table A to limit the records
> produced in the join to table B.
>
> So histograms on the fk are not effective since the limit is on the
> description column. How can I get the optimizer to know the optimal
> access strategy for table B (based on the skew) if the queries can only
> be limited on table A.description?
>
> Thanks.
>

You could try faking this information into place using the dbms_stats package, specifically the set_column_stats / get_column_stats packages.

Create a table with a column that matches the description column. Insert the descriptions for each column once, then insert the critical description 190 more times.

Create a histogram on the column - this will generate column stats that say:

    'Description X covers 95% of the data     and there is one row for each of the     other values"

Write a little pl/sql that does:

    dbms_stats.get_column_stats() on this column, then

    dbms_stats.set_column_stats() on the real column.

It may not work because you will have overall stats that say:

    There are 10 rows in the table, but 200 rows     for one column; but it might work.

(You might be able to work around the 200/10 problem by creating a table with 200 rows, with the same low/high on the key column) and transfer the entire data set from the fake to the real.

If you try this, and it works, please let me know.

-- 
Regards

Jonathan Lewis

http://www.jlcomp.demon.co.uk/faq/ind_faq.html
The Co-operative Oracle Users' FAQ

http://www.jlcomp.demon.co.uk/seminar.html
Public Appearances - schedule updated April 5th 2005
Received on Sun Apr 17 2005 - 05:22:39 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US