Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: dealing with skewed data

Re: dealing with skewed data

From: Mark D Powell <Mark.Powell_at_eds.com>
Date: 16 Jan 2004 05:45:34 -0800
Message-ID: <2687bb95.0401160545.acd0aad@posting.google.com>


Richard Kuhler <noone_at_nowhere.com> wrote in message news:<MyFNb.82210$Vs3.69341_at_twister.socal.rr.com>...
> I'm looking for suggestions and experience on dealing with skewed data
> in a datawarehouse. Unfortunately, I don't see how histograms can help
> with what I believe is a fairly standard star schema. Imagine a
> dimension table D that has two rows with natural keys 'D1' and 'D2'.
> Imagine a fact table F that has 10 million rows. The typical query
> looks something like this:
>
> select f.*
> from d, f
> where d.key = 'D1'
> and f.d_id = d.d_id
>
> Now let's say 99% of those rows belong to dimension 'D1' and 1% to 'D2'.
> When querying with 'D1' we want to do a full table scan, with 'D2' an
> index scan is faster.
>
> What can be done to get the optimizer on track? Obviously, we could
> eliminate the surrogate keys and then build a histogram on the fact
> table. Is that how this is typically handled? We're also considering
> bitmap join indexes but in reality these queries typically select many
> columns from the dimension tables as well.
>
> Ideas?
>
>
> Oracle9i Enterprise Edition Release 9.2.0.2.0 - Production
> With the Partitioning, OLAP and Oracle Data Mining options
> JServer Release 9.2.0.2.0 - Production

Richard, we had a situation similar to your in our OLTP. The CBO said full table scan. The developer wanted the query to use an index. The index only had six distinct values of which 85% of the data had one value.

I looked at it and demonstrated that the CBO was correct for 85% of the data in our case so the developer coded a test on the bind variable. If it was the 85% value we let the optimizer run the statement the way it wants. For any other value we execute a hinted version of the statement that uses the index.

HTH -- Mark D Powell -- Received on Fri Jan 16 2004 - 07:45:34 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US