RE: Optimizer Stats collection for Datawarehouses

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Thu, 19 Nov 2009 17:15:53 -0500
Message-ID: <F4CACDE1814D49B6B3E33CF09DC1AE5E_at_rsiz.com>



one last thing, really. I thought that was a JL article Jaromir was referencing, just because it was from the JL maintained site. Now that I've read the paper, I see that it was actually written by Jaromir as well. Nice read, by the way.

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Mark W. Farnham
Sent: Thursday, November 19, 2009 9:30 AM To: jaromir_at_db-nemec.com; smishra_97_at_yahoo.com Cc: oracle-l_at_freelists.org
Subject: RE: Optimizer Stats collection for Datawarehouses

all good advice. and it is especially important in some cases to update the low/high values for some columns to what they might be at the end of the day, week, or month rather than what values of the empty new partition are. Jonathan might have mentioned that, but I didn't re-read the link (possibly I've never read it, but I'm sure I've heard Jonathan's reasoning on the matter.) As per usual you can probably trust JL, but he makes it easy to confirm his results for your case by experiment as well. Likewise you can test your mileage on plans generated with low/high statistics set for the set of columns where you know the slightly future answer. That is the sweet spot where human knowledge of what is likely in the future for a specific application beats the default auto stuff. While it is technically feasible to create a few models of synthesizing new partition statistics from old partitions, I'm not aware of it being there or even anyone thinking about it seriously yet. Even then someone with knowledge of the appropriate abstraction from existing partitions to new would have to suggest which model to use. But you probably know reasonable low/high values for the key predicate columns of the new partition when you make it.

mwf

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of jaromir nemec
Sent: Thursday, November 19, 2009 6:10 AM To: smishra_97_at_yahoo.com
Cc: oracle-l_at_freelists.org
Subject: Re: Optimizer Stats collection for Datawarehouses

Hi Sanjay,

In case the partition and loading schema are equal the statistics should be calculated by the application (as Nuno pointed) - i.e. create partition, load and analyze in one step. If the partitioning and loading schema differ highly (e.g. monthly partition populated in real time) the same is valid as for OLTP objects (monitoring or periodic gathering).

> when new partition created and utiliZed can experience some performance
> slowness on the first day of the month

One possible solution is to set some precalculated statistics after the creation of the empty partition. The idea behind is: the CBO is probably less confused if it things that all partitions are full filled. One catastrophic scenario of projection of the statistics from empty partition to filled partition is documented under
http://www.jlcomp.demon.co.uk/faq/bind_peek.html

Additionally:
Staging objects (used few times only) are not analyzed and processed with dynamic sampling

Regards,

Jaromir

--
http://www.freelists.org/webpage/oracle-l




--
http://www.freelists.org/webpage/oracle-l




--
http://www.freelists.org/webpage/oracle-l
Received on Thu Nov 19 2009 - 16:15:53 CST

Original text of this message