Re: 11G Statistics Collection

From: Robert Klemme <shortcutter_at_googlemail.com>
Date: Tue, 17 Apr 2012 12:35:57 +0200
Message-ID: <9v4vciFgamU1_at_mid.individual.net>



On 17.04.2012 00:47, Mladen Gogala wrote:
> On Mon, 16 Apr 2012 22:15:49 +0200, Robert Klemme wrote:
>
>> I don't understand: if data distribution at time of "nailing" favors one
>> index while at other times access via this index is slower than FTS how
>> then would you call that a "working approach" - other than the query
>> will eventually return (modulo snapshot too old or other issues)?
>
> Robert, the greatest problem in the system room is called "instability".
> If one nice morning end users discover that what used to perform well the
> last night is now horribly slow, the DBA is in trouble and his credibility
> is undermined.

OK, got your point. Thank you for taking the time to elaborate! I think we are actually doing both communities (DBA and development) a favor with this discussion because it helps improve understanding of the other group.

> On the other hand, the change because of the volume is never drastic, the
> change because of data volume is always gradual: there are few blocks
> more, the query will be a few milliseconds slower. That I can live with.
> When things slow down over night because of the new statistics, the change
> is very sudden, as in "sudden death". That I cannot live with.

I agree, this is probably the majority of cases. Maybe I am worrying about something that doesn't happen in practice. Could well be. Still it would be interesting to see how to deal with data which changes so dramatically that a fixed plan will break an application.

> When the performance begins to cause the concern, I will see what I can
> do, first on the development system, then QA system, then UAT system,
> using Swingbench, and eventually, the change will get into production.
> Controlling the change to the production database is one of the two major
> duties of any DBA. The second is to ensure data availability and never to
> lose even one bit of data. The change is not good, as PHB would like you
> believe. The change is only good if its result is predictable. That is not
> the case with gathering the database statistics.
> I've seen many situations where insufficiently thought of statistics
> collection has caused performance deterioration.
> Collecting statistics is a big change that has potentially an enormous
> impact on performance. I don't want to collect statistics automatically
> any more that I want to give my development department access to
> production.

But isn't that what plan stability is for in 11g? As far as I have understood the feature a plan stays fixed until the DB can "prove" a significant improvement. Whether it always works as advertised is a different question though. :-)

> I don't understand your reasoning. Would you allow a duhveloper with a
> cunning plan of the Baldrick type to implement it in production without
> the proper QA? If the answer to that question is a resounding "no", why
> would you allow a potentially disastrous thing like stats collection to
> happen without QA, automatically?

Well, you could QA automatically collected statistics, too. It's likely significant more effort though. And if one does not expect a dramatic improvement from this (because e.g. IO is limited by some physical properties of the hardware) it's probably not worthwhile to invest the effort to QA that feature.

> This is something that every DBA should have in his cubicle, to resist an
> impulse to change things with the weather or new technology:

On the one hand I fully understand your point and support it. On the other hand it's a fact that engineers try to make systems smarter to get better results - databases are no exception. Change is not generally good or bad - it just needs to be carefully evaluated. I think we agree on that. Maybe for databases the expected improvement needs to be dramatic to justify the risk.

We had a similar situation recently: with the advent of Java G1 garbage collector in Java 7 we were hoping our application's GC issues are finally resolved. Turns out, the old parallel collectors were better in our case - less CPU overhead, less excessive pauses. Only testing could reveal that. So we stuck with the old one.

> search.dilbert.com/comic/You%20Must%20Learn%20That%20Change%20Is%20Good

Nice. :-)

Cheers

        robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
Received on Tue Apr 17 2012 - 05:35:57 CDT

Original text of this message