Re: Optimizer issue - cost of full table scans

From: Niall Litchfield <niall.litchfield_at_gmail.com>
Date: Tue, 14 Sep 2010 10:05:52 +0100
Message-ID: <AANLkTikc0f3t_CnJoWjhgC0MQiEhPJzdfKyCDCqWZoTV_at_mail.gmail.com>



Well, system stats generally boil down to what happens to sreadtim and mreadtim and the ratio between them. I'm not clear what would happen on a Database Machine, but I can imagine it *might* break the assumptions of the system-stats costing model, not least the assumption that multi-block reads are slower than single-block reads.
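
For reference - and this is the commonly cited approximation of the workload-stats cost model, not an official formula - the I/O cost of a full scan is roughly scaled by the mreadtim/sreadtim ratio, and the captured values are easy to eyeball:

    -- Rough sketch of the commonly cited workload-stats approximation:
    --   full scan I/O cost ~= ceil(blocks / MBRC) * (MREADTIM / SREADTIM)
    -- so if MREADTIM falls towards (or below) SREADTIM, full scans start to
    -- look much cheaper relative to indexed access.

    -- The values currently in use (DBA access assumed):
    SELECT pname, pval1
    FROM   sys.aux_stats$
    WHERE  sname = 'SYSSTATS_MAIN'
    AND    pname IN ('SREADTIM', 'MREADTIM', 'MBRC', 'CPUSPEED');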

On a non-exadata machine I would personally deal with the stability issue that Greg mentions by grabbing system stats repeatedly over a period (a week or a month, most likely) into a user-defined stattab, and then sticking them in a spreadsheet to determine what values are representative of the system. You can then just set the stats once and for all. After all, I/O and CPU capability *rarely* changes in a normal server (VMs of course blow this assumption out of the water). I may be overly optimistic, but I personally believe (it is mostly a matter of logic/faith; sadly I don't have enough actual data to say one way or the other) that representative system statistics are likely to be a sensible choice for *most* systems - certainly ones where the OICA and OIC (optimizer_index_cost_adj / optimizer_index_caching) tricks are being used.
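
A minimal sketch of that approach (the stat table name, statid and the final numbers here are just for illustration, and DBMS_STATS privileges are assumed):

    -- Gather workload stats into a user-defined stattab, one interval per run:
    BEGIN
      DBMS_STATS.CREATE_STAT_TABLE(ownname => USER, stattab => 'SYSSTAT_HISTORY');
    END;
    /
    BEGIN
      DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'INTERVAL',
                                     interval       => 60,   -- minutes
                                     stattab        => 'SYSSTAT_HISTORY',
                                     statid         => 'RUN_20100914_AM');
    END;
    /
    -- Once you have decided what "representative" means (the spreadsheet step),
    -- set those values once and for all in the dictionary:
    BEGIN
      DBMS_STATS.SET_SYSTEM_STATS(pname => 'SREADTIM', pvalue => 5);
      DBMS_STATS.SET_SYSTEM_STATS(pname => 'MREADTIM', pvalue => 10);
      DBMS_STATS.SET_SYSTEM_STATS(pname => 'MBRC',     pvalue => 16);
    END;
    /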

On 13 Sep 2010 21:43, "Job Miller" <jobmiller_at_yahoo.com> wrote:

Niall,

I don't know how system stats are calculated, but if they are measured from the perspective of the database server, wouldn't they reflect the benefits of exadata storage?

The server knows it needs to do an FTS, the high water mark says that requires reading N blocks, exadata storage skips 50% of the blocks via the storage indexes, and the multi-block read times from the server's perspective look really fast because storage optimized away some of the I/O.

If that's true, the system stats would measure the typical system benefit from exadata storage experienced during the system stats workload capture.
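
A rough way to see how much I/O the cells optimized away, from the database server's side (statistic names assumed from an 11.2 Database Machine), would be something like:

    SELECT name, value
    FROM   v$sysstat
    WHERE  name IN ('cell physical IO bytes saved by storage index',
                    'cell physical IO interconnect bytes returned by smart scan');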

Does that make sense? What am I missing?

> How does one maintain how much of a table is on disk vs memory? A
> costing model that considers all those cases becomes increasingly more
> difficult to program and manage.

The costing model doesn't really attempt to consider all that though, does it? I just thought that it measures the average system-wide performance based on your workload during the capture. Hopefully what happens during that workload represents your data caching ratios, your data placement, and your true typical average multi-block read rates.

That average system-wide multi-block read rate should lead to pretty good plans for everything, instead of plans that are good only for a few special cases.

On Mon, 9/13/10, Greg Rahn <greg_at_structureddata.org> wrote:

> From: Greg Rahn <greg_at_structureddata.org>
> Subject: Re: Optimizer issue - cost of full table scans
> To: "Niall Litchfield" <niall.litchfield_at_gmail.com>
> Cc: "ORACLE-L" <oracle-l_at_freelists.org>
> Date: Monday, September 13, 2010, 2:16 PM

> There are no special changes as of
> today. When I made the reference
> to representative stats, ...

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Sep 14 2010 - 04:05:52 CDT
