Re: Slow performance on count(*)?

From: joel garry <>
Date: Tue, 15 Jul 2008 10:31:22 -0700 (PDT)
Message-ID: <>

On Jul 15, 6:53 am, wrote:
> On Jul 14, 8:53 pm, Dion Cho <> wrote:
> > It's interesting that "index full scan" is being used here, but it's
> > quite possible.
> > This is not a degenerate case, actually very common and natural case.
> > The simple fact is that delete never deallocates the free space.
> > Only truncate or rebuild does it.
> > Dion Cho
> Well, I call it a degenerate case as it would be something that would
> only occur in:
> 1. An unconstrained query (or one that didn't constrain against
> indexed values)
> 2. against a table with a large number number of rows
> 3.  that had had a large percentage of those rows deleted
> From what has been said so far, all three of these conditions need to
> be true in order for you to see poor performance like this.  It is is
> unusual for any app we write to do unconstrained queries against very
> large tables; it is also unusual for us to dramatically reduce the
> size of a particular table once it is in production via a massive
> delete.  The only common real-world case where I can see something
> like this happening is in batch-processing situation where
> intermediate staging tables are loaded up with production data and
> subsequently queried, then cleaned out; however, in this case, I would
> expect that you would use global temporary tables for the
> intermediaries.

1 not true. There simply has to be a lot of empty space below the high watermark.

> Can you think of other examples where this type of problem could
> arise?  Our application (which has been running 24/7 since 1998 and
> has been upgraded many times; we've always handled our data reaping by
> running nightly jobs to remove the oldest day's worth of data, and we
> haven't run into issues with this strategy.  We'd like to eventually
> move to using partitioned tables to make the reaping a little more
> elegant, but the nature of the data is such it is rather tricky to do
> so without a significant amount of work & migration (as new data has
> references to older data, so it's not a simple matter of partitioning
> by create date).  We've had no real issues, presumably, because (1)
> our queries all go against indexes and (2) the daily volume on the
> system grows at a very slow rate (maybe 100% per year), so there isn't
> really an issue with "shrinkage", even though the tables probably have
> a fair amount of wasted space (due to holes not filling up until pages
> completely free up).

Yes, see comment 14, that is why partitioning is good for these types of things, .

If you have performance problems, consider evaluating whether sparse data is contributing to it. (This is the kind of thing where I disagree with methodologies that put too much emphasis on the business deciding which performance problems are most important. The unstated assumption is that there aren't stupid things that are slowing everything down without being the primary cause of bottlenecks. I think that may be an overeaction to compulsive tuning syndrome, where every little thing needs to be looked at whether or not it is important, as well as myths like you need to rebuild indices often. Or maybe it's just marketing.)


-- is bogus.
That darn craigslist.
Received on Tue Jul 15 2008 - 12:31:22 CDT

Original text of this message