ACM SIGMOD-papers of interest #1

From: Henry Poras <Henry.Poras_at_ctp.com>
Date: Mon, 26 Nov 2001 14:48:05 -0800
Message-ID: <F001.003CD3A8.20011126143122@fatcity.com>

Well, I'm way behind on my reading so I am just finishing a quick (?) skim of the May proceedings of the 2001 ACM SIGMOD Conference on Management of Data. There is some interesting stuff in there so I thought I would try to review some of the high points I found. ACM SIGMOD is the Special Interest Group on Management of Data from the Association for Computing Machinery. You can join for just $20 and get totally swamped with CDs and publications.
(www.acm.org)

There are a few interesting articles. This is installment #1

"DBMSs On A Modern Processor: Where Does Time Go?"
(http://www-2.cs.cmu.edu/~natassa/papers/vldb99_paper.pdf)
Anastassia Ailamaki, David J. DeWitt, Mark D. Hill, David A. Wood University of Wisconsin - Madison

from Proceedings of the 25th VLDB Conference, 1999

This paper was actually referenced in a paper from ACM SIGMOD 2001
("Improving Index Performance through Prefetching")

The main point is that even as we concentrate on increasing performance by tuning IO, the hit from processor cache misses is becoming more important. The increasing gap between processor speed and DRAM/disk speeds is accentuating this problem. Most DBMSs focus on caching data in main memory
(buffer cache), but this ignores the caching of main memory in level 1 and
level 2 processor caches. Cache misses can account for 50% of execution time.

In this paper, the authors examine four commercial (unnamed) DBMSs running on a 6400 PII Intel Xeon/MT Workstation running Windows NT v4.0. The focus is on the memory interactions, so to reduce IO effects, a memory resident database is used (the buffer pool was large enough to hold the datasets for the queries). Almost half the execution time was spent on stalls. The breakdown is as follows:

*       90% of the stalls are from:
        -second-level cache data misses
        -first -level cache instruction misses
*       20% of the stalls are from "subtle implemention details (e.g. branch

misdirection)"

Analysis was done using simple queries (sequential range, index range, sequential join). The results were compared to TPC-D (and TPC-C?) benchmarks which yielded similar results. It thus appears reasonable to scale the conclusions of this simple methodology to more complex scenarios.

DETAILS
Query time = computation time + memory stalls + branch misdirection overhead + resource related stalls - overlap (some work can be done while waiting for a stall).

Computation time is usually less than 1/2 of the execution time. Since memory access times decrease more slowly than processor clock speeds, the computation time componant will continue to decrease. Most of the workload is also seen to be latency, not bandwidth bound (latency - how long it takes. bandwidth - how much you can do in a given time. If you are latency bound, adding more processors won't help as the information isn't getting there fast enough. [summarized from In Search of Clusters. thanks Ross])

Memory stall times vary more across different query types than across different DBMSs. Memory stall is the most significant one of the three major stall types. The bulk of the memory stall is from L1-information cache and L2-data cache. It is possible, however, that tuning for one or two of the stall types will just shift the bottleneck to the remaining stalls. Memory stalls are also dependent on increasing record size [locality of data].

Henry

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Henry Poras
  INET: Henry.Poras_at_ctp.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L


(or the name of mailing list you want to be removed from).  You may

also send the HELP command for other information (like subscribing).

Received on Mon Nov 26 2001 - 16:48:05 CST