Query optimization in DB2 and Rdb/VMS

From: Laconic2 <laconic2_at_comcast.net>
Date: Fri, 12 Nov 2004 09:50:32 -0500
Message-ID: <ptadnfctE6EjUAncRVn-2A_at_comcast.com>



Thanks to Jan Hidders and Alfredo Novoa for making the Graefe(1993) paper locatable by members of this NG.

I asked Jan to tell me whether the paper mentioned Rdb, because I wanted to discuss this with Marshall Spight. However, I keep referencing Rdb/VMS over and over, and I'm concerned that I might be beating a dead horse. It's not that I think Rdb/VMS is more important than the others. It's just that I know it better.

And I'll conjecture that Rdb/VMS had more sophistication in its query handling way back in 1993, than MySQL has today. I'll bet the same goes for DB2.

Anyway, Marshall, here's a quote from the paper:

<<
Obviously, several of these techniques
can be combined. In addition, some
systems such as Rdb/VMS and DB2 use
very sophisticated implementations of
multiindex scans that decide dynamically, i.e., during run-time, which indices
to scan, whether scanning a particular
index reduces the resulting RID list sufficiently to offset the cost of the index
scan, and whether to use bit vector filtering for the RID list intersection (see a
later section on bit vector filtering)
[Antoshenkov 1993; Mohan et al. 1990].
>>

Now I'm not one of the engineers who built Rdb. In fact I only took the Internals course for Rdb once.

But I both taught and practiced optimization of Rdb/VMS databases. In practice the best result I ever achieved was reducing the elapsed time from the user's point of view for a given transaction (a fairly common one) from 15 minutes to 2 seconds. Part of that improvement was the creation of an index. I didn't have the liberty to test how much improvement the index, without the other fixes, would have produced. Anyway, you can drive the optimizer without knowing how to build one, just as you can drive a car without knowing how to build one.

The multiple index scan for Rdb/VMS was what the engineers called a "zig-zag join". To my untutored brain, the algorithm for the zig-zag join looked like a minor variation on the merge join. What made it blindingly fast is that the real cost had been paid at index update time. And the optimizer didn't always pick this technique. It depended on the projected cost, and on the projected cost of the alternatives.

I'd be interested to know what DB2 experts have to say on this. I've taught Rdb/VMS just once to a DB2 oriented shop. That's not enough background for me to say anything at all about DB2. Received on Fri Nov 12 2004 - 15:50:32 CET

Original text of this message