Self Joins and optimization

From: David Cressey <cressey73_at_verizon.net>
Date: Sat, 05 May 2007 13:29:29 GMT
Message-ID: <ZE%_h.163$D52.68_at_trndny04>



In the "interpolation" thread, Brian has been expounding on the idea that he can discover algorithms that he knows to be correct, and that outperform anything an optimizer can generate. He's mentioned "self joins" and the idea that he can get a job doen with only one read, where the automatic solution generated in response to declarative code generates multiple passes through the data.

My experience leads me to the opposite conclusion from that.

Here's the questions I'd like the newsgroup to consider:

What, if any, algorithms that are demonstrably correct are going to be overlooked by the optimizer, although discoverable by a human coder? Why would an optimizer fail to discover the possibility of eliminating self-joins? Are there any undecidability issues here?

Aside:

I know a little about the DEC Rdb/VMS optimizer. Less than that about the Oracle CBO, although the Oracle CBO appears to have been conceptually derived from the Rdb/VMS optimizer. Much less than that about the Oracle RBO as of 1994. The RBO appears to have been an optimizer in name only.

While I've got a lot of positive opinions about various features of Oracle, the RBO was not one of those features. Shortly after my cutover to Oracle, one of the tech types told me: "All the cool people use hints."

After looking at it in as much depth as I cared to, my conclusion (which I kept to myself) was: "No. All the really cool people find a DBMS that doesn't require hints to do a good job."

End of aside. Received on Sat May 05 2007 - 15:29:29 CEST

Original text of this message