Kevin Kirkpatrick wrote:
> David Cressey wrote:
>
>>In the "interpolation" thread, Brian has been expounding on the idea that
>>he can discover algorithms that he knows to be correct, and that outperform
>>anything an optimizer can generate. He's mentioned "self joins" and the
>>idea that he can get a job doen with only one read, where the automatic
>>solution generated in response to declarative code generates multiple passes
>>through the data.
>>
>>My experience leads me to the opposite conclusion from that.
>>
>>Here's the questions I'd like the newsgroup to consider:
>>
>>What, if any, algorithms that are demonstrably correct are going to be
>>overlooked by the optimizer, although discoverable by a human coder? Why
>>would an optimizer fail to discover the possibility of eliminating
>>self-joins? Are there any undecidability issues here?
>>
>>Aside:
>>
>>I know a little about the DEC Rdb/VMS optimizer. Less than that about the
>>Oracle CBO, although the Oracle CBO appears to have been conceptually
>>derived from the Rdb/VMS optimizer. Much less than that about the Oracle
>>RBO as of 1994. The RBO appears to have been an optimizer in name only.
>>
>>While I've got a lot of positive opinions about various features of Oracle,
>>the RBO was not one of those features. Shortly after my cutover to Oracle,
>>one of the tech types told me: "All the cool people use hints."
>>
>>After looking at it in as much depth as I cared to, my conclusion (which I
>>kept to myself) was: "No. All the really cool people find a DBMS that
>>doesn't require hints to do a good job."
>>
>>End of aside.
>
> A simple case study: extend a relation by adding an attribute that
> gives the rank of each tuple based on the ordering of a subset of
> existing attributes of the relation. With primitive relational
> operators, the only way I'm aware of to express this is:
>
> SELECT E1.NAME, E1.SAL, COUNT(*) RNK
> FROM EMP E1 EMP E2
> WHERE E2.SAL <= E1.SAL
> GROUP BY E1.NAME, E1.SAL;
The above has a serious bug in it. What happens when 10 people all have
the same salary?
s/b WHERE E2.SAL < E1.SAL OR E1.somekey = E2.somekey
> I'm not aware of any rationale that would allow an optimizer to
> compute the above with anything faster than a O(n^2) algorithm.
The same rationale that allows any other method to perform a rank query
in a timely manner. It is simply a matter of recognizing the rank query.
[snip]
Received on Sat May 05 2007 - 15:14:52 CDT