Path: news.f.de.plusline.net!news-fra1.dfn.de!newsfeed.hanau.net!news.tiscali.de!newsfeed.freenet.de!newspeer1.nwr.nac.net!border2.nntp.dca.giganews.com!border1.nntp.dca.giganews.com!nntp.giganews.com!nx01.iad01.newshosting.com!newshosting.com!post01.iad01!news.aliant.net!not-for-mail
Date: Sat, 05 May 2007 17:14:52 -0300
From: Bob Badour <bbadour@pei.sympatico.ca>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.2) Gecko/20040804 Netscape/7.2 (ax)
X-Accept-Language: en-us, en
MIME-Version: 1.0
Newsgroups: comp.databases.theory
Subject: Re: Self Joins and optimization
References: <ZE%_h.163$D52.68@trndny04> <1178386212.605808.145550@n76g2000hsh.googlegroups.com>
In-Reply-To: <1178386212.605808.145550@n76g2000hsh.googlegroups.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 60
Message-ID: <463ce58d$0$4041$9a566e8b@news.aliant.net>
X-Complaints-To: abuse@aliant.net
Xref: news.f.de.plusline.net comp.databases.theory:43809

Kevin Kirkpatrick wrote:

> David Cressey wrote:
> 
>>In the "interpolation" thread,  Brian has been expounding on the idea that
>>he can discover algorithms that he knows to be correct,  and that outperform
>>anything an optimizer can generate. He's mentioned "self joins"  and the
>>idea that he can get a job doen with only one read,  where the automatic
>>solution generated in response to declarative code generates multiple passes
>>through the data.
>>
>>My experience leads me to the opposite conclusion from that.
>>
>>Here's the questions I'd like the newsgroup to consider:
>>
>>What, if any, algorithms that are demonstrably correct are going to be
>>overlooked by the optimizer,  although discoverable by a human coder?  Why
>>would an optimizer fail to discover the possibility of eliminating
>>self-joins?  Are there any undecidability issues here?
>>
>>Aside:
>>
>>I know a little about the DEC Rdb/VMS optimizer.  Less than that about the
>>Oracle CBO,  although the Oracle CBO appears to have been conceptually
>>derived from the Rdb/VMS optimizer.  Much less than that about the Oracle
>>RBO as of 1994.  The RBO appears to have been an optimizer in name only.
>>
>>While I've got a lot of positive opinions about various features of Oracle,
>>the RBO was not one of those features.  Shortly after my cutover to Oracle,
>>one of the tech types told me:  "All the cool people use hints."
>>
>>After looking at it in as much depth as I cared to,  my conclusion (which I
>>kept to myself) was:  "No.  All the really cool people find a DBMS that
>>doesn't require hints to do a good job."
>>
>>End of aside.
> 
> A simple case study: extend a relation by adding an attribute that
> gives the rank of each tuple based on the ordering of a subset of
> existing attributes of the relation.  With primitive relational
> operators, the only way I'm aware of to express this is:
> 
> SELECT E1.NAME, E1.SAL, COUNT(*) RNK
> FROM EMP E1 EMP E2
> WHERE E2.SAL <= E1.SAL
> GROUP BY E1.NAME, E1.SAL;

The above has a serious bug in it. What happens when 10 people all have 
the same salary?

s/b WHERE E2.SAL < E1.SAL OR E1.somekey = E2.somekey


> I'm not aware of any rationale that would allow an optimizer to
> compute the above with anything faster than a O(n^2) algorithm.

The same rationale that allows any other method to perform a rank query 
in a timely manner. It is simply a matter of recognizing the rank query.

[snip]