Re: Navigation question

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Fri, 16 Feb 2007 18:01:52 GMT
Message-ID: <kkmBh.7306$R71.109354_at_ursa-nb00s0.nbnet.nb.ca>


Marshall wrote:

> On Feb 16, 8:34 am, "dawn" <dawnwolth..._at_gmail.com> wrote:
>

>>On Feb 16, 9:48 am, "Marshall" <marshall.spi..._at_gmail.com> wrote:
>>
>>They
>>have the same number of queries.  How many times do I have to say that
>>I am not, not, not, absolutely not suggesting that we do more round
>>trips or split queries.

>
>
> You split a query. You had tin and you needed contacts, and you
> did it with two queries. I did it with one. Navigation splits queries.
> The performance of navigational approaches relative to declarative
> approaches ranges from as-good-as to unboundedly worse. Yes,
> you kept the total query count the same. However you introduced
> an explicit ordering between the two queries that wasn't there
> before, and that isn't free.
>
> Here's your approach:
>
> select name, companyid from companies where tin = XXX;
> mycompany = companyid;
> select emailaddress from contacts where companyid = mycompany;
>
> vs.
>
> select name from companies where tin = XXX;
> select emailaddress from contacts natural join companies where tin =
> XXX;
>
> Here's a possible execution scenario:
>
> no index on tin
> no index on contacts.companyid
> system has contacts fully loaded in cache
>
> your approach:
>
> select name, companyid from companies where tin = XXX;
>
> 1) requires a full scan of companies because no index on tin
> 2) loads companies from disk, pushing contacts out of cache
> 3) full scan of companies
>
> select emailaddress from contacts where companyid = mycompany;
>
> 3) loads contacts from disk
> 3) full scan of contacts
>
> my approach:
>
> I send these two queries to the system:
>
> q1: select name from companies where tin = XXX;
> q2: select emailaddress from contacts natural join companies where tin
> = XXX;
>
> 1) The system observes that they are both selects so it can order them
> at its pleasure. So it picks q2 first because it has contacts loaded
> in cache.
>
> 2) scans contacts from cache
> 3) requires a full scan of companies because no index on tin
> 4) loads companies from disk, pushing contacts out of cache
> 5) full scan of companies
>
> Your approach needlessly did twice as much disk IO, because
> it hardcoded a navigational approach into the queries.
>
> Your methodology of comparing the best case scenario
> of the two approaches is invalid. The important things to be
> compared are typical case and worst case. What you're
> advocating has unboundedly worse worst-case performance.
>
> I'm still hoping for an estimate of the number of queries and
> number of bytes in my previous example, and an admission
> that in that scenario the one-query approach is the clear
> winner.
>
>
> Marshall

Marshall, I direct your attention to Date's _Principle of Incoherence_.

You are replying to incoherence without putting in sufficient effort to do so coherently. That only plays into the hands of the self-aggrandizing ignorants. Received on Fri Feb 16 2007 - 19:01:52 CET

Original text of this message