Re: Navigation question

From: Marshall <marshall.spight_at_gmail.com>
Date: 15 Feb 2007 14:51:45 -0800
Message-ID: <1171579905.536595.137690_at_s48g2000cws.googlegroups.com>


On Feb 15, 12:15 pm, "dawn" <dawnwolth..._at_gmail.com> wrote:
> On Feb 15, 10:58 am, "Marshall" <marshall.spi..._at_gmail.com> wrote:
>
> > > > select name from companies where tin = XXX;
>
> > > OK, I'll buy that you would not use the term "navigation" to find this
> > > node. I still think of this select statement as "going to" the
> > > company where tin=XXX and extracting the name, where I suspect you do
> > > not. I do not see it as navigating within the database, however, so
> > > we are good so far.
>
> > A guy Fred at work thinks of this statement as x86 machine code,
> > and Joe thinks of it all as electrons whirling around silicon
> > pathways. Engelbart says they are both wrong and that it's
> > all subatomic particles, while Desmond claims they are all
> > thinking at the wrong level; the truth we can
> > perceive is just shadows on Plato's cave.
>
> Yup, so conceptually, I'm navigating and there is no problem with
> that, right?

That's the opposite of what I'm saying.

Fred, Joe, Engelbart, Desmond, and you are all looking at the SQL and thinking about something else. That *is* a problem. The *query* contains no navigation. There might be a model of the query in your head that includes navigation, yes. But that's a statement about your head, not a statement about the query.

Sometimes when I'm looking at Java code I'm actually thinking about Scarlett Johansson. There is however no Scarlett Johansson in the code. And the fact that I'm thinking about Scarlett Johansson actually interferes with my ability to focus on and correctly perceive the reality of Java code. Thinking at the appropriate level of abstraction works best.

The reason garbage collection is a win is because it allows the programmer not to even have to think about allocation and deallocation. Now if you're feeling perverse, or if some very narrow circumstances warrant it, then maybe you want to be thinking about allocation and deallocation in your Java code. But it's best if you can avoid it.

It is worth mentioning that 1) programmers are almost always of the belief that they understand the performance characteristics of their code even without measuring it, and that 2) they are almost always wrong. The compiler/VM/query optimizer can do better than a human can, almost always, and it never gets distracted by thinking about navigation or Scarlett Johansson.

> > > So instead of retrieving the companyid in the first statement and then
> > > narrowing this query down to that precise company, we again "find" the
> > > right company by using the same query as before. We include two
> > > tables in our query although if we had retrieved the companyid in the
> > > previous query we could have used just one. I do not understand how
> > > that could yield the same or better performance than
>
> > > select name, companyid from companies where tin = XXX;
> > > mycompany = companyid;
> > > select emailaddress from contacts where companyid = mycompany;
>
> > The canonical case to consider is client-server.
>
> > Database performance is not my strong area, however it's easy
> > enough to see that in my case, there will be an indexed lookup
> > on tin that gets companyid, and then whatever mechanism
> > (table scan?) will select all the email contacts based on that
> > company id, in a single network round trip. Whereas your
> > approach will do the indexed lookup in one network packet,
> > then the client will issue a second request that is the same
> > as the second part of my query.
>
> No, no -- I was replacing your two SQL statements with my two SQL
> statements. There are the same number of round trips. Please review
> this and reply, if you would. Thanks. --dawn

I see.

So, you could make an argument about there being a difference; in practice I would be astonished if there was one. I have on a number of occasions gone to some lengths to refactor things to eliminate one or several indexed lookups from the query plan of a query in the belief that it would help performance; it never has so far. Index scans are ridiculously fast. Any difference is speed has been less than the difference from one query run to the next; it's in the noise. And the more complicated the query is, the harder your technique will be and the proprtionally smaller any already small effect the removal of the index scan will be.

Marshall Received on Thu Feb 15 2007 - 23:51:45 CET

Original text of this message