Re: Navigation question

From: Marshall <marshall.spight_at_gmail.com>
Date: 15 Feb 2007 17:04:29 -0800
Message-ID: <1171587869.536207.101540_at_h3g2000cwc.googlegroups.com>

On Feb 15, 4:11 pm, "dawn" <dawnwolth..._at_gmail.com> wrote:
> On Feb 15, 4:51 pm, "Marshall" <marshall.spi..._at_gmail.com> wrote:
>
> > Fred, Joe, Engelbart, Desmond, and you are all looking
> > at the SQL and thinking about something else.
>
> Actually, I'm thinking conceptually (navigation), then modeling
> logically (in this case with SQL).

You might think you're thinking at the conceptual level, but you're not. You're thinking at the implementation level, aka physical. Neither the conceptual level nor the logical level has any concept of a current location, for example.

> > It is worth mentioning that 1) programmers are
> > almost always of the belief that they understand
> > the performance characteristics of their code
> > even without measuring it, and that 2) they are
> > almost always wrong.
>
> That is why mgmt or old ladies might say something like "build
> security in, be smart about performance, but let's address what we
> really find as the performance problems in future iterations, before
> we go live, of course, rather than doing any handstands based on
> speculation." We should not be careless, however, as there are still
> many projects that fail for reasons of performance.

It is standard practice where I work to test for scalability and performance prior to the launch of a product. One doesn't know anything until one measures.

> > The compiler/VM/query optimizer
> > can do better than a human can, almost always, and
> > it never gets distracted by thinking about navigation
> > or Scarlett Johansson.
>
> If it is as bad to pop up to a conceptual level when coding as to
> think about Scarlett Johansson, then I'll try to think about her
> instead too. I don't quite catch the fun of doing so however, so clue
> me in ;-)

For futher investigation I suggest Google Image Search. However there may be a hardware issue in your case.

> > So, you could make an argument about there being a difference; in
> > practice I would be astonished if there was one.
>
> Hmm. Ok. But is there something wrong with doing the "navigation"
> that I did with my statements?

In this particular simplified case it doesn't matter one way or the other, from a performance standpoint. In a real-world scenario, it may matter a lot.

> > Index scans are ridiculously fast. Any difference is speed has
> > been less than the difference from one query run to the next;
> > it's in the noise. And the more complicated the query is, the
> > harder your technique will be and the proprtionally smaller any
> > already small effect the removal of the index scan will be.
>
> But my technique was not difficult at all and I wasn't in a big,
> conscious way attempting to save on performance, it was a by-product
> of the approach instead.

There was no by-product; you didn't save on performance. Furthermore as Tony mentioned you have now stamped your query plan into concrete, and made it impossible for the system to adapt its execution to changing circumstances.

> At each "node" I use set-based logic to get
> the desired result sets. If I have information to pass to an SQL
> query from a prior query, I use it to "navigate" to that next node or
> set of nodes.

Yes you did. You may hereby consider yourself as having been smacked by a salmon.

> Whether working with VSAM files, Pick, or SQL-DBMS's,
> this approach has worked and has never been contested by anyone as
> being either difficult to maintain or to perform poorly.

If your tables are a hundred rows it makes no noticable perfomance differences if you fetch them one at a time or all at once. There's no perceptible difference between an O(n log n), O(n^2), or O(n^3) algorithm
if n is small enough. As n grows, so does the difference.

If you have a three level deep static hierarchy, with 10^4 rows in the outer table and 10^4 fanout at each level, and you want to aggregate across the lowest level grouped by the highest level, the difference between doing a sum in a single query and a nested lookup execution is a factor of 10^8 network bandwidth and 10^12 packet count. It can mean the differerence between getting an answer in 5 seconds vs. days.

I have encountered much larger problems in the field; this is not even extreme.

Marshall Received on Fri Feb 16 2007 - 02:04:29 CET

This message: [ Message body ]
Next message: paul c: "Re: OT: Where? and What?"
Previous message: David BL: "Re: OT: Where? and What?"
In reply to: dawn: "Re: Navigation question"
In reply to Marshall: "Re: Navigation question"
Next in thread: dawn: "Re: Navigation question"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message