Re: Navigation question

From: dawn <dawnwolthuis_at_gmail.com>
Date: 16 Feb 2007 05:53:05 -0800
Message-ID: <1171633985.904104.318930_at_p10g2000cwp.googlegroups.com>


On Feb 15, 7:04 pm, "Marshall" <marshall.spi..._at_gmail.com> wrote:
> On Feb 15, 4:11 pm, "dawn" <dawnwolth..._at_gmail.com> wrote:
>
> > On Feb 15, 4:51 pm, "Marshall" <marshall.spi..._at_gmail.com> wrote:
>
> > > Fred, Joe, Engelbart, Desmond, and you are all looking
> > > at the SQL and thinking about something else.
>
> > Actually, I'm thinking conceptually (navigation), then modeling
> > logically (in this case with SQL).
>
> You might think you're thinking at the conceptual level, but
> you're not. You're thinking at the implementation level,
> aka physical. Neither the conceptual level nor
> the logical level has any concept of a current
> location, for example.

I figured it out -- I'm designing for the USER and when designing at the logical level, I'm also designing workflow. After pulling up information about a company, for example, the user goes to one of the orders listed. From there, they might go to a product that is part of a line item on the order. This is similar to users who go to a web site and from there navigate to other web sites. Users navigate. Their workflow includes a navigation path. So, at the conceptual level we can navigate, users navigate, and at the physical level we navigate. But there is one thread of theory within software development that keeps pushing against navigation, as if it were a bad thing.

> > > It is worth mentioning that 1) programmers are
> > > almost always of the belief that they understand
> > > the performance characteristics of their code
> > > even without measuring it, and that 2) they are
> > > almost always wrong.
>
> > That is why mgmt or old ladies might say something like "build
> > security in, be smart about performance, but let's address what we
> > really find as the performance problems in future iterations, before
> > we go live, of course, rather than doing any handstands based on
> > speculation." We should not be careless, however, as there are still
> > many projects that fail for reasons of performance.
>
> It is standard practice where I work to test for scalability and
> performance prior to the launch of a product. One doesn't know
> anything until one measures.

Yup.

> > > The compiler/VM/query optimizer
> > > can do better than a human can, almost always, and
> > > it never gets distracted by thinking about navigation
> > > or Scarlett Johansson.
>
> > If it is as bad to pop up to a conceptual level when coding as to
> > think about Scarlett Johansson, then I'll try to think about her
> > instead too. I don't quite catch the fun of doing so however, so clue
> > me in ;-)
>
> For futher investigation I suggest Google Image Search.
> However there may be a hardware issue in your case.

;-)

> > > So, you could make an argument about there being a difference; in
> > > practice I would be astonished if there was one.
>
> > Hmm. Ok. But is there something wrong with doing the "navigation"
> > that I did with my statements?
>
> In this particular simplified case it doesn't matter one way or
> the other, from a performance standpoint. In a real-world
> scenario, it may matter a lot.

Agreed. It might then be better one way or the other. I asked if there was something wrong with doing the "navigation" that I did and you said that in this case it didn't matter one way or the other, but then added "from a performance standpoint." So, does it matter one way or another from some other standpoint?

> > > Index scans are ridiculously fast. Any difference is speed has
> > > been less than the difference from one query run to the next;
> > > it's in the noise. And the more complicated the query is, the
> > > harder your technique will be and the proprtionally smaller any
> > > already small effect the removal of the index scan will be.
>
> > But my technique was not difficult at all and I wasn't in a big,
> > conscious way attempting to save on performance, it was a by-product
> > of the approach instead.
>
> There was no by-product; you didn't save on performance. Furthermore
> as Tony mentioned you have now stamped your query plan into
> concrete,

No more than either of your SQL statements did. I provided information I had to the SQL statements. The statement decided how to execute.

> and made it impossible for the system to adapt its
> execution to changing circumstances.

How did I do that in a way that you did not with your queries? I provided different criteria, but there is nothing worse about the criteria I provided, is there?

> > At each "node" I use set-based logic to get
> > the desired result sets. If I have information to pass to an SQL
> > query from a prior query, I use it to "navigate" to that next node or
> > set of nodes.
>
> Yes you did. You may hereby consider yourself as having
> been smacked by a salmon.

Because I did something that is wrong according to some religion? Or by what criteria? I'm still not hearing what, precisely, is amiss with navigating.

Additionally, it seems that it is OK to "navigate" as long as the data goes back to the user and gets new user input, so that if the user selects an order, one we put to the screen from a result set, then it is OK to navigate to that order (as long as we don't think of it that way as we wouldn't want to think like the user, perhaps?). So I would think that if some other event occured to provide such input, we could also navigate based on that input.

> > Whether working with VSAM files, Pick, or SQL-DBMS's,
> > this approach has worked and has never been contested by anyone as
> > being either difficult to maintain or to perform poorly.
>
> If your tables are a hundred rows it makes no noticable perfomance
> differences if you fetch them one at a time or all at once. There's
> no perceptible difference between an O(n log n), O(n^2), or O(n^3)
> algorithm
> if n is small enough. As n grows, so does the difference.

Please zero in on the specific topic of navigation. I do understand the set processing concerns and iteration. That is NOT the topic, nor does the topic need to navigate there.

> If you have a three level deep static hierarchy, with 10^4 rows in
> the outer table and 10^4 fanout at each level, and you want to
> aggregate across the lowest level grouped by the highest level,
> the difference between doing a sum in a single query and a nested
> lookup execution is a factor of 10^8 network bandwidth and
> 10^12 packet count. It can mean the differerence between
> getting an answer in 5 seconds vs. days.
>
> I have encountered much larger problems in the field; this
> is not even extreme.
>
> Marshall

This is not about network bandwidth. I've been doing client-server development for quite some time and am fully aware of round-trip costs. Please note that I did no more round trips with my example than you did with yours. I navigated, however, where one could suggest that you did not. --dawn Received on Fri Feb 16 2007 - 14:53:05 CET

Original text of this message