Re: Navigation question

From: dawn <dawnwolthuis_at_gmail.com>
Date: 25 Feb 2007 19:45:49 -0800
Message-ID: <1172461549.849566.309620_at_j27g2000cwj.googlegroups.com>


On Feb 25, 4:58 pm, "dawn" <dawnwolth..._at_gmail.com> wrote:
> On Feb 23, 10:10 am, "Walt" <wami..._at_verizon.net> wrote:
>
> > "dawn" <dawnwolth..._at_gmail.com> wrote in message
<snip>
> I plan to take mAsterdam's advice and re-read all of the responses
> (attempting to ignore BB's, which unfortunately can still serve to
> drag me down personally, in spite of my attempts to push them aside).
> I will also go back to some of the quips (here) and earlier documents that
> prompted me to ask the question. I will see if I can better
> understand what aspects of the arguments have anything at all to do
> with my two database navigation examples (code navigating and DBMS
> navigating based on metadata specs and query declarations).
>
> My assessment to date is still that database navigation, the way I am
> discussing it, is not the evil it might have been purported to be.
> There is always the possibility that BB is right and I am still very
> confused on this point, as I certainly have been in the past. Thanks
> for your help! --dawn

OK, after a bit of reading, and at only 4 out of 8 so far on the Academy Awards (but tied with the hubby), I tracked down one of the writings that is, I suspect, held in high enough regard, but whose statement regarding navigation didn't exactly impress me. There must be better write-ups than this on why not to navigate, but this one was referred to from another one I re-read and is the better of the two.

From "Third-generation database system manifesto"
ISSN:0163-5808, p.37-38 SIGMOD RECORD Vol 19 No 3 Sept 1990

"The navigational point of view is well articulated in the Turing Award presentation by Charles Bachman. We feel that the subsequent 17 years of history has demonstrated that this kind of interface is undesirable and should not be used. Here we summarize only two of the more important problems with navigation. First, when the programmer navigates to desired data in this fashion, he is replacing the function of the query optimizer by hand-coded lower level calls. It has been clearly demonstrated by history that a well-written, well-tuned optimizer can almost always do better than a programmer can do by hand. Hence, the programmer will produce a program which has inferior performance. Moreover, the programmer must be considerably smarter to code against a more complex lower level interface.

However, the real killer concerns schema evolution. If the number of indexes changes or the data is reorganized to be differently clustered, there is no way for the navigation interface to automatically take advantage of such changes. Hence, if the physical access paths to data change, then a programmer must modify his program. On the other hand, a query optimizer simply produces a new plan which is optimized for the new environment. Moreover, if there is a change in the collections that are physically stored, then the support for views prevalent in second generation systems can be used to insulate the application from the change. To avoid these problems of schema evolution and required optimization of database access in each program, a user should specify the set of data elements in which he is interested as a query in a non-procedural language."

The authors start out with a statement that seems typical of this argument: that history has decided the point. Beyond this initial hand-waving are two arguments.

  1. Performance. My response is two-fold. a) Sometimes we want to choose the best performance, and sometimes that is not the ultimate requirement. Performance is but one of possibly many requirements when choosing a design pattern, albeit an important one. b) I have crafted queries against the very same type of information in multiple DBMSs using multiple data models, with a similar (but not identical) volume of data. In every case, when executing a real query on lesser hardware against the non-SQL DBMS, the performance was better than that of the analogous SQL query. These were real queries where real people wanted the results, however, and I do not doubt that there are times when a SQL query could "win" such a competition. It is a shame that industry performance metrics for DBMSs seem to be tied to the SQL language, and I know of no public information comparing real queries of the same data in multiple models. In the absence of statistically significant data, I'm inclined to accept my own experience on this one.

  2. Schema evolution. This is the one that really confuses me. Data can be split across multiple volumes, have indexes added or changed, etc., and nothing has to change in the applications that specify navigation. Why does this argument regarding navigation launch into a discussion of what seems to be physical navigation? Am I misunderstanding it?

I recognize there is the down-side that if you specify a navigation path (such as an earlier example where the query had "Order.orderid" to which the DBMS had specs on how to navigate) you are "hard-coding" similar information to that "hard-coded" when issuing an SQL query that includes a JOIN statement. On the plus side, when specifying such information to the DBMS and having the DBMS use it in a query, there is a single point where that "link" information needs to change if the database schema is modified, rather than having to change the JOIN statements in every application employing such.
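
To make the "single point" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the table names, the LINKS dictionary, and the resolve/select helpers are invented for illustration and do not come from any real DBMS or from the earlier example. It only shows the shape of the idea, in which the link is declared once and a query names a logical path.

```python
# Hypothetical sketch only: none of these names come from a real DBMS.

# Assumed sample data: two "files" keyed by their ids.
TABLES = {
    "Order": {
        1: {"orderid": 1, "custid": 10, "total": 250},
        2: {"orderid": 2, "custid": 11, "total": 75},
    },
    "Customer": {
        10: {"custid": 10, "name": "Acme"},
        11: {"custid": 11, "name": "Globex"},
    },
}

# The single point of change:
# (source table, link name) -> (foreign key field, target table).
LINKS = {
    ("Order", "customer"): ("custid", "Customer"),
}


def resolve(table, row, path):
    """Follow a dotted logical path such as 'customer.name' from one row."""
    parts = path.split(".")
    for step in parts[:-1]:
        fk_field, target = LINKS[(table, step)]  # look up the declared link
        row = TABLES[target][row[fk_field]]      # hop to the linked row
        table = target
    return row[parts[-1]]


def select(table, paths):
    """Return one tuple per row of `table`, one value per requested path."""
    return [
        tuple(resolve(table, row, p) for p in paths)
        for row in TABLES[table].values()
    ]


# The query states which data it wants, not how the link is implemented.
result = select("Order", ["orderid", "customer.name"])
print(result)
```

If the schema changed so that orders linked to customers through a different field, only the one entry in LINKS would need editing in this sketch; the select call would stay the same. How the toy resolve function physically fetches rows is a separate matter, which a real DBMS would be free to handle however it likes.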

If this is the best there is against what I have referred to as "logical navigation" then I would say the argument has no meat. Is there something else I should read with better arguments or are there holes in mine? Thanks. --dawn

Received on Mon Feb 26 2007 - 04:45:49 CET
