Re: A new proof of the superiority of set-oriented approaches: numerical/time-series linear interpolation

From: Cimode <cimode_at_hotmail.com>
Date: 30 Apr 2007 00:58:30 -0700
Message-ID: <1177919910.498110.52190_at_q75g2000hsh.googlegroups.com>


On Apr 30, 4:16 am, paul c <toledobythe..._at_oohay.ac> wrote:
> Bob Badour wrote:
> > David Cressey wrote:
>
> >> "Brian Selzer" <b..._at_selzer-software.com> wrote in message
> >>news:jy5Zh.4247$uJ6.3542_at_newssvr17.news.prodigy.net...
>
> >>> "Cimode" <cim..._at_hotmail.com> wrote in message
> >>>news:1177863186.258396.136950_at_y5g2000hsa.googlegroups.com...
>
> >>>> On 29 avr, 16:21, "Brian Selzer" <b..._at_selzer-software.com> wrote:
>
> >>>>> I would have done it differently. I would have used a cursor to
> >>>>> generate
>
> >>>> [Snipped]
> >>>> I respectfully think you miss the point of the entire thread.
> >>>> Thinking procedurally (use of cursors) is inherently inferior to
> >>>> set-oriented approaches, independently of any specific implementation
> >>>> and regardless of whether one defines performance as response time,
> >>>> the ability to run well in a concurrent-query environment, and so forth.
>
> >>> You're right. As a rule, set-based thinking is superior to procedural
> >>> thinking. But there are times when it becomes necessary to break the
> >>> rules.
>
> >>> It is critical to think procedurally in order to understand the
> >>> execution plans that an optimizer produces.
>
> >> This is correct, but it largely misses the point. One thinks about
> >> how the optimizer does its job in the (hopefully rare) cases where
> >> the optimizer's output is materially suboptimal.
>
> >> My own experience some 15 years ago with DEC Rdb is a case in point.
> >> The DEC optimizer was so good that, in most cases, it was simply not
> >> necessary to think about how it worked. That allowed the query
> >> designer (and for that matter the database designer) to focus on the
> >> other aspects of quality without sacrificing much in the way of speed.
>
> >>> Consider a merge join: both inputs are sorted, and then a single pass
> >>> is made through each result--similar to an old COBOL program that
> >>> updates a master file from a transaction file. If you don't
> >>> understand what each step in an execution plan does and how it works,
> >>> then how can you possibly optimize a query?
>
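
(A brief aside to make the merge-join point concrete: a minimal sketch
in SQL Server's dialect, with hypothetical table and column names that
are mine, not anyone's actual schema. When both inputs are indexed on
the join key, each side can be scanned once in sorted order, and the
OPTION (MERGE JOIN) hint makes the operator visible in the plan.

  SELECT m.account_id, m.balance, t.amount
  FROM   master_account AS m
  JOIN   account_transaction AS t
         ON t.account_id = m.account_id
  OPTION (MERGE JOIN);  -- ask the optimizer for the merge-join operator

Reading the resulting plan is the quickest way to see the
sort-then-single-pass behaviour Brian describes.)
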
> >>> When it becomes clear that an iterative approach could eliminate
> >>> several self-joins, thus decreasing the duration that locks are held,
> >>> reducing the impact on temporary storage and decreasing cpu
> >>> utilization, then it might be a good idea to break the rules. You
> >>> mentioned that switching from a poorly designed procedural solution
> >>> to one that is set-based dropped the time it took for the process to
> >>> run from 10 hours to 30 minutes. That's great! I merely suggest that
> >>> reintroducing a cursor to eliminate the several self-joins in your
> >>> solution might further decrease the time it would take for the
> >>> process to run--without any negative impact on concurrency or overall
> >>> performance.
>
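
(Another aside, for what it is worth: a minimal sketch of what
"reintroducing a cursor" could look like, again in SQL Server's dialect
and on a hypothetical table sensor_reading(ts, value) where NULL marks
the points to interpolate. It makes one ordered pass, carrying the last
known reading forward, with no self-joins.

  DECLARE @ts datetime, @val float;
  DECLARE @prev_ts datetime, @prev_val float;

  DECLARE rdr CURSOR FAST_FORWARD FOR
      SELECT ts, value FROM sensor_reading ORDER BY ts;

  OPEN rdr;
  FETCH NEXT FROM rdr INTO @ts, @val;
  WHILE @@FETCH_STATUS = 0
  BEGIN
      IF @val IS NOT NULL
      BEGIN
          -- known point: linearly fill the NULL rows since the previous one
          IF @prev_val IS NOT NULL
              UPDATE sensor_reading
              SET    value = @prev_val
                           + (@val - @prev_val)
                           * DATEDIFF(second, @prev_ts, ts)
                           / CAST(DATEDIFF(second, @prev_ts, @ts) AS float)
              WHERE  ts > @prev_ts AND ts < @ts AND value IS NULL;
          SELECT @prev_ts = @ts, @prev_val = @val;
      END;
      FETCH NEXT FROM rdr INTO @ts, @val;
  END;
  CLOSE rdr;
  DEALLOCATE rdr;

Whether this actually beats the set-based statement is exactly the kind
of question only the plan and a measurement can settle.)
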
> >> All of these comments should lead to one conclusion: one engages in
> >> this sort of analysis when one is intent on writing a better
> >> optimizer, or when coming up with a workaround in the case of a
> >> deficient optimizer. Not in the case of using a good optimizer to
> >> good effect.
>
> > David, please don't be too dismayed when I say: Lately, I have been
> > liking your answers better and better.
>
> I'd like to join that club. But as an old fan of serials and "The
> Mythical Man-Month", I can't help but think of silver bullets. My
> favourite serial was "Captain Midnight and the Silver Dart". Years
> after I watched that one, I saw the original set on television and was
> dismayed. What looked modern and sophisticated to a 'fifties kid was in
> fact silver-painted cardboard.
>
> Years after that I stumbled upon a basic truth which has nothing to do
> with theory but everything to do with the end-users of theory. When
> requirements are expressed in terms that the
> customer/user/client/interested party or so-called "stakeholder" can
> understand, all bets are off. With the right viewpoint, one can get the
> big cheese to toss in the garbage a requirement that all the long-term
> underlings will claim is essential and important. This even applies to
> issues of human safety, such as airline cargoes. Having stalled while
> travelling on more than a few UK trains after they were privatized (one
> locomotive almost self-destructed), I think that when it comes to
> concurrency I'd be quite happy to dispense with concurrent access by
> accounting bureaucrats in favour of the tensile/structural-integrity
> users. I don't see why a track-sensor interpolation shouldn't be
> isolated from the rest of the db while it is going on, for any train,
> let alone the TGV. The biggest thing that has bugged me about most
> consultants is that they usually don't question requirements.
>
> p

I have to admit I honestly cannot disagree with what you are saying. But we cannot ignore a context in which ignorance has the edge over truth. We are now facing a systemic problem that is making the entire field of database management regress.

I am trying to bring people to theory through tempting *case studies*. Even though I believe in the superiority of study through formalism over case studies, I also believe in the quick-and-dirty effectiveness of case-study learning: it creates doubt in a practitioner's mind and triggers the questions that may lead that practitioner out of utter ignorance and into some intermediary state of mind. Once the questioning starts, there is at least a chance that people will begin learning formal theory. If one throws formal theory at uneducated people in a system that promotes ignorance, the chance of their learning anything is very small.
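
To make that concrete, here is the kind of case study I mean, on this
thread's own problem: a minimal set-based sketch, assuming SQL Server's
dialect and the same hypothetical sensor_reading(ts, value) table as in
my aside above, where NULL marks the missing points. For each gap row g,
the nearest known readings before (p) and after (n) give

  value = p.value + (n.value - p.value) * (g.ts - p.ts) / (n.ts - p.ts)

which in set form is a single declarative statement:

  UPDATE g
  SET    value = p.value
               + (n.value - p.value)
               * DATEDIFF(second, p.ts, g.ts)
               / CAST(DATEDIFF(second, p.ts, n.ts) AS float)
  FROM   sensor_reading AS g
  JOIN   sensor_reading AS p
         ON p.ts = (SELECT MAX(x.ts) FROM sensor_reading AS x
                    WHERE x.ts < g.ts AND x.value IS NOT NULL)
  JOIN   sensor_reading AS n
         ON n.ts = (SELECT MIN(x.ts) FROM sensor_reading AS x
                    WHERE x.ts > g.ts AND x.value IS NOT NULL)
  WHERE  g.value IS NULL;

No loop, no fetch order to reason about; the optimizer chooses the
access paths. That contrast is what makes a practitioner start asking
questions.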

In other words, I have found the trend toward rejecting formalism so strong that it is important to empower practitioners through case examples and question triggering. After all, RM theory is about making practical issues bearable for practitioners. We should never lose sight of that.

PS: it is always gratifying to read your posts. I am strongly tempted to retreat to pure RA and mathematical formalism (so much simpler, so much clearer) and to restrict the audiences with which I discuss the RM, but I do not feel that is the right way to defend the chance of seeing a TRDBMS some day.

Regards...
