Re: A new proof of the superiority of set oriented approaches: numerical/time series linear interpolation

From: Cimode <cimode_at_hotmail.com>
Date: 29 Apr 2007 22:58:09 -0700
Message-ID: <1177912689.747957.183590_at_l77g2000hsb.googlegroups.com>


On 30 Apr, 05:16, paul c <toledobythe..._at_oohay.ac> wrote:
> Bob Badour wrote:
> > David Cressey wrote:
>
> >> "Brian Selzer" <b..._at_selzer-software.com> wrote in message
> >>news:jy5Zh.4247$uJ6.3542_at_newssvr17.news.prodigy.net...
>
> >>> "Cimode" <cim..._at_hotmail.com> wrote in message
> >>>news:1177863186.258396.136950_at_y5g2000hsa.googlegroups.com...
>
> >>>> On 29 Apr, 16:21, "Brian Selzer" <b..._at_selzer-software.com> wrote:
>
> >>>>> I would have done it differently. I would have used a cursor to
> >>>>> generate
>
> >>>> [Snipped]
> >>>> I respectfully think you miss the point of the entire thread.
> >>>> Thinking procedurally (use of cursors) is inherently inferior to set
> >>>> oriented approaches independently of a specific implementation and
> >>>> regardless of how one defines performance as response time, ability to
> >>>> run well in concurrent query environment and so forth.
>
> >>> You're right. As a rule, set-based thinking is superior to procedural
> >>> thinking. But there are times when it becomes necessary to break the
> >>> rules.
>
> >>> It is critical to think procedurally in order to understand the
> >>> execution plans that an optimizer produces.
>
> >> This is correct, but it largely misses the point. One thinks about
> >> how the optimizer does its job in the (hopefully rare) cases where
> >> the optimizer's output is materially suboptimal.
>
> >> My own experience some 15 years ago with DEC Rdb is a case in point.
> >> The DEC optimizer was so good that, in most cases, it was simply not
> >> necessary to think about how it worked. That allowed the query
> >> designer (and for that matter the database designer) to focus on the
> >> other aspects of quality without sacrificing much in the way of speed.
>
> >>> Consider a merge join: both inputs are sorted, and then a single
> >>> pass is made through each result--similar to an old COBOL program
> >>> that updates a master file from a transaction file. If you don't
> >>> understand what each step in an execution plan does and how it
> >>> works, then how can you possibly optimize a query?
>
> >>> When it becomes clear that an iterative approach could eliminate
> >>> several self-joins, thus decreasing the duration that locks are
> >>> held, reducing the impact on temporary storage and decreasing cpu
> >>> utilization, then it might be a good idea to break the rules. You
> >>> mentioned that switching from a poorly designed procedural solution
> >>> to one that is set-based dropped the time it took for the process
> >>> to run from 10 hours to 30 minutes. That's great! I merely suggest
> >>> that reintroducing a cursor to eliminate the several self-joins in
> >>> your solution might further decrease the time it would take for the
> >>> process to run--without any negative impact on concurrency or
> >>> overall performance.
>
> >> All of these comments should lead to one conclusion: one engages in
> >> this sort of analysis when one is intent on writing a better
> >> optimizer, or when coming up with a workaround in the case of a
> >> deficient optimizer. Not in the case of using a good optimizer to
> >> good effect.
>
> > David, please don't be too dismayed when I say: Lately, I have been
> > liking your answers better and better.
>
> I'd like to join that club. But as an old fan of serials and "The
> Mythical Man-Month", I can't help but think of silver bullets. My
> favourite serial was "Captain Midnight and the Silver Dart". Years
> after I watched that one, I saw the original set on television and was
> dismayed. What looked modern and sophisticated to a 'fifties kid was in
> fact silver-painted cardboard.
>
> Years after that I stumbled upon a basic truth which has nothing to do
> with theory but everything to do with the end-users of theory. When
> requirements are expressed in terms that the
> customer/user/client/interested party or so-called "stakeholder" can
> understand, all bets are off. With the right viewpoint, one can get the
> big cheese to toss in the garbage a requirement that all the long-term
> underlings will claim is essential and important. This even applies to
> issues of human safety, such as airline cargoes. Having stalled while
> travelling on more than a few UK trains after they were privatized (one
> locomotive almost self-destructed), I think that when it comes to
> concurrency I'd be quite happy to dispense with concurrent access by
> accounting bureaucrats in favour of the tensile/structural-integrity
> users. I don't see why a track sensor interpolation shouldn't be
> isolated from the rest of the db while it is going on, for any train,
> let alone the TGV. The biggest thing that has bugged me about most
> consultants is that they usually don't question requirements.
>
> p

Agreed. I would also really appreciate your insight on the question I was trying to pinpoint through this example. Do you believe that linear numeric/datetime interpolation could represent some way of handling missing data? That's the secondary issue I was trying to get at, besides demonstrating how set oriented approaches are indeed superior to procedural ones.
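
To make the set-oriented side of the question concrete, here is a minimal sketch, assuming a hypothetical PostgreSQL-style table readings(ts, val) with ts as its key, where a NULL val marks a missing sample -- the schema, names and dialect are my own illustration, not something from the thread:

  -- Hypothetical schema: readings(ts TIMESTAMP PRIMARY KEY, val NUMERIC);
  -- a NULL val marks a gap.  Each gap is filled by straight-line
  -- interpolation between its nearest known neighbours:
  --   v = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
  SELECT r.ts,
         prv.val
         + (nxt.val - prv.val)
           * (EXTRACT(EPOCH FROM r.ts - prv.ts)
              / EXTRACT(EPOCH FROM nxt.ts - prv.ts)) AS interpolated_val
  FROM readings r
  JOIN readings prv                -- nearest earlier known reading
    ON prv.ts = (SELECT MAX(p.ts) FROM readings p
                 WHERE p.ts < r.ts AND p.val IS NOT NULL)
  JOIN readings nxt                -- nearest later known reading
    ON nxt.ts = (SELECT MIN(n.ts) FROM readings n
                 WHERE n.ts > r.ts AND n.val IS NOT NULL)
  WHERE r.val IS NULL;             -- gaps at either edge simply drop out

The two correlated self-joins here are exactly the kind of thing Brian suggests replacing with a cursor: a procedural rewrite would walk the table once in ts order and remember the last known value, while the set-oriented form simply states what an interpolated value is and leaves the how to the optimizer.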

Received on Mon Apr 30 2007 - 07:58:09 CEST