Re: Aggregates: Largest Groups

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Thu, 08 Apr 2010 12:10:26 -0300
Message-ID: <4bbdf1eb$0$12432$9a566e8b_at_news.aliant.net>


Sampo Syreeni wrote:

> On Mar 22, 11:42 pm, da..._at_fetter.org (David Fetter) wrote:
>
>

>>I'd like to know what all the groups of vehicles larger than X whose
>>with average speed over Y is.

>
>
> That's a nasty algorithmic design problem. Doing a greedy, streaming
> heuristic is easy enough by updating a flexible LIFO queue, its length
> and a running sum over it, extending and shrinking it as needed. But
> formally getting a list of all of the contiguous subranges becomes
> difficult, especially if the sequence values are not uniformly
> bounded. My guess is that the simplest algorithm in this case would be
> some sort of backwards-forwards one, revolving around adding/removing
> samples from consideration based on their position in the overall
> magnitude order.
>
> This isn't really a database question. But why isn't it? I think
> mainly because data sublanguages like SQL aren't fit to handle this
> sort of thing. On the one hand that is by design: they've laways
> catered to typical categorical business processing needs at the
> expense of arbitrary (e.g. numerical) computations. That's where your
> typical relational optimizer gets the organization and logical
> structure which makes it so easy to use and efficient. But on the
> other hand, there are lots of problems with order, time series,
> aggregation, topology based computation, open ended recursion and
> processing of gridded ("scientific") data which can be important, have
> in fact been addressed e.g. in Fortran compilers, and should perhaps
> be incorporated into data sublanguages as well. That extends to
> business needs as well -- anybody who's ever had to deal with
> financial time series data can attest to that.
>
> From that viewpoint, it really is a good database theory question why
> data manipulation languages are so poor, unexpressive and inefficient
> at this sort of thing.

Do you have any specific criticism of Tutorial D with respect to the above? Received on Thu Apr 08 2010 - 17:10:26 CEST

Original text of this message