Re: Can relational alegbra perform bulk operations?

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Wed, 30 Sep 2009 10:08:02 -0300
Message-ID: <4ac35836$0$23748$9a566e8b_at_news.aliant.net>


Banana wrote:

> compdb_at_hotmail.com wrote:
>

>> Of course optimizers use algebraic identities to rewrite.
>> They just also use other information like size and indices
>> to create a tree with some optimized property.
>> The biggest time expense tends to be
>> secondary storage page access.
>>
>> It might seem such things have to be implemented as loops,
>> but it all depends on your implementing machine.
>> Maybe it is a parallel processor.
>> If you want the longest stick of spaghetti you just hold
>> them all upright on the counter.

>
> Yes, I can easily see how one can select the longest stick of spaghetti
> and to contrive the analogy a bit, we can slide it through a plate set
> at a certain height and thus filter out the taller spaghetti, then one
> more plate at a lower height and discard those that passed the 2nd
> filter leaving us with spaghetti between x and y. In other words, range
> seek. Going back to the relvar, it only applies to one relvar (not to
> say one can't use the same filters for two different relvar under
> certain condition as Bob provided in his example of customer and state
> relations being filtered by 'AK' first prior to being joined together.
>
> What I was really interested in learning whether join (or any operations
> involving more than one relation) could be processed in similar manner
> without necessitating loops across a set of tuples within a relation to
> determine what tuples from other relation satisfy the condition. Based
> on what I'm picking up (and I hope I'm following along) such thing is
> unavoidable.

The thing to keep in mind is, in the end, one has to implement the dbms on physical hardware. With current computational models and storage devices, iteration is unavoidable because everything is based on some linear address space or another. That doesn't mean we won't ever have devices capable of set level operation. It just means we don't today.

I am reminded of a remark in the Dragon Book (Aho et al.) about optimization. After giving methods for a variety of different compiler optimizations, the authors remarked that often algorithmic replacement achieves far more improvement than all of the available optimizations combined. e.g. replacing a bubble sort with a quicksort. They go on to remark that algorithmic replacement remains an unachievable goal.

Using a high-level, declarative, algebraic specification like the RM changes the task from algorithmic replacement to algorithm discovery or generation, which is a much easier task. Received on Wed Sep 30 2009 - 15:08:02 CEST

Original text of this message