Re: Can relational alegbra perform bulk operations?

From: Banana <Banana_at_Republic.com>
Date: Tue, 29 Sep 2009 15:27:05 -0700
Message-ID: <4AC289B9.9070704_at_Republic.com>


Tegiri Nenashi wrote:
> if you are saying that only those transformations that lead to "shorter" relational
> expressions make sense from optimization perspective, than this is not
> true. E.g. "magic" transformation.

Yes, that's what I was trying to understand and your statement correlates with Clifford's statement that no 'magic' transformation actually exists.

> Are you familiar with cost based optimization? They basically throw in
> multiple equivalent expression and estimate cost of running each
> apriori.

Yes, but I suspect I had a mistaken notion of what it actually did; I thought they basically selected an expression that was known to be cheaper than other equivalent expression but as there is no magic transformation, it's probably basically a guessimate (though I'm quite sure it's more complicated/sophisticated than that).

>> However, could I ask you to explain the statement that tuples aren't
>> referred at all in relational algebra?

>
> Simple. Take projection for example. It contains the pi symbol
> together with attributes names. Ditto selection and the others.
>

Just to validate that I've understood your point which I'm not sure I did. Projection need not refer to the tuples because we can just present an expression of attribute and the pi symbol. (But wouldn't one still have to look at the original value to arrive at the resulting value + pi?)

> That is called nested loops join. It is implementation; algebraically
> (natural) join is idempotent, symmetric, and associative operation.

Thank you. I've actually heard of the term (both without an index and with an index), but mistakenly assumed that it was a DBMS-specific thing as I don't recall hearing it outside that context.

> In this respect Bob's answer was spot on: while nested loops is quite
> inefficient operator, the indexed nested loops isn't.

Agreed, that's basically what I've had understood anyway. My apologies for not having made it clear in my earlier description of the relvars. Even so with benefit of an index, one would still have to iterate over each tuple to evaluate the expression. In other words, the theory does not provide a mean of resolving a set of tuples in a single operation without resorting to the nested loops, and thus various vendors has had to invent different ways not directly specified or implied by the theory to help optimize on the process (e.g. clustering to enable a range seek upon the index for instance). Is that accurate?

And so you know, I do appreciate you taking your time to explain those things to the uninitiated. Received on Wed Sep 30 2009 - 00:27:05 CEST

Original text of this message