Functions and Relations

From: Aloha Kakuikanu <aloha.kakuikanu_at_yahoo.com>
Date: 17 Nov 2006 12:19:52 -0800
Message-ID: <1163794792.364269.246040_at_m7g2000cwm.googlegroups.com>



> NENASHI, Tegiri wrote:
> How one presents 'sqrt' as a relation in the database ? Please show.

This looks like a good opportunity to post a reworked snippet from my exchange with Marshall.

Function u=f(x,y,z) in general is a predicate Pf(x,y,z,u) with some additional constraint. Given that a predicate is often an infinite relation we have obvious difficulty implementing it. Let's discuss this later, but focus on the benefits of representing functions as relations. We have

  1. Function invocation is a join:

f(x) /\ x=1

2. Function composition is a join:

g(f(x)) <==> g(z,y) /\ f(y,x)

Suppose we have the relation R(x,y) where y=x^2. No function yet, just a relation.

As it has two columns the simplest joins we can think of are:

R /\ `y=9`

R /\ `x=3`

where `y=9` and `x=3` are constant unary relations.

In the first case, we probably want to project the resulting relation to `x`

(R /\ `y=9`) \/ 'x'

in the second case to `y`

(R /\ `x=3`) \/ 'y'

Informally, in the first case we want to know the result of sqrt(9), in the first case just 3^2

How the join is evaluated in both cases? Well, let's go into the second case, because it's easier. As one relation is infinite, the only feasible join method is nested loops. Moreover, we have to start from the relation which is finite, that is `x=3`. Now, there are the two possibilities:
1. Scan the whole R relation and find all the matching tuples. Not feasible too!
2. Find matching tuples by a some kind of index:

create index unique_x on R(x);

(pseudo SQL syntax). This index is not a conventional b-tree of course, as all what is needed to do when x is known, and y is not is just to calculate y by a simple formula x^2. The corollary here is that the function x->x2 is essentially an index.

The other case is only marginally more complex. Likewise, we quickly arrive to the conclusion that the only feasible execution is the indexed nested loops with the scan of the `y=9` as the leading relation. Then, the required index is the function y->sqrt(y).

In a word, functions to the predicates are what indexes to the tables in traditional RDBMS are. In RDBMS world there are index organized tables. In the functions analogy we have "function organized" predicates.

Indexes are not something that is supposed to be exposed to the end user, at least in theory. Indexes should be created/destroyed automatically by RDBMS engine. In todays imperfect world this is done by DBAs.

This little snippet is just an unconventional perspective to Meyer's "design-by-contract" idea. The sqrt() function interface is defined by the assertion y=x^2, and the programmer's job is to write the sqrt() implementation. We see that functions being implementational detail has much in common with indexes being implementation details too.

Here is little more complex example. For x^2+y^2=1, there are 3 access methods:
1. Given x, and y return {(x,y)} if they satisfy the equation, and {} otherwise.
2. Given x return {(x,sqrt(1-x^2)),(x,-sqrt(1-x^2))} or empty set. 3. Given y return {(sqrt(1-y^2),y),(-sqrt(1-x^2),y)} or empty set. Once again the way to represent infinite predicates in future relational systems is via access path specification. Received on Fri Nov 17 2006 - 21:19:52 CET

Original text of this message