Re: Comments on Norbert's topological extension of relational algebra

From: Nicola <nvitacolonna_at_gmail.com>
Date: Sat, 19 Dec 2015 00:40:48 +0100
Message-ID: <n525hv$1h0l$1_at_adenine.netfront.net>


On 2015-12-18 19:25:34 +0000, Tegiri Nenashi said:

> I'm still wanting an example that might clarify topological extension
> of RA. If the proposal is about introducing topological datatype, then
> the extension is quite intuitive, and can explained to database person
> in terms of tuples and attributes.

Yes, I think that the extension is quite intuitive. The underlying idea is very simple: there is a well-known correspondence between a certain class of topological spaces (Alexandrov spaces), which includes the finite topological spaces, and the class of preorders (reflexive and transitive relations). There are effective procedures to build a preorder on a set X given a topology on X, and vice versa. So, instead of considering a set X with a topology T, i.e. a topological space (X,T), one may equivalently use X equipped with a preorder P induced by T. The latter has an obvious relational interpretation. Besides, instead of using P, we may in fact just use a relation R whose reflexive and transitive closure is P. The pair (X,R) is what Norbert calls a *topological datatype*.

The above would be an almost trivial observation: what is interesting (and far from trivial) is that the most common topological constructions, i.e., subspace, space product, space quotient, etc... may then be reduced to relational algebra operations (very elegantly, I'd say). The second interesting thing is that such operations may be performed in many cases directly on (X,R) rather than on (X,P), that is, without explicitly computing a reflexive and transitive closure (and it is never necessary to build (X,T)).

> The first question is what are the values of topological datatype. I
> assume, they are certainly not entire topologies.

Given the equivalence above, they certainly are. Or, to be more precise, they are a lossless representation of topological spaces (through what are called "specialization preorders").

> They are geometric objects with "qualitative" description, so that for
> example cube is not distinguishable from sausage. "Topological
> datatype" means that those objects are from the same topological space;
> in the same venue as values from ordinary datatypes are chosen.
>
> Consider the following relation FavoriteShapes(name, topological_object):
>
> name topological_object
> -------- ------------------
> Jan Dohnut
> Nicola Cube
> Norbert Sausage
> Actually, if one agrees how to choose a representative in every class
> of all topologically equivalent objects, then the FavoriteShapes
> relation has to be rewritten as
>
> name topological_object
> -------- ------------------
> Jan Dohnut
> Nicola Ball
> Norbert Ball

What you mean here is that Ball, Cube and Sausage are homeomorphic. But homeomorphism is a mapping between topological spaces, not between elements of topological spaces. So, for that to make sense in the topological database context, you should represent Ball, Cube and Sausage as topological spaces, or equivalently (as per the above) as topological datatypes - not as elements of a topological space (well, you might represent Ball, Cube and Sausage each with a singleton space, but that would be trivial). One possibility (and a quite natural one, once you get by with what on earth a CW-complex is) is to use CW-complexes.

Let me try a contrived example in 2D using a triangle and a square instead of a sausage and a cube. A triangle and a square may be represented as topological datatypes using CW-complexes where the 0-cells are vertices, the 1-cells are the edges, and 2-cells are the surfaces. Let TRIANGLE be the set of (0-, 1-, and 2-)cells of the triangle, that is, intuitively, the "parts" of the triangle of dimension 0, 1 and 2 respectively:

TRIANGLE
id dim



1 0
2 0
3 0
4 1
5 1
6 1
7 2

(The geometric coordinates of the points may be stored in a different relation, e.g.:

Coords
id x y



1 0 0
2 0 1
3 1 0

but they are not relevant for this example.)

Now, (a topological space for) a triangle t may be represented as follows:

   R_t
id1 id2



 4 1
 4 3
 5 1
 5 2
 6 2
 6 3
 7 4
 7 5
 7 6

which corresponds to this drawing:

2 \
| \
| \
5 6
| 7 \
| \
1---4---3

The pair (TRIANGLE, R_t) is a topological datatype. The corresponding topological space is built by taking the reflexive and transitive closure of R_t as the specialization preorder ≼, and taking all the upper sets wrt ≼ as the open sets (where the convention is that id2 ≼ id1). So:

Set X: {1,2,3,4,5,6,7}
Topology T on X: {∅, {1,2,3,4,5,6,7}, {1,4,5}, {3,4,6}, {4}, {5}, {6}, {7}, {1,3,4,5,6,7}, {4,5,6,7}, {5,6,7}, ... }

It is easy to see, for instance, that the edges are open sets, but the edges with their adjacent vertices are closed sets, as expected.

For a square we may do something similar:

b---g---c
| |
f i h
| |
a---e---d

SQUARE
id dim



a 0
b 0
c 0
d 0
e 1
f 1
g 1
h 1
i 2

   R_s
id1 id2


 e     a
 e     d
 f     a
 f     b
 g     b
 g     c
 h     c
 h     d
 i     e
 i     f
 i     g
 i     h

The pair (SQUARE, R_s) is another topological datatype. A Cube or a Sausage may be represented in a similar fashion (I believe that for the sausage you need a finite approximation).

Starting from TRIANGLE, R_t, SQUARE, R_s you may then build new topological datatypes (i.e., new topological spaces) that combine the triangle and the square in different ways, and to do so you may use RA operations as explained in the paper. Exercise for the reader (I am too tired to try this myself now): how to build a pair (X,R) that represents the following figure?

b---g---c2\
| | \
f i h5 6
| | 7 \
a---e---d1--4--3

I think that you should also be able to formulate a query to verify that (SQUARE, R_s) and (TRIANGLE, R_t) are homeomorphic (another exercise).

A final remark: in practice, it is not feasible to have a separate schema for each object of the real world that we want to model. More realistically, you might want something along these lines (the following schema is just tentative, take it with a grain of salt!):

WORLD
object attr1 ... attrN


sausage    ...       ...
cube       ...       ...
ball       ...       ...

...

 PART
id dim



1 0
2 0
3 0
4 0
5 0
...
n 1
n+1 1
...
m 2
m+1 2
...

    SHAPES
object part_id



cube 1
cube 2
...
cube n
...
sausage ...

...

   R1
id1 id2 object



 n 1 cube
 n 2 cube
 ...
 ...    sausage
 ...    sausage
 ...

   R2
id1 id2



...

The PART relation (which Norbert has called CELLS elsewhere) would contain the data about 0-cells, 1-cells, and so on (these are both the "parts" real objects are made of and the points of a topological space). The specialization preorders (i.e., the topologies) would be represented by R1, R2, ..., possibly storing more than one topological space in each relation, with an additional attribute to distinguish among the different spaces (as I have done for R1, where each object belongs to a different space).

Your initial relation would correspond to WORLD, and it would *not* be part of a topological datatype, but it would provide the description a "high-level", or complex, object (a closed set, for instance). Topological datatypes in this example are (subsets of) pairs (PART, R1) and (PART, R2).

Of course, there are ways to build real objects other than with CW-complexes, but the notion of a topological datatype does not depend on your interpretation of what a point in a topological space is, so the approach is completely general. (Please Norbert correct anything wrong I may have said.)

Nicola

Received on Sat Dec 19 2015 - 00:40:48 CET

Original text of this message