Re: A Topological Relational Algebra in Lisp
Date: Thu, 22 Jan 2015 21:20:09 +0100
Message-ID: <m9rm0r$144$1_at_dont-email.me>
Tegiri Nenashi wrote:
> OK, now it is clear: the objects you are focusing on are pairs of
> multivariate relation X in the first component, and binary relation R in the
> second.
>
> Now you can define operations in Norbert algebra "component-wise". For
> example you can define "multiplication" as natural join in the first
> component and composition in the second. This would be not entirely without
> merit, because multiplication would be commutative and associative operation.
> However, natural join in [multivariate] relational algebra is idempotent,
> while composition in algebra of binary relations is not. Therefore, I'm not
> certain how well defined your algebra is. Can you please describe each and
> every of the following operations: natural join (as it might be something
> other than my guess), projection, and union?
OK, but first I will go into some mathematical principles:
Given two relational representations (X,R) and (Y,S) of topologiacal spaces (X,T(R)) and (Y,T(S)) then a function
f : X -> Y
is continuous iff for every pair (a,b) in R the image pair (f(a),f(b)) is an element of S^*, the transitive and reflexive closure of S, hence, iff
(a,b) in R => (f(a),f(b)) in S^*
holds.
For example, f may be a foreign key such that each tuple tx in X references a
tuple f(tx) in Y.
Note that here the smaller R ("finer") or the bigger S ("coarser")
the more likely is it that f is continuous.
Trivially, the f1,f2,f3 are always continuous with the empty relation {} subset X x X and the g1,g2,g3 are always continuous with the universal relation X x X. So there always exists a relation that makes all functions involved continuous.
So we are either looking for a maximal relation Rx for X that makes all fs continuous or we are looking for a minimal relation Rz that makes all gs continuous.
This can be done component-wise, for each function: Let R be the relation associated with Y. Hence op queries the spaces (Y1,R1), (Y2,R2), (Y3,R3) to get (X,Rx) or else (X,Rz).
The final case gives:
R1 = { (f1(a), f1(b)) | a R b } R2 = { (f2(a), f2(b)) | a R b } R3 = { (f3(a), f3(b)) | a R b }
Then a result relation is Rz = R1 u R u R3 (where "u" is set union). So this case is easy and can be efficiently computed.
The initial case gives:
R1 = { (a,b) in X x X | g1(a) R+ g1(b)) }
R2 = { (a,b) in X x X | g2(a) R+ g2(b)) }
R3 = { (a,b) in X x X | g3(a) R+ g3(b)) }
Then a result relation is Rz = R1 /\ R /\ R3 (where /\ is set intersection).
Here R+ is the transitive closure on R. So this case is more expensive and
one should take more care in efficiency. Whereas transitive closure always
gives a correct relation one may reduce the result relation to get
Rx' = OPEN(Rz) by removing tupels thet do not affect the topology.
For example
Ri = { (a,b) in X x X | a != b, gi(a) R+ gi(b)) } is smaller but topologically equivalent to
Ri = { (a,b) in X x X | gi(a) R+ gi(b)) }.
Every standard query operator from Relational Algebra falls into one of these two categories. The renaming query which just renames attributes and leaves the data unaffected even falls in both categories.
natural join:
assume two relations R(a,b,c) and S(c,d,e). I will not specify the key to simplify the presentation and, hence, take (a,b,c) and (c,d,e) , which are superkeys, as keys.
Then RS(a,b,c,d,e) is the schema of the natural join Each tuple (a,b,c,d,e) in RS has a projection
pR(a,b,c,d,e) := (a,b,c) in R
and
pS(a,b,c,d,e) := (c,d,e) in S
So we have two functions
pR : RS -> R and pS : RS -> S
that map each result tuple back to its "origin" within the input relations. As the result RS is at the domain side of each of these functions we have an /initial/ query. So when the relations R ans S have binary relations T(ia,ib,ic, da,db,dc) and Q(ic,id,ie, dc,dd,de) associated such that (R,T) and (S.Q) represent topological spaces, then a straight-forward result relation TQ for the topology is
TQ(ia,ib,ic,id,ie,da,db,dc,dd,de)
- { (ia,ib,ic,id,ie,da,db,dc,dd,de) | (ia,ib,ic) T^+ (da,db,dc) andalso (ic,id,ie) Q^+ (dc,dd,de) }.
Then (RS,TQ) represents the fiber product space of the topological spaces represented by (R,T) and (S,Q). Of course the relatton TYQ may be "trimmed" by projecting only onto the key attributes of TQ and by removing "spurious" tuples (ix,dz) in TQ if the tuples (ix,y) and (y,dz) exist in TQ. (Codd called this operator "OPEN").
projection:
Projection is simple, as it is a final operator.
Take a relation X(a,b,c) and project on a,c, say p[a,c](x,y,z) = (x,z).
Again, I take all attributes as (super)-keys.
Then Y = p[a,c](X) is the query result table.
So p[x,y] is the function that takes a query input tuple (x,y,z) to the
query result (x,z).
When R represents a topolgoy for X, it defines which pairs
(t1,t2) in X x X are "topologically" close.
The query result query merely the image of R under p[a,c]:
If R(ia,ib,ic,da,db,dc) encodes the binary relation (ia,ib,uc) R (da,db,dc) then the prokction pi[ia,ic,da,dc](R) gives the query result topolgoy for the space (X,R).
union:
union is also a final operator.
Let (X,R) and (Y,S) be representations of topological spaces and XuY be the union of X and Y. Then we have the two inclusion functions
ix : X -> XuY , iy : Y -> XuY ,
which are simply defined as ix(s) = s abd iy(t) = t. The result topology is represented by RuS. Hence
(X,R) u (Y,S) = (XuR, RuS) .
If R and S are not union compatible they can easily made so.
Note that intersection is a special case of the natural join:
(X,R) /\ (Y,S) = (X/\R, R^+ /\ S^+) .
Here, in general, transitive closure is ncessary.
I hope this gives an idea if the operations and their topological
ackground.
> This gives me a flavor how you define selection. However, selection can be
> considered as a special case of natural join. Here I see that my earlier
> guess about natural join defined as composition in the second component is
> simply wrong.
Selection is an initial operator:
Les s[p] be the selection operator with predicate p.
s[p](Y) on a relation Y gives a subset X = s[p](Y)
If R is a relation for Y and (Y,R) represents a topological space, then the result topology T(R)|_{s[p](Y) (subspace topology) is represented by
sR = { (a,b) in R+ /\ X^2 } .
This is the restriction of the transitive closure on R to the selected tuples. The selection operator s[p] (with predicate p) applied to a space gives
s[p](Y,R) = (s[p](Y), sR) .
Here again, in general, transitive closure must be possibe to always give corerct query results. Otherwise the spaces involved must have a constant upper dimension limit (say, 42, which means that queries on 43-dimensional spaces may give wrong results).
> "group by" generalizes projection. Does Quotient Space generalize Final
> Topology?
I'd say it the other way: Final Topology generalizes all final operators: projection, quotient, union.
> Since you offered definition of topological relation as ordered pair of
> multivarite relation together with binary relation, you can specify
> operations in your algebra without referring to topology. Again, I'm
> interested to see formal definitions of natural join, projection, and union.
I hope you have seen them above ;).
> Again, can you please take the (Rooms,RoomNeighbourhood) pair from your
> example, then introduce one more pair and, next, work out the natural join of
> these two pairs (is it something meaningful)?
The natural join of two topological spaces to me is extremely interseting.
Now a third table may define aggregations I (the "index") among details D:
with relations
RI = iid bdid
Now the example space S may reference these detail aggregates
(via a foreign key from S to I):
object detail
That mapping is continuous.
I = id comment
--------------------------
int an interior space
idoor an interior door
xdoor an entry door
ext the exterior
int idoor
int xdoor
ext xdoor
room1 -> int
door -> xdoor
outside -> ext .
object detail
room1 -> int door -> idoor outside -> ext
is not continuous, because the relation RD that represents the topology of D contains the pair (outside, door). The foreign key then maps this pair to (ext, idoor) which is not in the transitive and reflexive closure of RA. Hence "continuity of foreign keys" could be a helpful instrument for CAD systems. Id does not allow to put an interor at the common boundary between interior and exterior.
Now when the details as a space (D,RD) map to their aggregates the following way (again, say, via a foreign key to I):
id detail
interior int exterior ext surface| idoor surface|| xdoor idoormat xdoor xdoormat xdoor
then the natural join (S,RS) >< (D,RD) = (SD, RSD) gives:
------------->|<====================>||<-------------------- (ascii-"art") (room1) (door) (door) (door) (outside) (int) (xdoor) (xdoor) (xdoor) (ext)(interior) (xdsurf|) (xdoormat) (surface||) (exterior)
Whis is denoted with vertically stacked tuples. In tables this looks loke:
SD S.id I.id D.id ----------------------------- room1 int interior door xdoor xdsurf| door xdoor xdoormat door xdoor surface|| outside ext exterior
The topology is then generated by
RSD S.iid I.iid D.iid S.bdid I.bdid D.bdid ----------------------------------------------------- room1 int interior door xdoor xdsurf| door xdoor xdoormat door xdoor xdsurf| door xdoor xdoormat door xdoor surface|| outside ext exterior door xdoor surface||
For example the tuple (door, xdoor, xdsurf|) is in the topological closure of the set {(room1, int, interior}).
The applicaton of the details (D,RD) to the sketch (S,RS) gives a detailled version of (S,RD). So natural join could be an interesting operator to organize levels of detail (LoD) in CAD system where the same detail is repeatedly used. This may be extended to a relational generalization of CAD-macro (or BLOCK) objects.
For the topologistst: the above natural join (SD,RSD) = (S,RS)><(D,RD) its the following pullback (S,RS) x_I (D,RD)
(SD,RSD) --------p_D-------> (D, RD)
| | | | | | p_S FK_D | | | | V V
(S, RS) -------FK_S-------> (I, RI)
with projections p_D and p_S and continuous foreign keys FK_D and FK_S
This "details library" concept still has some open questions because in some situations details can get lumped together too much. This problem has yet to be solved, but I consider it only a not too difficult database design problem for an extremely interesting practical application.
Hope that clarifies.
Norbert Received on Thu Jan 22 2015 - 21:20:09 CET