Re: Nested Sets vs. Nested Intervals

From: <amado.alves_at_netcabo.pt>
Date: 10 Nov 2005 09:59:50 -0800
Message-ID: <1131645590.464316.97210_at_g14g2000cwa.googlegroups.com>


I'll not give an algebra for RDF here since RAL is readily available and I have my own algebra to worry about. So I'll give you a sketch of the latter instead.

Mneson is a graph-based metamodel of data, based on a graph with the following properties:

  1. Connections are binary, directed, and untyped i.e. unlabelled. Note that being untyped entails not having an explicit identifier. A connection is identified solely by the (ordered) pair of vertices it represents. Hence, there are no duplicate connections. That is, for any pair of vertices (x, y) there can be at most one connection from x to
  2. For a connection from x to y we say that x is a "source" of y and y is a "target" of x.
  3. A vertex is either a datum or a pivot.

(a) A data vertex or datum represents an atomic value of data e.g. an
integer, a real number, a symbol. The value of a data vertex is the identity of the vertex.

(b) A pivot has no intrinsic value but it has nevertheless a unique
identity. In practice a serial number is used to identify pivot vertices.

3. There are no island vertices. There may be, however, island subgraphs.

A database instance is expressed in the base structure by means of special subgraph topologies that represent complex data entities like attributes, lists, sets, etc. [1] A possible definition is as follows:

Attribute. An attribute of name x1 and value x2 is represented by an attribute instance x3 such that x1 connects to x3 and x3 connects to x2. The attribute is connected to its subject x0 by connecting x0 to x3. For example suppose that x5 represents an employee; the fact that x5 has Social Security Number, or SSN (the attribute name) 123456789
(the attribute value) is expressed by connecting: x5 to x6, "SSN" to
x6, x6 to 123456789.

Set. For a vertex x that represents a set, the targets of x represent the members of the set. In parlance we say x is the 'head' of the set. Note this is a different object than the algebraic type set of vertices below.

Etc.

The topologies may overlap, e.g. a vertex x that is an attribute type may also be the member of a set, or even a set itself in which case the respective attribute instance is a member of x.

An algebra (called the Mneson Calculus) is defined on the base graph as follows. The algebraic type is the set of vertices. The operations include the set operations (intersection, union, etc.) and the (unary) adjacency operators T, S for Targets, Sources, with the obvious semantics:

T (X) = {y : y is a target of x, x in X} S (X) = {y : y is a source of x, x in X}

The Mneson Calculus serves as a query language. For example, the following query retrieves the employee with SSN 123456789:

T ({Employees}) ^ S [T ({"SSN"}) ^ S({123456789})]

where ^ means set intersection.

Singleton arguments are very common so we take the liberty of ommiting the {}, for clarity; and we want to focus on algebraic issues so often we also ommit the "" around literals. With these simplifications the expression above becomes

T (Employees) ^ S [T (SSN) ^ S (123456789)]

This example of course assumes a schema where vertex Employees is the head of the set of employees.



[1] These entities are also called "semantic" structures because they are designed to model semantic domains. Note how the semantic structures are in turn modelled in terms of the base untyped graph. So formally we have a chain of domains here and we say "semantic domain" do describe the former level. Sometimes I say "semantic domain" even when it is not necessary to distinguish, for habit I guess, I am sorry
if I sound pedantic, please just ignore the word on those occasions. Received on Thu Nov 10 2005 - 18:59:50 CET

Original text of this message