matrix encoding IS adjacency list

From: Vadim Tropashko <vadimtro_invalid_at_yahoo.com>
Date: 15 Sep 2005 10:49:23 -0700
Message-ID: <1126806563.166058.48220_at_o13g2000cwo.googlegroups.com>

As it has been written elsewhere, Nested Intervals gradually evolved into matrix encoding. To summarize, each node of a tree is encoded with 4 integers, which translates into the following schema design:

table MatrixTreeNodes (

   a11 integer,
   a12 integer,
   a21 integer,

a22 integer
);

As the title of the message says, there is adjacency list relation hidden there! Let see, the n-th child
[[a11_child,a12_child][a21_child,a22_child]] of the node [[a11_parent,a12_parent][a21_parent,a22_parent]] is calculated by the formulas

a11_child = a11_parent * n + a12_parent
a12_child = a11_parent
a21_child = a21_parent * n + a22_parent
a22_child = a21_parent

Therefore we could enhance the schema design with constraints

table MatrixTreeNodes (

   a11 integer,
   a12 integer,
   a21 integer,

a22 integer
);

alter table MatrixTreeNodes
ADD CONSTRAINT fk_adjacency FOREIGN KEY (a12,a22)

REFERENCES jobs (a11,a21) ON DELETE ...; ADD CONSTRAINT fk_determinant CHECK ( abs(a11*a22-a21*a12)=1 );

Queries:
^^^^^^^

In principle, ancestors chain could be queried symmetrically to the descendants. For performance reasons we take a totally different approach and split the problem into two parts: computing all the node encodings in the chain first, and extracting all the nodes by the keys from the database, second.

For a child node [[a11_child,a12_child][a21_child,a22_child]] we calculate parent encoding as

a11_parent = mod(a11_child,a12_child)
a11_parent = a12_child
a21_parent = mod(a21_child,a22_child)
a21_parent = a22_child

and continue navigating the chain of nodes till the root (when a22=0).

For example, given the node [[49, 9], [38, 7]], we find its parent [[9,
4], [7, 3]], and grandfather [[4, 1], [3, 1]] and grandgrandfather [[1,
1], [1, 0]], which happens to be the root. We collect all those nodes

into a temporary table Ancestors and query the table as

select n.*
from MatrixTreeNodes n, Ancestors a
where n.a11=a.a11 and n.a12=a.a12 and n.a21=a.a21 and n.a22=a.a22

2. Finding descendants is done via Nested Intervals. There are about 5 pages of detailed explanations how to do that in the book I'm writing, but here is a cookbook recipy:

i) Enhance the schema with function based indexes:

CREATE INDEX idx_left ON MatrixTreeNodes(

case when a11*a22 < a12*a21 then

a11/a21
else

(a11+a12)/(a21+a22)
end
);

CREATE INDEX idx_right ON MatrixTreeNodes(

case when a11*a22 > a12*a21 then

a11/a21
else

(a11+a12)/(a21+a22)
end
);

ii) Query descendants with

select descendant.*
from (

select a11 b11, a12 b12, a21 b21, a22 b22 from MatrixTreeNodes
) /*descendant B*/, (

select a11, a12, a21, a22,

          a11*a22 - a12*a21 det,
          case when a11*a22 < a12*a21 then
               a11/a21
          else
               (a11+a12)/(a21+a22)
          end leftEnd,
          case when a11*a22 > a12*a21 then
               a11/a21
          else
               (a11+a12)/(a21+a22)
          end rightEnd

from MatrixTreeNodes
) /*ancestor A*/

where a11*b11*det > a12*b21*det    -- X11 > 0
and   a22*b12*det > a12*b22*det    -- X12 > 0
and   a11*b21*det > a21*b11*det    -- X21 > 0
and   a11*b22*det >= a21*b12*det    -- X22 >= 0

and leftEnd-0.0000001 < a11/a21
and leftEnd-0.0000001 < (a11+a12)/(a21+a22) and a11/a21 < rightEnd+0.0000001
and (a11+a12)/(a21+a22) < rightEnd+0.0000001 and node.name = ... -- predicate uniquely identifying a node

The constant 0.0000001 is designed to compensate for floating point arithmetic rounding errors. It essentially is a minimal supported mantissa. Please refer to your favorite database SQL manual in order to find out the exact value. This way index range scan would capture all the nodes in the interval and, possibly, some more . Then, the (small) list of nodes is filtered with the exact condition. Received on Thu Sep 15 2005 - 19:49:23 CEST

This message: [ Message body ]
Next message: Mikito Harakiri: "Re: C# 3.0 with pseudorelational extensions"
Previous message: DBMS_Plumber: "Re: C# 3.0 with pseudorelational extensions"
Next in thread: vc: "Re: matrix encoding IS adjacency list"
Reply: Vadim Tropashko: "Re: matrix encoding IS adjacency list"
Reply: vc: "Re: matrix encoding IS adjacency list"
Reply: Vadim Tropashko: "Re: matrix encoding IS adjacency list"
Maybe reply: David Cressey: "Re: matrix encoding IS adjacency list"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message