matrix encoding IS adjacency list
Date: 15 Sep 2005 10:49:23 -0700
Message-ID: <1126806563.166058.48220_at_o13g2000cwo.googlegroups.com>
As it has been written elsewhere, Nested Intervals gradually evolved into matrix encoding. To summarize, each node of a tree is encoded with 4 integers, which translates into the following schema design:
table MatrixTreeNodes (
a11 integer, a12 integer, a21 integer,
a22 integer
);
As the title of the message says, there is adjacency list relation
hidden there! Let see, the n-th child
[[a11_child,a12_child][a21_child,a22_child]] of the node
[[a11_parent,a12_parent][a21_parent,a22_parent]] is calculated by the
formulas
a11_child = a11_parent * n + a12_parent a12_child = a11_parent a21_child = a21_parent * n + a22_parent a22_child = a21_parent
Therefore we could enhance the schema design with constraints
table MatrixTreeNodes (
a11 integer, a12 integer, a21 integer,
a22 integer
);
alter table MatrixTreeNodes
ADD CONSTRAINT fk_adjacency FOREIGN KEY (a12,a22)
REFERENCES jobs (a11,a21) ON DELETE ...; ADD CONSTRAINT fk_determinant CHECK ( abs(a11*a22-a21*a12)=1 );
Queries: ^^^^^^^
- In principle, ancestors chain could be queried symmetrically to the descendants. For performance reasons we take a totally different approach and split the problem into two parts: computing all the node encodings in the chain first, and extracting all the nodes by the keys from the database, second.
For a child node [[a11_child,a12_child][a21_child,a22_child]] we calculate parent encoding as
a11_parent = mod(a11_child,a12_child) a11_parent = a12_child a21_parent = mod(a21_child,a22_child) a21_parent = a22_child
and continue navigating the chain of nodes till the root (when a22=0).
For example, given the node [[49, 9], [38, 7]], we find its parent [[9, 4], [7, 3]], and grandfather [[4, 1], [3, 1]] and grandgrandfather [[1, 1], [1, 0]], which happens to be the root. We collect all those nodesinto a temporary table Ancestors and query the table as
select n.*
from MatrixTreeNodes n, Ancestors a
where n.a11=a.a11 and n.a12=a.a12 and n.a21=a.a21 and n.a22=a.a22
2. Finding descendants is done via Nested Intervals. There are about 5 pages of detailed explanations how to do that in the book I'm writing, but here is a cookbook recipy:
i) Enhance the schema with function based indexes:
CREATE INDEX idx_left ON MatrixTreeNodes(
case when a11*a22 < a12*a21 then
a11/a21
else
(a11+a12)/(a21+a22)
end
);
CREATE INDEX idx_right ON MatrixTreeNodes(
case when a11*a22 > a12*a21 then
a11/a21
else
(a11+a12)/(a21+a22)
end
);
ii) Query descendants with
select descendant.*
from (
select a11 b11, a12 b12, a21 b21, a22 b22
from MatrixTreeNodes
) /*descendant B*/, (
select a11, a12, a21, a22,
a11*a22 - a12*a21 det, case when a11*a22 < a12*a21 then a11/a21 else (a11+a12)/(a21+a22) end leftEnd, case when a11*a22 > a12*a21 then a11/a21 else (a11+a12)/(a21+a22) end rightEnd
from MatrixTreeNodes
) /*ancestor A*/
where a11*b11*det > a12*b21*det -- X11 > 0 and a22*b12*det > a12*b22*det -- X12 > 0 and a11*b21*det > a21*b11*det -- X21 > 0 and a11*b22*det >= a21*b12*det -- X22 >= 0and leftEnd-0.0000001 < a11/a21
and leftEnd-0.0000001 < (a11+a12)/(a21+a22) and a11/a21 < rightEnd+0.0000001
and (a11+a12)/(a21+a22) < rightEnd+0.0000001 and node.name = ... -- predicate uniquely identifying a node
The constant 0.0000001 is designed to compensate for floating point arithmetic rounding errors. It essentially is a minimal supported mantissa. Please refer to your favorite database SQL manual in order to find out the exact value. This way index range scan would capture all the nodes in the interval and, possibly, some more . Then, the (small) list of nodes is filtered with the exact condition. Received on Thu Sep 15 2005 - 19:49:23 CEST