Re: Other tree representations in SQL
Date: 16 Dec 2002 10:36:29 -0800
Message-ID: <909c94d1.0212161036.360bb2ff_at_posting.google.com>
71062.1056_at_compuserve.com (--CELKO--) wrote in message news:<c0d87ec0.0212031527.7ca335bf_at_posting.google.com>...
> Vadim Tropashko suggested that we could represent a tree structure in
> SQL by storing a traversal sequence number and the level of each node
> to represent a hierarchy. This would be an alternative to my nested
> sets model
>
> Tree Traversals are usually associated with binary trees in the
> literature because the three types of traversals can be written in
> very neat, clean recursive procedures. But they can be generalized to
> any tree by moving from the leftmost to the rightmost subtree. There
> are three ways to traverse a tree -- preorder, inorder and postorder.
>
> A
> / \
> B C
> / \ / \
> D E F G
>
> (1) Inorder traversal
> Step 1: Traverse the left subtree of the root if it exists.
> Step 2: Visit the root.
> Step 3: Traverse the right subtree of the root if it exists.
>
> That would give you (D, B, E, A, F, C, G) as the sequence in the
> sample tree.
>
> (2) Preorder traversal
> Step 1: Visit the root.
> Step 2: Traverse the left subtree of the root if it exists.
> Step 3: Traverse the right subtree of the root if it exists.
>
> That would give you (A, B, D, E, C, F, G) as the sequence in the
> sample tree.
>
> (3) Postorder traversal
> Step 1: Traverse the left subtree of the root if it exists.
> Step 2: Traverse the right subtree of the root if it exists.
> Step 3: Visit the root.
>
> That would give you (D, E, B, F, G, C, A) as the sequence in the
> sample tree.
>
> Here is a table based of a non-binary tree, numbered with the level
> and the preorder traversal sequence number of each node.
>
> INSERT INTO Tree VALUES ('a', 1, 1);
> INSERT INTO Tree VALUES ('b', 2, 2);
> INSERT INTO Tree VALUES ('c', 3, 2);
> INSERT INTO Tree VALUES ('d', 4, 3);
> INSERT INTO Tree VALUES ('e', 5, 3);
> INSERT INTO Tree VALUES ('f', 6, 3);
> INSERT INTO Tree VALUES ('g', 7, 2);
> INSERT INTO Tree VALUES ('h', 8, 3);
> INSERT INTO Tree VALUES ('i', 9, 3);
>
> This does give us a unique co-ordinate pair to identify a node, and I
> played with it a little bit.
>
> 1) Queries which deal with levels in the hierarchy are quite easy,
> but those which deal with aggregations up and down the hierarchy can
> be complex. To establish the parent of a given child, you have to
> find the greatest sequence number that exists one level higher than
> the child node. The parent's sequence number will always be less than
> the child's sequence number.
>
> SELECT *, :my_child
> FROM Tree
> WHERE seq
> = (SELECT MAX(T1.seq) AS parent
> FROM Tree AS T1,Tree AS T2
> WHERE T1.seq < T2.seq
> AND T1.level = T2.level -1
> AND T1.node = :my_child);
>
> 2) Finding a subtree rooted at a particular node is more complex.
> The leftmost child of a node is easy to find; it is the sequence
> number of the parent plus one. The problem is establishing a
> rightmost child in a subtree.
>
> You know that all the children will have a greater sequence number
> than the root and a greater level number. But at the same time, you
> do not want to get the nodes in other subtrees that are siblings to
> the right of the root node.
>
> The way you know that a sequence number belongs to another subtree is
> that while the sequence number increased, the level number decreased.
> For example, if we want to see the subtree rooted at ('c', 3, 2) in
> the sample table, we know that the subtree must have greater sequence
> number, thus:
>
> ('d', 4, 3)
> ('e', 5, 3)
> ('f', 6, 3)
> ('g', 7, 2) <==
> ('h', 8, 3)
> ('i', 9, 3)
>
> Since 'c' is at level 2, we only need to look at nodes with a greater
> level number, thus
>
> ('d', 4, 3)
> ('e', 5, 3)
> ('f', 6, 3)
> ('h', 8, 3)
> ('i', 9, 3)
>
> But this leaves 'h' and 'i', which are in another subtree under 'g'
> and clearly wrong. What we want to find is the rightmost sequence
> number in the target subtree. This number also happens to be the
> first sequence number that jumps back up to the same or a higher level
> as the root. Look at how 'g' is back on level 2.
>
> But there is another problem. When you look at the rightmost subtree,
> there is no node to the right, so we have to use the MAX(seq) in the
> whole tree. This leads us to this query.
>
> SELECT :my_root, T1.*
> FROM Tree AS T1, Tree AS T2
> WHERE T2.node = :my_root
> AND T2.seq < T1.seq
> AND T1.level > T2.level
> AND T1.seq
> <= COALESCE((SELECT MIN(T3.seq)
> FROM Tree AS T3
> WHERE T3.level = T2.level
> AND T3.seq > T2.seq),
> (SELECT MAX(seq) FROM Tree));
>
> The other two traversals will have slightly different queries to
> perform the same basic "find children" and "find parent" operations.
>
> The first people to come up with the code and some explanation of how
> it works for inorder and postorder trees will get credit in my next
> book on trees and hierarchies in SQL. Not exactly as good as a beer,
> but some chance for fame and glory along with a free copy of the book
> when it is published.
Inorder traversal of a n-ary tree doesn't have an intuitive definition
so I'll ignore that case.
For a given node N and preorder traversal, each ancestor's sequence
number is the greatest number on that level that is less than N's
sequence number.
For a given node N and postorder traversal, each ancestor's sequence
number is the least number on that level that is greater than N's
sequence number.
We can use these relationships directly to define the following views:
CREATE TABLE PreorderTree
(
node VARCHAR(10) NOT NULL PRIMARY KEY,
seq INT NOT NULL CHECK (seq > 0),
level INT NOT NULL CHECK (level > 0),
UNIQUE (level, seq)
)
CREATE VIEW PreorderRelationships
AS
SELECT T1.node AS descendant, T1.level AS descendant_level, T1.seq AS descendant_seq, T2.node AS ancestor, T2.level AS ancestor_level, T2.seq AS ancestor_seq FROM PreorderTree AS T1 INNER JOIN PreorderTree AS T2 ON T2.level < T1.level AND T2.seq < T1.seq LEFT OUTER JOIN PreorderTree AS T3 ON T3.level = T2.level AND T3.seq > T2.seq AND T3.seq < T1.seq
WHERE T3.seq IS NULL
CREATE TABLE PostorderTree
(
node VARCHAR(10) NOT NULL PRIMARY KEY,
seq INT NOT NULL CHECK (seq > 0),
level INT NOT NULL CHECK (level > 0),
UNIQUE (level, seq)
)
CREATE VIEW PostorderRelationships
AS
SELECT T1.node AS descendant, T1.level AS descendant_level, T1.seq AS descendant_seq, T2.node AS ancestor, T2.level AS ancestor_level, T2.seq AS ancestor_seq FROM PostorderTree AS T1 INNER JOIN PostorderTree AS T2 ON T2.level < T1.level AND T2.seq > T1.seq LEFT OUTER JOIN PostorderTree AS T3 ON T3.level = T2.level AND T3.seq < T2.seq AND T3.seq > T1.seq
WHERE T3.seq IS NULL
If we build the tree given in your posting in both preorder and postorder we have
- Preorder INSERT INTO PreorderTree VALUES ('a', 1, 1) INSERT INTO PreorderTree VALUES ('b', 2, 2) INSERT INTO PreorderTree VALUES ('c', 3, 2) INSERT INTO PreorderTree VALUES ('d', 4, 3) INSERT INTO PreorderTree VALUES ('e', 5, 3) INSERT INTO PreorderTree VALUES ('f', 6, 3) INSERT INTO PreorderTree VALUES ('g', 7, 2) INSERT INTO PreorderTree VALUES ('h', 8, 3) INSERT INTO PreorderTree VALUES ('i', 9, 3)
- Postorder INSERT INTO PostorderTree VALUES ('a', 9, 1) INSERT INTO PostorderTree VALUES ('b', 1, 2) INSERT INTO PostorderTree VALUES ('c', 5, 2) INSERT INTO PostorderTree VALUES ('d', 2, 3) INSERT INTO PostorderTree VALUES ('e', 3, 3) INSERT INTO PostorderTree VALUES ('f', 4, 3) INSERT INTO PostorderTree VALUES ('g', 8, 2) INSERT INTO PostorderTree VALUES ('h', 6, 3) INSERT INTO PostorderTree VALUES ('i', 7, 3)
We can then easily write some queries
- Using preorder, get all ancestors of E SELECT * FROM PreorderRelationships WHERE descendant = 'E'
returns
descendant descendant_level descendant_seq ancestor ancestor_level ancestor_seq
e 3 5 a 1 1 e 3 5 c 2 3
- Using postorder, get all descendants of C SELECT * FROM PostorderRelationships WHERE ancestor = 'C'
returns
descendant descendant_level descendant_seq ancestor ancestor_level ancestor_seq
d 3 2 c 2 5 e 3 3 c 2 5 f 3 4 c 2 5
Regards,
jag
Received on Mon Dec 16 2002 - 19:36:29 CET