Re: Other tree representations in SQL

From: John Gilson <jag_at_acm.org>
Date: 16 Dec 2002 10:36:29 -0800
Message-ID: <909c94d1.0212161036.360bb2ff_at_posting.google.com>

71062.1056_at_compuserve.com (--CELKO--) wrote in message news:<c0d87ec0.0212031527.7ca335bf_at_posting.google.com>...
> Vadim Tropashko suggested that we could represent a tree structure in
> SQL by storing a traversal sequence number and the level of each node
> to represent a hierarchy. This would be an alternative to my nested
> sets model
>
> Tree Traversals are usually associated with binary trees in the
> literature because the three types of traversals can be written in
> very neat, clean recursive procedures. But they can be generalized to
> any tree by moving from the leftmost to the rightmost subtree. There
> are three ways to traverse a tree -- preorder, inorder and postorder.
>
> A
> / \
> B C
> / \ / \
> D E F G
>
> (1) Inorder traversal
> Step 1: Traverse the left subtree of the root if it exists.
> Step 2: Visit the root.
> Step 3: Traverse the right subtree of the root if it exists.
>
> That would give you (D, B, E, A, F, C, G) as the sequence in the
> sample tree.
>
> (2) Preorder traversal
> Step 1: Visit the root.
> Step 2: Traverse the left subtree of the root if it exists.
> Step 3: Traverse the right subtree of the root if it exists.
>
> That would give you (A, B, D, E, C, F, G) as the sequence in the
> sample tree.
>
> (3) Postorder traversal
> Step 1: Traverse the left subtree of the root if it exists.
> Step 2: Traverse the right subtree of the root if it exists.
> Step 3: Visit the root.
>
> That would give you (D, E, B, F, G, C, A) as the sequence in the
> sample tree.
>
> Here is a table based of a non-binary tree, numbered with the level
> and the preorder traversal sequence number of each node.
>
> INSERT INTO Tree VALUES ('a', 1, 1);
> INSERT INTO Tree VALUES ('b', 2, 2);
> INSERT INTO Tree VALUES ('c', 3, 2);
> INSERT INTO Tree VALUES ('d', 4, 3);
> INSERT INTO Tree VALUES ('e', 5, 3);
> INSERT INTO Tree VALUES ('f', 6, 3);
> INSERT INTO Tree VALUES ('g', 7, 2);
> INSERT INTO Tree VALUES ('h', 8, 3);
> INSERT INTO Tree VALUES ('i', 9, 3);
>
> This does give us a unique co-ordinate pair to identify a node, and I
> played with it a little bit.
>
> 1) Queries which deal with levels in the hierarchy are quite easy,
> but those which deal with aggregations up and down the hierarchy can
> be complex. To establish the parent of a given child, you have to
> find the greatest sequence number that exists one level higher than
> the child node. The parent's sequence number will always be less than
> the child's sequence number.
>
> SELECT *, :my_child
> FROM Tree
> WHERE seq
> = (SELECT MAX(T1.seq) AS parent
> FROM Tree AS T1,Tree AS T2
> WHERE T1.seq < T2.seq
> AND T1.level = T2.level -1
> AND T1.node = :my_child);
>
> 2) Finding a subtree rooted at a particular node is more complex.
> The leftmost child of a node is easy to find; it is the sequence
> number of the parent plus one. The problem is establishing a
> rightmost child in a subtree.
>
> You know that all the children will have a greater sequence number
> than the root and a greater level number. But at the same time, you
> do not want to get the nodes in other subtrees that are siblings to
> the right of the root node.
>
> The way you know that a sequence number belongs to another subtree is
> that while the sequence number increased, the level number decreased.
> For example, if we want to see the subtree rooted at ('c', 3, 2) in
> the sample table, we know that the subtree must have greater sequence
> number, thus:
>
> ('d', 4, 3)
> ('e', 5, 3)
> ('f', 6, 3)
> ('g', 7, 2) <==
> ('h', 8, 3)
> ('i', 9, 3)
>
> Since 'c' is at level 2, we only need to look at nodes with a greater
> level number, thus
>
> ('d', 4, 3)
> ('e', 5, 3)
> ('f', 6, 3)
> ('h', 8, 3)
> ('i', 9, 3)
>
> But this leaves 'h' and 'i', which are in another subtree under 'g'
> and clearly wrong. What we want to find is the rightmost sequence
> number in the target subtree. This number also happens to be the
> first sequence number that jumps back up to the same or a higher level
> as the root. Look at how 'g' is back on level 2.
>
> But there is another problem. When you look at the rightmost subtree,
> there is no node to the right, so we have to use the MAX(seq) in the
> whole tree. This leads us to this query.
>
> SELECT :my_root, T1.*
> FROM Tree AS T1, Tree AS T2
> WHERE T2.node = :my_root
> AND T2.seq < T1.seq
> AND T1.level > T2.level
> AND T1.seq
> <= COALESCE((SELECT MIN(T3.seq)
> FROM Tree AS T3
> WHERE T3.level = T2.level
> AND T3.seq > T2.seq),
> (SELECT MAX(seq) FROM Tree));
>
> The other two traversals will have slightly different queries to
> perform the same basic "find children" and "find parent" operations.
>
> The first people to come up with the code and some explanation of how
> it works for inorder and postorder trees will get credit in my next
> book on trees and hierarchies in SQL. Not exactly as good as a beer,
> but some chance for fame and glory along with a free copy of the book
> when it is published.

Inorder traversal of a n-ary tree doesn't have an intuitive definition so I'll ignore that case.
For a given node N and preorder traversal, each ancestor's sequence number is the greatest number on that level that is less than N's sequence number.
For a given node N and postorder traversal, each ancestor's sequence number is the least number on that level that is greater than N's sequence number.

We can use these relationships directly to define the following views:

CREATE TABLE PreorderTree
(
node VARCHAR(10) NOT NULL PRIMARY KEY,
seq INT NOT NULL CHECK (seq > 0),
level INT NOT NULL CHECK (level > 0),
UNIQUE (level, seq)
)

CREATE VIEW PreorderRelationships
AS

SELECT T1.node AS descendant,
       T1.level AS descendant_level,
       T1.seq AS descendant_seq,
       T2.node AS ancestor,
       T2.level AS ancestor_level,
       T2.seq AS ancestor_seq
FROM PreorderTree AS T1
     INNER JOIN
     PreorderTree AS T2
     ON T2.level < T1.level AND
        T2.seq < T1.seq
     LEFT OUTER JOIN
     PreorderTree AS T3
     ON T3.level = T2.level AND
        T3.seq > T2.seq AND
        T3.seq < T1.seq

WHERE T3.seq IS NULL

CREATE TABLE PostorderTree
(
node VARCHAR(10) NOT NULL PRIMARY KEY,
seq INT NOT NULL CHECK (seq > 0),
level INT NOT NULL CHECK (level > 0),
UNIQUE (level, seq)
)

CREATE VIEW PostorderRelationships
AS

SELECT T1.node AS descendant,
       T1.level AS descendant_level,
       T1.seq AS descendant_seq,
       T2.node AS ancestor,
       T2.level AS ancestor_level,
       T2.seq AS ancestor_seq
FROM PostorderTree AS T1
     INNER JOIN
     PostorderTree AS T2
     ON T2.level < T1.level AND
        T2.seq > T1.seq
     LEFT OUTER JOIN
     PostorderTree AS T3
     ON T3.level = T2.level AND
        T3.seq < T2.seq AND
        T3.seq > T1.seq

WHERE T3.seq IS NULL

If we build the tree given in your posting in both preorder and postorder we have

Preorder INSERT INTO PreorderTree VALUES ('a', 1, 1) INSERT INTO PreorderTree VALUES ('b', 2, 2) INSERT INTO PreorderTree VALUES ('c', 3, 2) INSERT INTO PreorderTree VALUES ('d', 4, 3) INSERT INTO PreorderTree VALUES ('e', 5, 3) INSERT INTO PreorderTree VALUES ('f', 6, 3) INSERT INTO PreorderTree VALUES ('g', 7, 2) INSERT INTO PreorderTree VALUES ('h', 8, 3) INSERT INTO PreorderTree VALUES ('i', 9, 3)
Postorder INSERT INTO PostorderTree VALUES ('a', 9, 1) INSERT INTO PostorderTree VALUES ('b', 1, 2) INSERT INTO PostorderTree VALUES ('c', 5, 2) INSERT INTO PostorderTree VALUES ('d', 2, 3) INSERT INTO PostorderTree VALUES ('e', 3, 3) INSERT INTO PostorderTree VALUES ('f', 4, 3) INSERT INTO PostorderTree VALUES ('g', 8, 2) INSERT INTO PostorderTree VALUES ('h', 6, 3) INSERT INTO PostorderTree VALUES ('i', 7, 3)

We can then easily write some queries

Using preorder, get all ancestors of E SELECT * FROM PreorderRelationships WHERE descendant = 'E'

returns

descendant descendant_level descendant_seq ancestor ancestor_level ancestor_seq

e	3	5	a	1	1
e	3	5	c	2	3

Using postorder, get all descendants of C SELECT * FROM PostorderRelationships WHERE ancestor = 'C'

returns

descendant descendant_level descendant_seq ancestor ancestor_level ancestor_seq

d	3	2	c	2	5
e	3	3	c	2	5
f	3	4	c	2	5

Regards,
jag Received on Mon Dec 16 2002 - 19:36:29 CET

This message: [ Message body ]
Next message: David Cressey: "Check List Comparisons"
Previous message: Roy Hann: "Re: QUESTION: List array, graph or network model support DB"
In reply to: --CELKO--: "Other tree representations in SQL"
Next in thread: Joe \: "Re: Parts explosion with repeated subtrees"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message