Other tree representations in SQL

From: --CELKO-- <71062.1056_at_compuserve.com>
Date: 3 Dec 2002 15:27:37 -0800
Message-ID: <c0d87ec0.0212031527.7ca335bf_at_posting.google.com>

Vadim Tropashko suggested that we could represent a tree structure in SQL by storing a traversal sequence number and the level of each node to represent a hierarchy. This would be an alternative to my nested sets model

Tree Traversals are usually associated with binary trees in the literature because the three types of traversals can be written in very neat, clean recursive procedures. But they can be generalized to any tree by moving from the leftmost to the rightmost subtree. There are three ways to traverse a tree -- preorder, inorder and postorder.

    A
   / \
  B C
/ \ / \
D E F G

(1) Inorder traversal

  Step 1: Traverse the left subtree of the root if it exists.
  Step 2: Visit the root.
  Step 3: Traverse the right subtree of the root if it exists.

That would give you (D, B, E, A, F, C, G) as the sequence in the sample tree.

(2) Preorder traversal

  Step 1: Visit the root.
  Step 2: Traverse the left subtree of the root if it exists.
  Step 3: Traverse the right subtree of the root if it exists.

That would give you (A, B, D, E, C, F, G) as the sequence in the sample tree.

(3) Postorder traversal

  Step 1: Traverse the left subtree of the root if it exists.
  Step 2: Traverse the right subtree of the root if it exists.
  Step 3: Visit the root.

That would give you (D, E, B, F, G, C, A) as the sequence in the sample tree.

Here is a table based of a non-binary tree, numbered with the level and the preorder traversal sequence number of each node.

INSERT INTO Tree VALUES ('a', 1, 1);
INSERT INTO Tree VALUES ('b', 2, 2);
INSERT INTO Tree VALUES ('c', 3, 2);
INSERT INTO Tree VALUES ('d', 4, 3);
INSERT INTO Tree VALUES ('e', 5, 3);
INSERT INTO Tree VALUES ('f', 6, 3);
INSERT INTO Tree VALUES ('g', 7, 2);
INSERT INTO Tree VALUES ('h', 8, 3);
INSERT INTO Tree VALUES ('i', 9, 3);

This does give us a unique co-ordinate pair to identify a node, and I played with it a little bit.

Queries which deal with levels in the hierarchy are quite easy, but those which deal with aggregations up and down the hierarchy can be complex. To establish the parent of a given child, you have to find the greatest sequence number that exists one level higher than the child node. The parent's sequence number will always be less than the child's sequence number.

SELECT *, :my_child
FROM Tree
WHERE seq

(SELECT MAX(T1.seq) AS parent FROM Tree AS T1,Tree AS T2 WHERE T1.seq < T2.seq AND T1.level = T2.level -1 AND T1.node = :my_child); 2) Finding a subtree rooted at a particular node is more complex. The leftmost child of a node is easy to find; it is the sequence number of the parent plus one. The problem is establishing a rightmost child in a subtree.

You know that all the children will have a greater sequence number than the root and a greater level number. But at the same time, you do not want to get the nodes in other subtrees that are siblings to the right of the root node.

The way you know that a sequence number belongs to another subtree is that while the sequence number increased, the level number decreased. For example, if we want to see the subtree rooted at ('c', 3, 2) in the sample table, we know that the subtree must have greater sequence number, thus:

('d', 4, 3)
('e', 5, 3)
('f', 6, 3)
('g', 7, 2) <==
('h', 8, 3)
('i', 9, 3)

Since 'c' is at level 2, we only need to look at nodes with a greater level number, thus

('d', 4, 3)
('e', 5, 3)
('f', 6, 3)
('h', 8, 3)
('i', 9, 3)

But this leaves 'h' and 'i', which are in another subtree under 'g' and clearly wrong. What we want to find is the rightmost sequence number in the target subtree. This number also happens to be the first sequence number that jumps back up to the same or a higher level as the root. Look at how 'g' is back on level 2.

But there is another problem. When you look at the rightmost subtree, there is no node to the right, so we have to use the MAX(seq) in the whole tree. This leads us to this query.

SELECT :my_root, T1.*
FROM Tree AS T1, Tree AS T2

 WHERE T2.node = :my_root
   AND T2.seq < T1.seq
   AND T1.level > T2.level
   AND T1.seq
       <= COALESCE((SELECT MIN(T3.seq)
                      FROM Tree AS T3
                     WHERE T3.level = T2.level
                       AND T3.seq > T2.seq),
                   (SELECT MAX(seq) FROM Tree));

The other two traversals will have slightly different queries to perform the same basic "find children" and "find parent" operations.

The first people to come up with the code and some explanation of how it works for inorder and postorder trees will get credit in my next book on trees and hierarchies in SQL. Not exactly as good as a beer, but some chance for fame and glory along with a free copy of the book when it is published. Received on Wed Dec 04 2002 - 00:27:37 CET

This message: [ Message body ]
Next message: holychapin_at_hotmail.com: "Re: Creating MY own database with C++ and of course WinAPI"
Previous message: skotske: "Re: Plea for help - Normalisation Problem"
Next in thread: Heinz Huber: "Re: Other tree representations in SQL"
Reply: Heinz Huber: "Re: Other tree representations in SQL"
Reply: Carl Rosenberger: "Re: Other tree representations in SQL"
Reply: Damjan S. Vujnovic: "Re: Other tree representations in SQL"
Maybe reply: --CELKO--: "Re: Other tree representations in SQL"
Reply: Greg Boland: "Re: Other tree representations in SQL"
Reply: Damjan S. Vujnovic: "Re: Other tree representations in SQL [inorder]"
Reply: Damjan S. Vujnovic: "Re: Other tree representations in SQL"
Maybe reply: Carl Rosenberger: "Re: Other tree representations in SQL"
Maybe reply: Carl Rosenberger: "Re: Other tree representations in SQL"
Reply: John Gilson: "Re: Other tree representations in SQL"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message