Re: Other tree representations in SQL

From: Carl Rosenberger <carl_at_db4o.com>
Date: Sat, 7 Dec 2002 23:10:20 +0100
Message-ID: <astr43$2bp$02$1_at_news.t-online.com>


--CELKO-- wrote:
> 1) I only write SQL books.

Oooops, I guess I must have mistaken you for this man: http://www.byte.com/art/9710/sec6/art6.htm

Three comments:
(1) Object database vendors should work on supporting SQL, to allow their products to be run with common reporting tools.

(2) All object databases products with noticeable market share between 1980 and 2000 were transparent persistency systems. Transparent persistency can only work in concurrent-multi-user environments, if a "read back in time" isolation level is provided. As far as I am informed, none of the available "best selling" products has this feature. Instead they still work with read locks to ensure consistent objects.
...so they are still not ready for prime-time today.

(3) Object queries are not difficult.
Query-By-Example is the most natural way of querying. "Give me all objects that look like this one!"

> 1) I only write SQL books.
> A dirty job, but someone has to do it <g>

I used to work with SQL. I was young and I needed the money. ;-)

We will continue with S.O.D.A. to make it suitable for an SQL layer on top.

> 2) IMS will fly thru huge amounts of heirarchical data faster than
> either SQL or an OODBMS.

I think this statement is too generalised. Fly doing what? How much work is it, to get the system to fly nice loopings?

> I don't know Java, so I am at a bit of a loss to evaluate the code.

Come on, I guess you do know at least one object-oriented language like C++ ?

At least I would hope so, if you write articles on object databases? (Your 1997 article is an excellent read anyway.)

The code that I posted contains just one reference to a Java JDK method: File#delete()

The rest of it is plain OO code, using very few methods from this API:
http://www.db4o.com/db4o/doc/

I hope the answer I posted to Alfredo explains the code a little better.

> In SQL, the tree structures are not used for searching or traditional
> traversals. The nodes have a key, and if you want to find a node, you
> just do a "SELECT .. FROM Tree WHERE node_key = :my_guy;". We don't
> ask for the path (i.e. traditional ordered traversal), we ask for the
> set of descendents or ancestors of a node.

The reason is fairly simple:
As you have shown, it's difficult to traverse trees in SQL. If you need to do that, you would usually load a complete representation of the tree into another language. The loading process would not benefit from a tree structure on the SQL side, so why bother to create one in the first place.

> >> With an object database the task is dead simple:
> - load the tree into memory from the root
> - balance the tree
> - store the root <<
>
> Come on, you know that is not an answer! Remember the famous cartoon
> of two sciencists in front of a blackboard and in the middle of the
> equation there is a block of text that reads "then a miracle occurs"?
> The caption is "You need to be more explicit about step #2."

So far there was no sign that anyone could successfully read my code, but I can try again.

A little bit of explanation in advance:

"ScTree" is a tree algorithm that was developed for db4o. Balancing takes place by comparing the number of subnodes of every branch. This has proved to construct trees, balanced "well enough" and overall performance empirically turned out to be just as good as with red-black-trees and nearly as good as with AVL trees. The advantages of this tree construct:

  • The number of elements of every branch is automatically maintained and always available.
  • Every branch is a consistent tree on its own.
  • The tree can be nicely constructed from the leaves. This allows reading the tree from a hard drive, without having to perform compare and balance operations. The tree code was a little bit modified for this posting, to be able to produce unbalanced trees. ScTree#reBalance() is usually not needed, so it's just a quickhack added for this example.

"ScTreeInt" is a payload for integers for ScTree.

The "TreeBalanceDemo" example stores an unbalanced tree, reads the tree in again, balances it and stores it again, with one single line of code for storing the tree.

The balance of the tree is checked and written to System.out by printing the depth of all nodes that have no children.

Here is the output of running "TreeBalanceDemo":

Tree balance:
 20
Tree balance:
 4 5 5 5 5 5 5 4 5 5

The complete runnable code is attached below.

Kind regards,
Carl

import java.io.*;

import com.db4o.*;

public class TreeBalanceDemo {

    public static boolean doBalance;

    ScTree root;

    static final String DATABASE_FILE = "treeBalanceDemo.yap";

    public static void main(String[] args) {

        new File(DATABASE_FILE).delete();
        Db4o.configure().objectClass("ScTree").cascadeOnUpdate(true);
        Db4o.configure().objectClass("TreeBalanceDemo").cascadeOnUpdate(true);

        storeNewUnbalancedTree();
        logBalance();
        balance();
        logBalance();

    }

    static void storeNewUnbalancedTree() {

        doBalance = false;
        ObjectContainer objectContainer = Db4o.openFile(DATABASE_FILE);
        TreeBalanceDemo tbd = new TreeBalanceDemo();
        tbd.root = new ScTreeInt(0);
        for (int i = 1; i < 20; i++) {
            tbd.root = tbd.root.add(new ScTreeInt(i));
        }
        objectContainer.set(tbd);
        objectContainer.close();

    }

    static void balance() {

        doBalance = true;
        ObjectContainer objectContainer = Db4o.openFile(DATABASE_FILE);
        ObjectSet objectSet = objectContainer.get(new TreeBalanceDemo());
        TreeBalanceDemo tbd = (TreeBalanceDemo) objectSet.next();
        objectContainer.activate(tbd, Integer.MAX_VALUE);
        tbd.root = tbd.root.reBalance(); // the quickhack reBalance is not
        tbd.root = tbd.root.reBalance(); // too good but repeating it does
        tbd.root = tbd.root.reBalance(); // the trick :-)

        // This is the ONE essential line of this demo
        // ONE single line of code stores the complete
        // tree and all it's nodes.
        objectContainer.set(tbd);

        objectContainer.close();

    }

    static void logBalance() {

        ObjectContainer objectContainer = Db4o.openFile(DATABASE_FILE);
        ObjectSet objectSet = objectContainer.get(new TreeBalanceDemo());
        TreeBalanceDemo tbd = (TreeBalanceDemo) objectSet.next();
        objectContainer.activate(tbd, Integer.MAX_VALUE);
        tbd.root.logBalance();
        objectContainer.close();

    }
}

import java.util.*;

/*
 * (c) Copyright 2002 http://www.db4o.com
 * Author: Carl Rosenberger
 * All Rights Reserved.
 *
 *
 * SubnodeCountTree
 *
 *
 * Principle:
 * Every node knows the number of subnodes.
 * Nodes are balanced on insert and removal, if the difference
 * between the number of subsequent and preceding nodes exceeds 2.
 *
 *
 * Advantages against other tree algorithms:
 * - The number of elements in the tree is automatically maintained.
 * - Every branch also is a consistent tree that knows the number of elements.
 * - The tree can be nicely constructed from the leaves. This allows reading
 * the tree from a hard drive, without having to perform compare and balance.
 *
 */

abstract class ScTree {

    private ScTree i_preceding;
    private ScTree i_subsequent;
    private int i_size = 1;

    final ScTree add(ScTree a_new) {

        int cmp = compare(a_new);
        if (cmp < 0) {
            if (i_subsequent == null) {
                i_subsequent = a_new;
                i_size++;
                return this;
            } else {
                i_subsequent = i_subsequent.add(a_new);
                if (i_preceding == null) {
                    return rotateLeft();
                } else {
                    return balance();
                }
            }
        } else if (cmp > 0 || a_new.duplicates()) {
            if (i_preceding == null) {
                i_preceding = a_new;
                i_size++;
                return this;
            } else {
                i_preceding = i_preceding.add(a_new);
                if (i_subsequent == null) {
                    return rotateRight();
                } else {
                    return balance();
                }
            }
        } else {
            return this;
        }

    }

    private boolean disallowBalance() {

        if (!TreeBalanceDemo.doBalance) {
            calculateSize();
            return true;
        }
        return false;

    }

    private final ScTree balance() {

        if (disallowBalance()) {
            return this;
        }
        int cmp = i_subsequent.i_size - i_preceding.i_size;
        if (cmp < -2) {
            return rotateRight();
        } else if (cmp > 2) {
            return rotateLeft();
        } else {
            i_size = i_preceding.i_size + i_subsequent.i_size + 1;
            return this;
        }

    }

    private final void calculateSize() {

        if (i_preceding == null) {
            if (i_subsequent == null) {
                i_size = 1;
            } else {
                i_size = i_subsequent.i_size + 1;
            }
        } else {
            if (i_subsequent == null) {
                i_size = i_preceding.i_size + 1;
            } else {
                i_size = i_preceding.i_size + i_subsequent.i_size + 1;
            }
        }

    }

    abstract int compare(ScTree a_to);

    abstract boolean duplicates();

    protected final ScTree find(ScTree a_tree) {

        int cmp = compare(a_tree);
        if (cmp < 0) {
            if (i_subsequent != null) {
                return i_subsequent.find(a_tree);
            }
            return null;
        } else {
            if (cmp > 0) {
                if (i_preceding != null) {
                    return i_preceding.find(a_tree);
                }
                return null;
            } else {
                return this;
            }
        }

    }

    public void logBalance() {

        System.out.println("\nTree balance:");
        logBalance(this, 1);

    }

    protected void logBalance(ScTree a_tree, int depth) {

        if (a_tree != null) {
            if (a_tree.i_preceding == null && a_tree.i_subsequent == null) {
                System.out.print(" " + depth + " ");
            }
            depth++;
            logBalance(a_tree.i_preceding, depth);
            logBalance(a_tree.i_subsequent, depth);
        }

    }

    public ScTree reBalance() {

        if (i_preceding != null) {
            i_preceding = i_preceding.reBalance();
        }
        if (i_subsequent != null) {
            i_subsequent = i_subsequent.reBalance();
        }
        ScTree node = this;
        if (i_preceding == null) {
            if (i_subsequent != null && i_subsequent.i_size > 1) {
                node = rotateLeft();
            }
        } else if (i_preceding.i_size > 1) {
            node = rotateRight();
        }
        List list = new ArrayList();
        while (!list.contains(node)) {
            list.add(node);
            node = node.reBalance1();
        }
        return node;

    }

    public ScTree reBalance1() {

        if (i_subsequent != null && i_preceding != null) {
            return balance();
        }
        return this;

    }

    protected ScTree remove() {

        if (i_subsequent != null && i_preceding != null) {
            i_subsequent = i_subsequent.rotateSmallestUp();
            i_subsequent.i_preceding = i_preceding;
            i_subsequent.calculateSize();
            return i_subsequent;
        }
        if (i_subsequent != null) {
            return i_subsequent;
        }
        return i_preceding;

    }

    final ScTree removeLike(final ScTree a_find) {

        int cmp = compare(a_find);
        if (cmp == 0) {
            return remove();
        }
        if (cmp > 0) {
            if (i_preceding != null) {
                i_preceding = i_preceding.removeLike(a_find);
            }
        } else {
            if (i_subsequent != null) {
                i_subsequent = i_subsequent.removeLike(a_find);
            }
        }
        calculateSize();
        return this;

    }

    final ScTree removeNode(final ScTree a_tree) {

        if (this == a_tree) {
            return remove();
        }
        int cmp = compare(a_tree);
        if (cmp >= 0) {
            if (i_preceding != null) {
                i_preceding = i_preceding.removeNode(a_tree);
            }
        }
        if (cmp <= 0) {
            if (i_subsequent != null) {
                i_subsequent = i_subsequent.removeNode(a_tree);
            }
        }
        calculateSize();
        return this;

    }

    private final ScTree rotateLeft() {

        if (disallowBalance()) {
            i_subsequent.calculateSize();
            calculateSize();
            return this;
        }
        ScTree tree = i_subsequent;
        i_subsequent = tree.i_preceding;
        calculateSize();
        tree.i_preceding = this;
        tree.calculateSize();
        return tree;

    }

    private final ScTree rotateRight() {

        if (disallowBalance()) {
            i_preceding.calculateSize();
            calculateSize();
            return this;
        }
        ScTree tree = i_preceding;
        i_preceding = tree.i_subsequent;
        calculateSize();
        tree.i_subsequent = this;
        tree.calculateSize();
        return tree;

    }

    private final ScTree rotateSmallestUp() {

        if (i_preceding != null) {
            i_preceding = i_preceding.rotateSmallestUp();
            return rotateRight();
        }
        return this;

    }

    public int size() {

        return i_size;
    }

    final void traverse(Visitor a_visitor) {

        if (i_preceding != null) {
            i_preceding.traverse(a_visitor);
        }
        a_visitor.visit(this);
        if (i_subsequent != null) {
            i_subsequent.traverse(a_visitor);
        }

    }

    void visit(Object a_object) {

        // virtual
        // overridden in QCandidate

    }

}

/*
 * (c) Copyright 2002 http://www.db4o.com
 * Author: Carl Rosenberger
 * All Rights Reserved.
 *
 *
 * Payload for the ScTree to carry int.
 *
 */

public final class ScTreeInt extends ScTree {

    private int i_key;

    public ScTreeInt(int a_key) {

        this.i_key = a_key;
    }

    final int compare(ScTree a_to) {

        return i_key - ((ScTreeInt) a_to).i_key;     }

    final boolean duplicates() {

        return false;
    }
} Received on Sat Dec 07 2002 - 23:10:20 CET

Original text of this message