Re: Modelling large trees and hierarchies

From: Alan <not.me_at_uhuh.rcn.com>
Date: Thu, 17 Jun 2004 02:01:16 GMT
Message-ID: <MX6Ac.38449$Xw3.1039_at_nwrdny03.gnilink.net>


Joe is the expert on this. He can be found regularly on comp.databases.

"Paul Arnold" <paula_at_pivetal.com> wrote in message news:b6ee5aa3.0406161236.6b2e1205_at_posting.google.com...
> We are currently in the process of developing a client/server based
> generic tree application with which we want to be able to model and
> render any size, shape and depth of tree.
>
> Our primary objective is to provide the ability to model large data
> trees with some 20 million+ nodes whilst still allowing adequate
> rendering and query times from the client.
>
> We have just bought Joe Celko's "Tree's and Hierarchies in SQL fo
> Smarties" and have picked up some very useful tips.
>
> We are definitely looking at using some hybrid model but are undecided
> as to what combination would suit us best.
>
> Whilst constructing the tree we have the following information
> available to us, that we could/may include in our database structure:
>
> 1) The Node's parent.
> 2) The Node's depth/level.
> 3) The Node's path to the root node.
> 4) The Node's child nodes count.
> 5) The Node's descendant nodes count.
> 6) The Node's position relative to it's sibling nodes.
>
> This infomation would allow us to combine many of the models Joe Celko
> discusses, including:
>
> 1) The Nested Sets/Interval model - allows us to quickly find
> subtrees.
> 2) The Adjacency List model - allows us to quickly find immediate
> child nodes.
> 3) The Materialized Path model - allows us to quickly find the
> ancestors of the node.
> 4) The Depth Model - allows us to quickly find nodes related to their
> levels/depth.
>
> It is also quite likely that we will want to change the structure of
> the tree fairly frequently. As an indication, I can see update rates
> of every 15 minutes or less.
>
> To us, disk space and hardware are cheap, so we are not concerned with
> storage overhead. We just need optimal query performance.
>
> Types of queries we need are for example:
>
> Given a node...
> 1) Get it's immediate children.
> 2) Get it's ancestors.
> 3) Get all of it's descendants.
> 4) Get it's descendants to a certain depth.
> 5) Get it's siblings
> etc. etc.
>
> Any suggestions/foresight/tips that may help us in the database
> modelling would be most appreciated?
Received on Thu Jun 17 2004 - 04:01:16 CEST

Original text of this message