Re: Tree structures in an RDBMS

From: Kendall <kendallwillets_at_yahoo.com>
Date: Thu, 29 Nov 2001 17:54:23 -0800
Message-ID: <pan.2001.11.29.17.54.23.349.2276_at_yahoo.com>


In article <3C0535FC.9030807_at_syneticon.de>, "Denis Jedig" <dj_at_syneticon.de> wrote:

> So currently I am thinking about storing the tree structure in a table
> as a 2-tuple of references to the node-table just like in a linked list:
>
> CREATE TABLE Nodes (NodeNr integer not null, NodeName varchar); CREATE
> TABLE NodeTree (ParentNodeNr int, ChildNodeNr int);
>
>

This approach is really only required if a node can have multiple parents.  If not, I see none of the benefits you cite - updateability for one - in this representation versus simply giving each node a ParentNodeNr column.

If only one parent per node is required, it should be somewhat easier to manage - Joe C's node numbering approach can be used, for one.

> May be that I understood something wrong or missed some points in my
> considerations about that approach - would be glad to hear something on
> that. BTW, I read in articles dating 1999 and prior, that SQL3 would
> support recursive querying as well as some extensions of commercial SQL
> server products (Oracle) do. Is there a recommended "further reading" on
> that and something like a list of RDBMS that allow such a query?
>
>
DB2 has the SQL3 recursive feature. Oracle has CONNECT BY. Both suffer from some artificial limitations, for instance Oracle can't handle a loop in the structure, something that one might want to use a CONNECT BY query to find.

> Another way to query that came in my mind would be to CREATE TABLE
> temp_table (NodeNr int); INSERT INTO temp_table (NodeNr) VALUES (X);
> ^
> this X being a starting point
> from where to "unfold" the tree
> and then sequentially to
> INSERT INTO temp_table (NodeNr) SELECT NT.ChildNodeNr FROM NodeTree NT,
> temp_table TT WHERE TT.NodeNr=NT.ParentNodeNr AND NOT EXISTS (SELECT *
> FROM temp_table TT_ex WHERE TT_ex.NodeNr=NT.ChildNodeNr) until the
> number of rows affected is 0. I expect this solution to perform even
> worse than the prior one, but at least it does not limit my query depth.
>
>
This is an approach I've used for bulk extractions. It's fast in many situations, but kind of depressing to have to run repeatedly for each query.

> The tree is supposed to have about 15 levels with a mean of 3 child
> nodes arising from each parent. This lets me end up at the calculation
> of Sum(k=1;15;3^k)=21*10^6 nodes to store in my representation. Most
> queries will be directed "forward", i.e. from a specific node X down to
> the leaves, and will typically start 3 levels above the mean leaf level
> for that branch.
> The database is expected to be written for PostgreSQL.
>
>
Any average depth figures? Maybe they don't want to populate down to all 15 levels.

Also, how often are you updating (insert, move, etc.) the tree vs. querying it? You're pretty much stuck with some kind of extra processing on certain operations, so the approach you need depends on the transaction mix. Received on Fri Nov 30 2001 - 02:54:23 CET

Original text of this message