Tree structures in an RDBMS
Date: Wed, 28 Nov 2001 20:07:40 +0100
Message-ID: <3C0535FC.9030807_at_syneticon.de>
Hello Group,
The following is my current challenge:
I have a tree-ordered set of data (not necessarily balanced) and need to
store it in a relational database. The requirements for the relational
representation are:
a node X, I need to get all nodes and leaves that are below X, given
a node Y, I need all nodes in the path from the root down to Y.
or the number of tree levels, the tree must accept addition of new
nodes at any position
within the tree e.g. move a specific branch from level 4 below a node Y
to level 12 below node Z.
I had a solution in mind by myself and read several news articles, that
CREATE TABLE Nodes (NodeNr integer not null, NodeName varchar);
CREATE TABLE NodeTree (ParentNodeNr int, ChildNodeNr int);
That way my structure retains extendable - it has no inherent limits for
- ability for branch queries forward as well as backward: i.e. given
- extensibility: there must not be a hard limit on the number of nodes
- reorderability: from time to time branches will need to get reordered
So currently I am thinking about storing the tree structure in a table as a
2-tuple of references to the node-table just like in a linked list:
My problem is "unfolding" the tree from its relational representation in
either direction, thus retreiving all nodes below or above a node X. The
procedure must happen entirely in the backend.
A general approach would be a slight variation of an n-depth recursive
SELECT ChildNodeNr FROM NodeTree WHERE ParentNodeNr IN (SELECT ChildNodeNr
FROM NodeTree WHERE ParentNodeNr IN (...));
where I also would preserve the results of my subqueries and UNION them all
in the end, but I fear this solution to be far too costy in evaluation on
the one hand and a real limiting factor on the other- most DBMS would limit
the number of tables opened for a single query to something around 15 - so I
would end up with a practical query depth limit of 15 levels for my tree
(even worse, that value would depend on the DBMS used and on some DBMS this
limit might be exhausted earlier due to use of temporary tables in the query
process).
Another way to query that came in my mind would be to
CREATE TABLE temp_table (NodeNr int);
May be that I understood something wrong or missed some points in my
considerations about that approach - would be glad to hear something on
that. BTW, I read in articles dating 1999 and prior, that SQL3 would support
recursive querying as well as some extensions of commercial SQL server
products (Oracle) do. Is there a recommended "further reading" on that and
something like a list of RDBMS that allow such a query?
INSERT INTO temp_table (NodeNr) VALUES (X);
^
this X being a starting point
from where to "unfold" the tree
and then sequentially to
INSERT INTO temp_table (NodeNr) SELECT NT.ChildNodeNr FROM NodeTree NT,
temp_table TT WHERE TT.NodeNr=NT.ParentNodeNr AND NOT EXISTS (SELECT * FROM
temp_table TT_ex WHERE TT_ex.NodeNr=NT.ChildNodeNr)
until the number of rows affected is 0.
I expect this solution to perform even worse than the prior one, but at
least it does not limit my query depth.
I can't think of any better solution yet, so I am posting here today...
Some values estimated for the tree structure, its size and the main query character and the environment:
Any comments on that kind of implementation as well as any hints about how to address my problem are greatly appreciated. I suppose, the "object guys" sometimes need to resolve similar problems whenever they need to store their object tree in a relational DB, but I have not found a relevant implementation yet. Maybe the left-right indexed model is working fine for them, who knows.
BTW: I suppose I am somewhat off-topic here, but there is no comp.databases.relational for some reason.
TIA Denis Received on Wed Nov 28 2001 - 20:07:40 CET
