Re: Implementing trees in a relational database

From: Mikito Harakiri <mikharakiri_at_yahoo.com>
Date: 1 Aug 2002 11:26:38 -0700
Message-ID: <bdf69bdf.0208011026.9caa07a_at_posting.google.com>


Although many were quick pointing to various tree labeling schemas, there is also a "declarative spirit" that is often overlooked. Indeed, a query

SELECT O2.*
   FROM OrgChart AS O1, OrgChart AS O2
  WHERE O1.lft BETWEEN O2.lft AND O2.rgt     AND O1.emp = :myemployee;

from Santoro&Khatib interval scheme includes details of labeling - physical attributes lft and rgt - into the query. The same applied to prefix based approach (aka materialized path) when the query includes substring predicates that are exposing physical labeling structure as well.

The query that is not dependent of labeling:

SELECT O2.*
   FROM OrgChart AS o1, OrgChart AS o2
  WHERE isDescendant(o1, o2)

Another example of "labeling independent" query is oracle "connect by" syntax:

select * from emp
connect by prior empno = mgr
start with empno = 1000

Yet another "labeling independent" syntax is SQL recursive "with" extension.

Ideally, tree structure definition in relational model shouldn't include any reference to the labeling scheme. Labeling is just yet another incremental evaluation structure that may speed up the query whenever the optimizer decides so. For example, adjucency graph table may be viewed as a tree structure while materialized closure table is an incremental evaluation structure upon it. One way to answer the query is to traverse links in the adjucency graph, while the other, probably, faster way is just selecting from materialized closure -- the decision must be cost based (as always;-).

"Paul DeWolf" <paul_at_thievesandkings.com> wrote in message news:<uLn19.179675$Wt3.134644_at_rwcrnsc53>...
> Hi all,
>
> I used to work in the application development area of a very large company
> who built applications on top of very complex data models using a very large
> relational database vendor.
>
> We had various networks (recursive many-to-many relationships) that were
> stored in the database through "connectivity" tables, but it was *extremely*
> awkward to develop the software both on the application and database sides
> and it seemed that we had real scalability and performance problems,
> especially as we wanted to add to the data model.
>
> Since then I've moved to working for an object-oriented database company
> where these relationships are quite easy to develop and scale easily, but
> I'm still curious about how one can implement trees and networks in an
> RDBMS.
>
> Can someone point me to information (books, white papers) on techniques for
> efficiently implementing trees and networks in an RDBMS? Has anyone done it

> themselves and showed that they can scale linearly?
>
> Paul DeWolf
> Systems Engineer
> Objectivity, Inc.
Received on Thu Aug 01 2002 - 20:26:38 CEST

Original text of this message