Re: Using Materialized Path to create a paths table

From: Matthew Brealey <thelawnet_at_gmail.com>
Date: 25 May 2006 07:39:23 -0700
Message-ID: <1148567963.805349.302750_at_j55g2000cwa.googlegroups.com>


David Cressey wrote:
> "Matthew Brealey" <thelawnet_at_gmail.com> wrote in message
> news:1148564515.778519.28270_at_y43g2000cwc.googlegroups.com...
> > I wonder if anyone has any thoughts on using the materialized path as
> > an intermediate step to create a kind of paths table to represent a
> > tree hierarchy
> >
> > Here are some records
> >
> > EmployeeId ParentEmployeeId MaterializedPath
> > 1 <null> 1.
> > 2 1 1.1.
> > 3 1 1.2.
> > 4 2 1.1.1
> >
> > etc.
> >
> > To find the descendants of a node using the materialized path is quite
> > low-cost: an index on MaterializedPath would be usable with the
> > following query
> >
> > select
> > Descendant.EmployeeId
> > from Employee as Ancestor
> > inner join Employee as Descendant on Descendant.MaterializedPath like
> > Ancestor.MaterializedPath + '%'
> > where Ancestor.EmployeeId = 1
> >
> > To find the ancestors however with the following query
> >
> > select
> > Ancestor.EmployeeId
> > from Employee as Descendant
> > inner join Employee as Ancestor on Descendant.MaterializedPath like
> > Ancestor.MaterializedPath + '%'
> > where Descendant.EmployeeId = 4
> >
> > requires a table scan.
> >
> > This is very expensive for any non-trivial number of nodes where you
> > need to find a node's ancestors.
> >
> > But if you treat the MaterializedPath as an intermediate result, it is
> > a trival to use it to build a denormalized EmployeeAncestor table,
> > which for the set of data above would be as follows
> >
> > EmployeeId AncestorEmployeeId
> > 1 1
> > 2 1
> > 2 2
> > 3 1
> > 3 2
> > 4 1
> > 4 2
> > 4 4
> >
> > I.e. there is one record for each of each node's ancestors (including
> > the node itself)
> >
> > This table is easy to maintain using the same trigger tha maintains the
> > MaterializedPath (although not so easy if you did not maintain the
> > MaterializedPath itself)
> >
> > The advantage is that it becomes very much cheaper to calculate the
> > ancestors of a node:
> >
> > select
> > Ancestor.EmployeeId
> > from
> > EmployeeAncestor Descendant
> > inner join Employee Ancestor on Ancestor.EmployeeId =
> > Descendant.AncestorEmployeeId
> > where Descendant.EmployeeId = 4
> >
> > The query to find descendant is similar:
> >
> > select
> > Descendant.EmployeeId
> > from
> > Employee Ancestor
> > inner join EmployeeAncestor Descendant on Descendant.AncestorEmployeeId
> > = Ancestor.EmployeeId
> > where Ancestor.EmployeeId = 1
> >
> > So it seems to me that the materialized path is not actually a good
> > thing to use to find ancestor/descendant relationships of itself, but
> > by using it to create a simple ancestor node-type table, you can get
> > slightly better performance for finding descendants, and vastly better
> > performance results for ancestors.
> >
> > Any comments?
> >
>
> I'd like to see this compared to the nested sets technique.

If your tree has a lot of inserts the nested sets left and right numbers can be expensive to maintain as you can end up having to re-walk the tree. For inserts, under this approach there is always NodeDepth rows inserted into the Ancestors table, which is extremely scaleable

In fact this is not really that related to the materialized path: in order to build the table, you need to walk up the tree to the ancestors, and the materialized path is really only a shortcut to find the ancestors. You could also get the Ancestors using a trigger and a cursor.

Using this table is very fast for querying, and for inserts the overhead is also small. Received on Thu May 25 2006 - 16:39:23 CEST

Original text of this message