Re: Using Materialized Path to create a paths table

From: Matthew Brealey <thelawnet_at_gmail.com>
Date: 25 May 2006 07:15:45 -0700
Message-ID: <1148566545.223764.127410_at_i39g2000cwa.googlegroups.com>


Bob Badour wrote:
> Matthew Brealey wrote:
> > I wonder if anyone has any thoughts on using the materialized path as
> > an intermediate step to create a kind of paths table to represent a
> > tree hierarchy
> >
> > Here are some records
> >
> > EmployeeId ParentEmployeeId MaterializedPath
> > 1 <null> 1.
> > 2 1 1.1.
> > 3 1 1.2.
> > 4 2 1.1.1
> >
> > etc.
> >
> > To find the descendants of a node using the materialized path is quite
> > low-cost: an index on MaterializedPath would be usable with the
> > following query
> >
> > select
> > Descendant.EmployeeId
> > from Employee as Ancestor
> > inner join Employee as Descendant on Descendant.MaterializedPath like
> > Ancestor.MaterializedPath + '%'
> > where Ancestor.EmployeeId = 1
> >
> > To find the ancestors however with the following query
> >
> > select
> > Ancestor.EmployeeId
> > from Employee as Descendant
> > inner join Employee as Ancestor on Descendant.MaterializedPath like
> > Ancestor.MaterializedPath + '%'
> > where Descendant.EmployeeId = 4
> >
> > requires a table scan.
>
> Shouldn't the above be + '.%' in both cases?

Not if you include a '.' on the end of your materialized path (as I did, although omitted accidentally for the last example)

>
> > This is very expensive for any non-trivial number of nodes where you
> > need to find a node's ancestors.
> >
> > But if you treat the MaterializedPath as an intermediate result, it is
> > a trival to use it to build a denormalized EmployeeAncestor table,
> > which for the set of data above would be as follows
> >
> > EmployeeId AncestorEmployeeId
> > 1 1
> > 2 1
> > 2 2
> > 3 1
> > 3 2
> > 4 1
> > 4 2
> > 4 4
>
> Why would you make the intermediate kludge when the join table is the
> direct representation of what you want in the first place?
>
>
> > I.e. there is one record for each of each node's ancestors (including
> > the node itself)
> >
> > This table is easy to maintain using the same trigger tha maintains the
> > MaterializedPath (although not so easy if you did not maintain the
> > MaterializedPath itself)
>
> I suggest any time you hear your self thinking "using a trigger" you
> should give your head a shake and ask whether a triggered procedure is
> really necessary.

Well, if you just use an adjacency list (which requires no trigger), anything other than direct children/parents for a tree requires recursion.

So if you often need to find all ancestors or descendants, then you will pay a big performance penalty.

> > The advantage is that it becomes very much cheaper to calculate the
> > ancestors of a node:
> >
> > select
> > Ancestor.EmployeeId
> > from
> > EmployeeAncestor Descendant
> > inner join Employee Ancestor on Ancestor.EmployeeId =
> > Descendant.AncestorEmployeeId
> > where Descendant.EmployeeId = 4
>
> Much cheaper than what? Than the kludge? In that case, one wonders why
> anyone would introduce the kludge in the first place.

How else will you find the ancestors of a node in a tree in SQL?

> > So it seems to me that the materialized path is not actually a good
> > thing
>
> Indeed.

Do please let me know a better SQL model for trees. Received on Thu May 25 2006 - 16:15:45 CEST

Original text of this message