Path: dp-news.maxwell.syr.edu!spool.maxwell.syr.edu!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!postnews.google.com!j55g2000cwa.googlegroups.com!not-for-mail
From: "Matthew  Brealey" <thelawnet@gmail.com>
Newsgroups: microsoft.public.sqlserver.programming,comp.databases.theory
Subject: Re: Using Materialized Path to create a paths table
Date: 25 May 2006 07:39:23 -0700
Organization: http://groups.google.com
Lines: 111
Message-ID: <1148567963.805349.302750@j55g2000cwa.googlegroups.com>
References: <1148564515.778519.28270@y43g2000cwc.googlegroups.com>
   <9Didg.1582$au4.87@trndny08>
NNTP-Posting-Host: 147.114.226.175
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Trace: posting.google.com 1148567969 29785 127.0.0.1 (25 May 2006 14:39:29 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Thu, 25 May 2006 14:39:29 +0000 (UTC)
In-Reply-To: <9Didg.1582$au4.87@trndny08>
User-Agent: G2/0.2
X-HTTP-UserAgent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727),gzip(gfe),gzip(gfe)
X-HTTP-Via: 1.1 bgxpx0012 (NetCache NetApp/6.0.3)
Complaints-To: groups-abuse@google.com
Injection-Info: j55g2000cwa.googlegroups.com; posting-host=147.114.226.175;
   posting-account=UPl45g0AAADSh25An46Z1Jdbh86l7Zu7
Xref: dp-news.maxwell.syr.edu microsoft.public.sqlserver.programming:556491 comp.databases.theory:40055


David Cressey wrote:
> "Matthew Brealey" <thelawnet@gmail.com> wrote in message
> news:1148564515.778519.28270@y43g2000cwc.googlegroups.com...
> > I wonder if anyone has any thoughts on using the materialized path as
> > an intermediate step to create a kind of paths table to represent a
> > tree hierarchy
> >
> > Here are some records
> >
> > EmployeeId   ParentEmployeeId   MaterializedPath
> > 1                   <null>                    1.
> > 2                   1                           1.1.
> > 3                   1                           1.2.
> > 4                   2                           1.1.1
> >
> > etc.
> >
> > To find the descendants of a node using the materialized path is quite
> > low-cost: an index on MaterializedPath would be usable with the
> > following query
> >
> > select
> > Descendant.EmployeeId
> > from Employee as Ancestor
> > inner join Employee as Descendant on Descendant.MaterializedPath like
> > Ancestor.MaterializedPath + '%'
> > where Ancestor.EmployeeId = 1
> >
> > To find the ancestors however with the following query
> >
> > select
> > Ancestor.EmployeeId
> > from Employee as Descendant
> > inner join Employee as Ancestor on Descendant.MaterializedPath like
> > Ancestor.MaterializedPath + '%'
> > where Descendant.EmployeeId = 4
> >
> > requires a table scan.
> >
> > This is very expensive for any non-trivial number of nodes where you
> > need to find a node's ancestors.
> >
> > But if you treat the MaterializedPath as an intermediate result, it is
> > a trival to use it to build a denormalized EmployeeAncestor table,
> > which for the set of data above would be as follows
> >
> > EmployeeId AncestorEmployeeId
> > 1                 1
> > 2                 1
> > 2                 2
> > 3                 1
> > 3                 2
> > 4                 1
> > 4                 2
> > 4                 4
> >
> > I.e. there is one record for each of each node's ancestors (including
> > the node itself)
> >
> > This table is easy to maintain using the same trigger tha maintains the
> > MaterializedPath (although not so easy if you did not maintain the
> > MaterializedPath itself)
> >
> > The advantage is that it becomes very much cheaper to calculate the
> > ancestors of a node:
> >
> > select
> > Ancestor.EmployeeId
> > from
> > EmployeeAncestor Descendant
> > inner join Employee Ancestor on Ancestor.EmployeeId =
> > Descendant.AncestorEmployeeId
> > where Descendant.EmployeeId = 4
> >
> > The query to find descendant is similar:
> >
> > select
> > Descendant.EmployeeId
> > from
> > Employee Ancestor
> > inner join EmployeeAncestor Descendant on Descendant.AncestorEmployeeId
> > = Ancestor.EmployeeId
> > where Ancestor.EmployeeId = 1
> >
> > So it seems to me that the materialized path is not actually a good
> > thing to use to find ancestor/descendant relationships of itself, but
> > by using it to create a simple ancestor node-type table, you can get
> > slightly better performance for finding descendants, and vastly better
> > performance results for ancestors.
> >
> > Any comments?
> >
>
> I'd like to see this compared to the nested sets technique.

If your tree has a lot of inserts the nested sets left and right
numbers can be expensive to maintain as you can end up having to
re-walk the tree. For inserts, under this approach there is always
NodeDepth rows inserted into the Ancestors table, which is extremely
scaleable

In fact this is not really that related to the materialized path: in
order to build the table, you need to walk up the tree to the
ancestors, and the materialized path is really only a shortcut to find
the ancestors. You could also get the Ancestors using a trigger and a
cursor.

Using this table is very fast for querying, and for inserts the
overhead is also small.

