Re: Using Materialized Path to create a paths table

From: Matthew Brealey <thelawnet_at_gmail.com>
Date: 25 May 2006 07:42:25 -0700
Message-ID: <1148568145.368709.286460_at_i39g2000cwa.googlegroups.com>


Bob Badour wrote:
> Matthew Brealey wrote:
>
> > Bob Badour wrote:
> >
> >>Matthew Brealey wrote:
> >>
> >>>I wonder if anyone has any thoughts on using the materialized path as
> >>>an intermediate step to create a kind of paths table to represent a
> >>>tree hierarchy
> >>>
> >>>Here are some records
> >>>
> >>>EmployeeId ParentEmployeeId MaterializedPath
> >>>1 <null> 1.
> >>>2 1 1.1.
> >>>3 1 1.2.
> >>>4 2 1.1.1
> >>>
> >>>etc.
> >>>
> >>>To find the descendants of a node using the materialized path is quite
> >>>low-cost: an index on MaterializedPath would be usable with the
> >>>following query
> >>>
> >>>select
> >>>Descendant.EmployeeId
> >>>from Employee as Ancestor
> >>>inner join Employee as Descendant on Descendant.MaterializedPath like
> >>>Ancestor.MaterializedPath + '%'
> >>>where Ancestor.EmployeeId = 1
> >>>
> >>>To find the ancestors however with the following query
> >>>
> >>>select
> >>>Ancestor.EmployeeId
> >>>from Employee as Descendant
> >>>inner join Employee as Ancestor on Descendant.MaterializedPath like
> >>>Ancestor.MaterializedPath + '%'
> >>>where Descendant.EmployeeId = 4
> >>>
> >>>requires a table scan.
> >>
> >>Shouldn't the above be + '.%' in both cases?
> >
> >
> > Not if you include a '.' on the end of your materialized path (as I
> > did, although omitted accidentally for the last example)
> >
> >
> >>>This is very expensive for any non-trivial number of nodes where you
> >>>need to find a node's ancestors.
> >>>
> >>>But if you treat the MaterializedPath as an intermediate result, it is
> >>>a trival to use it to build a denormalized EmployeeAncestor table,
> >>>which for the set of data above would be as follows
> >>>
> >>>EmployeeId AncestorEmployeeId
> >>>1 1
> >>>2 1
> >>>2 2
> >>>3 1
> >>>3 2
> >>>4 1
> >>>4 2
> >>>4 4
> >>
> >>Why would you make the intermediate kludge when the join table is the
> >>direct representation of what you want in the first place?
> >>
> >>
> >>
> >>>I.e. there is one record for each of each node's ancestors (including
> >>>the node itself)
> >>>
> >>>This table is easy to maintain using the same trigger tha maintains the
> >>>MaterializedPath (although not so easy if you did not maintain the
> >>>MaterializedPath itself)
> >>
> >>I suggest any time you hear your self thinking "using a trigger" you
> >>should give your head a shake and ask whether a triggered procedure is
> >>really necessary.
> >
> >
> > Well, if you just use an adjacency list (which requires no trigger),
> > anything other than direct children/parents for a tree requires
> > recursion.
> >
> > So if you often need to find all ancestors or descendants, then you
> > will pay a big performance penalty.
>
> Well then, I suggest the problem--from the perspective of the
> comp.databases.theory perspective--is your use of a product that lacks
> recursion.
>
>
> >>>The advantage is that it becomes very much cheaper to calculate the
> >>>ancestors of a node:
> >>>
> >>>select
> >>>Ancestor.EmployeeId
> >>>from
> >>>EmployeeAncestor Descendant
> >>>inner join Employee Ancestor on Ancestor.EmployeeId =
> >>>Descendant.AncestorEmployeeId
> >>>where Descendant.EmployeeId = 4
> >>
> >>Much cheaper than what? Than the kludge? In that case, one wonders why
> >>anyone would introduce the kludge in the first place.
> >
> > How else will you find the ancestors of a node in a tree in SQL?
>
> If you are only interested in SQL, why would you post your question in
> comp.databases.theory?

I do beg your pardon. However, this subject (viz. trees in SQL without recursion) is one that has been discussed in comp.databases.theory on several hundred occasions before: is there a problem discussing the theory of how to solve this without using recursion? Received on Thu May 25 2006 - 16:42:25 CEST

Original text of this message