Re: Using Materialized Path to create a paths table

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Thu, 25 May 2006 14:24:09 GMT
Message-ID: <d6jdg.12779$A26.303050_at_ursa-nb00s0.nbnet.nb.ca>


Matthew Brealey wrote:

> Bob Badour wrote:
> 

>>Matthew Brealey wrote:
>>
>>>I wonder if anyone has any thoughts on using the materialized path as
>>>an intermediate step to create a kind of paths table to represent a
>>>tree hierarchy
>>>
>>>Here are some records
>>>
>>>EmployeeId ParentEmployeeId MaterializedPath
>>>1 <null> 1.
>>>2 1 1.1.
>>>3 1 1.2.
>>>4 2 1.1.1
>>>
>>>etc.
>>>
>>>To find the descendants of a node using the materialized path is quite
>>>low-cost: an index on MaterializedPath would be usable with the
>>>following query
>>>
>>>select
>>>Descendant.EmployeeId
>>>from Employee as Ancestor
>>>inner join Employee as Descendant on Descendant.MaterializedPath like
>>>Ancestor.MaterializedPath + '%'
>>>where Ancestor.EmployeeId = 1
>>>
>>>To find the ancestors however with the following query
>>>
>>>select
>>>Ancestor.EmployeeId
>>>from Employee as Descendant
>>>inner join Employee as Ancestor on Descendant.MaterializedPath like
>>>Ancestor.MaterializedPath + '%'
>>>where Descendant.EmployeeId = 4
>>>
>>>requires a table scan.
>>
>>Shouldn't the above be + '.%' in both cases?
> 
> 
> Not if you include a '.' on the end of your materialized path (as I
> did, although omitted accidentally for the last example)
> 
> 

>>>This is very expensive for any non-trivial number of nodes where you
>>>need to find a node's ancestors.
>>>
>>>But if you treat the MaterializedPath as an intermediate result, it is
>>>a trival to use it to build a denormalized EmployeeAncestor table,
>>>which for the set of data above would be as follows
>>>
>>>EmployeeId AncestorEmployeeId
>>>1 1
>>>2 1
>>>2 2
>>>3 1
>>>3 2
>>>4 1
>>>4 2
>>>4 4
>>
>>Why would you make the intermediate kludge when the join table is the
>>direct representation of what you want in the first place?
>>
>>
>>
>>>I.e. there is one record for each of each node's ancestors (including
>>>the node itself)
>>>
>>>This table is easy to maintain using the same trigger tha maintains the
>>>MaterializedPath (although not so easy if you did not maintain the
>>>MaterializedPath itself)
>>
>>I suggest any time you hear your self thinking "using a trigger" you
>>should give your head a shake and ask whether a triggered procedure is
>>really necessary.
> 
> 
> Well, if you just use an adjacency list (which requires no trigger),
> anything other than direct children/parents for a tree requires
> recursion.
> 
> So if you often need to find all ancestors or descendants, then you
> will pay a big performance penalty.

Well then, I suggest the problem--from the perspective of the comp.databases.theory perspective--is your use of a product that lacks recursion.

>>>The advantage is that it becomes very much cheaper to calculate the
>>>ancestors of a node:
>>>
>>>select
>>>Ancestor.EmployeeId
>>>from
>>>EmployeeAncestor Descendant
>>>inner join Employee Ancestor on Ancestor.EmployeeId =
>>>Descendant.AncestorEmployeeId
>>>where Descendant.EmployeeId = 4
>>
>>Much cheaper than what? Than the kludge? In that case, one wonders why
>>anyone would introduce the kludge in the first place.
>
> How else will you find the ancestors of a node in a tree in SQL?

If you are only interested in SQL, why would you post your question in comp.databases.theory?

>>>So it seems to me that the materialized path is not actually a good
>>>thing
>>
>>Indeed.
>
> Do please let me know a better SQL model for trees.

Ibid. Received on Thu May 25 2006 - 16:24:09 CEST

Original text of this message