Re: matrix encoding IS adjacency list

From: VC <boston103_at_hotmail.com>
Date: Mon, 19 Sep 2005 22:50:45 -0400
Message-ID: <2aydnbeuzNmY5LLeRVn-iA_at_comcast.com>


"Vadim Tropashko" <vadimtro_invalid_at_yahoo.com> wrote in message news:1127179828.730374.139640_at_g14g2000cwa.googlegroups.com...
> VC wrote:
>> OK, at first you talked about an adjacency list, but no matter. Let
>> a,b,c,d,e be some nodes. Then an adjacency list for some DAG/tree might
>> be
>> a set of pairs:
>>
>> {(a, {b,c}), (b, {d,e}) or, equivalently, {(a,b), (a,c), (b,d), (b,e)}
>>
>> In the RM, an adjacency list is usually emulated by a (child, parent,
>> data) relation with the 'child' attribute being the primary key. One can
>> enforce a referential integrity constraint if need be, but that's
>> although
>> desirable is not required in order to express a hypothetical tree
>> topology.
>
> This terminology is blamed on Celko:-) Seriously, we are talking about
> the relation analogous to adjacency matrix.

Please explain.

>
>> >Where do you see this constraint
>> > with materialized path encoding? In other words, given parent node
>> > materialized path encoding, how do I find all the (immediate) children?
>> > Trivial with adjacency list.
>>
>> It's as trivial with a materialized path: in the above example in order
>> to
>> find all node a.b 's children, just say 'find tuples where
>> prefix(materialized_path) = a.b'.
>
> Nope. You query looks for descendants, not (immediate) children. You
> can say: "aha, just add a postfilter", or in other words find all the
> descendants, that are on the next level, but that is not guaranteed to
> be efficient.

Sorry, yes, that's what I meant.

>For the root node the descendants index scan would return
> the whole hierarchy, and then you'd have to filter out almost all of
> them.
>
>> > Referring to the problem of (a11,a21) being not unique, I actually
>> > figured it out. There are exactly 2 nodes with the same (a11,a21). This
>> > follows from the equation
>> >
>> > a11*a22-a21*a12 = 1 or -1
>>
>> This actually follows from the fact that a rational number, as a
>> continued
>> fraction, has two encodings (if we are talking about the same thing).
>
> Yes
>
>> > It is possible to amend matrix encoding in such a way that determinant
>> > of [[a11,a12],[a21,a22]] matrix is always 1. Then, (a11,a21) is unique,
>> > and we can decare foreing key constraint. Matrix encoding IS adjacency
>> > list!
>>
>> Of course it is, m.e being a materialized path encoding. You seem to
>> imply
>> that the constraint makes finding, say, all the immediate children
>> somehow
>> easier compared to the constraint-less search. Could you please
>> elaborate ?
>
> Once again, your query should be written as
>
> 'find tuples where prefix(materialized_path) = a.b
> and length(materialized_path)=3'
>
> which is not the same as
>
> 'find tuples where parent_id = 5'
>

Well, nothing is for free. For certain queries, an m.p. encoding may cause performance issues that can be resolved by maintaining a separate/redundant parent attribute, creating a function-based index, or using the real adjacency list encoding. There is no universal solution for traversing hierarchies, unfortunately (unless you want to use Oracle's proprietary 'connect by' or the standard recursive query (DB2/SQL Server 2005).

Still, you did not show how the referential constraint helps with finding immediate children in the matrix encoding.

I can see how you would use the parent pair of numbers, of course. Btw, the pair is redundant: it can be deduced from the child pair if disambiguated by a boolean value indicating which, out of two, rational encoding you want to use. Received on Tue Sep 20 2005 - 04:50:45 CEST

Original text of this message