Re: matrix encoding IS adjacency list

From: Vadim Tropashko <vadimtro_invalid_at_yahoo.com>
Date: 20 Sep 2005 10:22:24 -0700
Message-ID: <1127236944.777515.52200_at_g47g2000cwa.googlegroups.com>


VC wrote:
> "Vadim Tropashko" <vadimtro_invalid_at_yahoo.com> wrote in message
> news:1127186069.165059.38590_at_g14g2000cwa.googlegroups.com...
> > VC wrote:
> >> "Vadim Tropashko" <vadimtro_invalid_at_yahoo.com> wrote in message
> >> news:1127179828.730374.139640_at_g14g2000cwa.googlegroups.com...
> >> > This terminology is blamed on Celko:-) Seriously, we are talking about
> >> > the relation analogous to adjacency matrix.
> >>
> >> Please explain.
> >
> > http://en.wikipedia.org/wiki/Adjacency_matrix
> >
> > ???

>

> Did you read the reference you've supplied ? The a.m. is just an NxN matrix
> for N nodes with a bunch of zeroes for non-adjacent vertices. The a.l. is an
> optimization if an a.m. is sparse, Both contain the same information. How
> does the definition you've referenced answer my request for clarifiication
> of "... we are talking about the relation analogous to adjacency matrix" ?

I'm not sure I understand what clarification do you want. The representation that you supplied

{(a, {b,c}), (b, {d,e}) }

is not even in 1NF, therefore, isn't worth discussing. I thought that it surfaced because of unfortunate term "adjacency list".

The adjacency relation is

table Edges (

    head integer,
    tail integer
)

For trees graph edges are joined with nodes so that the adjacency relation tree design is

table TreeNode (

    id integer,
    parent_id integer
)

> >> Well, nothing is for free. For certain queries, an m.p. encoding may
> >> cause
> >> performance issues that can be resolved by maintaining a
> >> separate/redundant
> >> parent attribute, creating a function-based index, or using the real
> >> adjacency list encoding.
> >
> > That's what the programmers in the field say: "Let's combine nested
> > intervals/sets with adjacency list to have the best of both worlds.
> > Little data redundancy would help speeding up some queries". It is not
> > obvious that for matrix encoding you gain exactly nothing this way.

>

> Do you mean that we gain something or that we gain nothing by adding the
> parent id (adjacency list) ?

For matrix encoding

table MatrixNode (

    a11 integer,
    a12 integer,
    a21 integer,

    a22 integer
)

adding extra columns this way

table MatrixNode (

    a11 integer,
    a12 integer,
    a21 integer,
    a22 integer,

    id integer,
    parent_id integer
)

or this way

table MatrixNode (

    a11 integer,
    a12 integer,
    a21 integer,
    a22 integer,
    parent_a11 integer,
    parent_a12 integer,

    parent_a21 integer,
    parent_a22 integer
)

is unnecessary.

> >> Still, you did not show how the referential constraint helps with
> >> finding
> >> immediate children in the matrix encoding.
> >
> > Find all the children of [[15,-2],[8,-1]]:
> >
> > select * from MatrixNode
> > where a12 = -15 and a22=-8
> >

>

> a. What does the example have to do with either having or not having a
> constraint ?
> b. That's what I wrote in my prev. message since (a12, a22) are just a
> parent id.

If we don't have the constraint, then, the only way to query node's children is

select * from MatrixNode
where parent(a11,a12,a21,a22)=[[15,-2],[8,-1]]

similar to m.p.

By "having the constraint" i didn't meant enforcing it. I meant discovering it.

> >> I can see how you would use the parent pair of numbers, of course. Btw,
> >> the pair is redundant: it can be deduced from the child pair if
> >> disambiguated by a boolean value indicating which, out of two, rational
> >> encoding you want to use.
> >
> > No need to disambiguate. (a11, a21) is a unique key, if matrices with
> > determinant 1 are used.

>

> The [ambiguous] materialized path is represented by (a11, a12) if I
> understand your encoding correctly. The parent id, (a12, a21) is used to
> disambiguate (a11,a12).

It could be either (a11, a12) or (a11, a21). By applying extended euclidean algorithm to eiter one we got m.p. -- they would be reverse of each other. There is an ambiguity if we use matrices that are products of atomic matrices

[n  1]
[    ]
[1  0]

Those atomic matrices have determinant -1 so that the determinant of matrix product alternates. If we use atomic matrices

[n+1  -1]
[       ]
[       ]
[1     0]

the determinant is 1 always, therefore, we don't have this ambiguity. This hinges on the fact that a solution for

a*y-b*x=1

in positive integers x,y such that x<a, y<b is unique.

As a concequence, we can have a different perspective: a node can be encoded with just two numbers. Adding two more [redundant] numbers referring to the node's parent, organizeses all four elements into a nice matrix structure.

>

> How is it better than specifying explicitely both the m.p. and the parent id
> ?

Less redundancy, and therefore, less complexity? Well, until recently, I had impression that matrix/continued fraction theory is indeed parallel to materialized path. Received on Tue Sep 20 2005 - 19:22:24 CEST

Original text of this message