Re: how to suppress carefully a recursive tree

From: Jan Hidders <hidders_at_gmail.com>
Date: Tue, 22 Jan 2008 06:52:27 -0800 (PST)
Message-ID: <7411b3c1-cb63-40c8-8f1b-533a7ac0407f_at_x69g2000hsx.googlegroups.com>


On 22 jan, 12:04, fj <francois.j..._at_irsn.fr> wrote:
> I know how to suppress a normal tree but I meet the following kind of
> situation :

I'm guessing that when you say "suppress" you mean "represent in a database". Correct?

>
> r1
>     -> b1
>         -> b2 -> ...
>         -> b3 -> ...
>     -> b2
>         -> b1 -> ...
>         -> b1 -> ...
>     -> b3
>         -> b4
>
> r2
>     -> b3 -> ...

So, a directed graph with multiple roots. Correct? So basically you want to store arbitrary directed graphs.

> A same node can be referenced at several places. Each node is
> associated to a storage count :
>
> r1(0)  b1(3)  b2(2)  b3(3)  b4(1)  r2(0)

Which would correspond to the number of incoming edges, yes?

> In a normal tree without recursion (in the example above, recursion
> occurs because b1 contains b2 and vice versa), a node is destroyed
> when its count storage is equal to zero else its count storage is
> simply decremented.
>
> What algorithm should be applied ? I want for instance to cleanup r1
> but, of course, r2 must remain valid (=> b3 and b4 are not destroyed
> during the process and their storage count must be b3(1) b4(1)).
>
> Notice that the deletion of of a tree must be possible even if the
> count storage of the root is not equal to zero :
>    r1 -> b1 -> b2 -> r1 -> ...

It all depends a bit on how large your typical graphs are, how long on average the simple paths, what type of operations and queries you want to do on it and how often. My first guess for the representation would a simple straightforward adjacency list representation, (a binary relation that contains all the edges) and if it's not too big and your paths are often long it might be interesting to maintain an extra table with the transitive closure of the graph.

If you go for the adjacency list approach, make sure that you do as much as possible in one SQL statement when you start following the edges. So look up all nodes that are reachable in one step in one statement, update those, and store the ones that have to be deleted in a temporary table. Then again with one statement look up those that can be reached from in one step from the nodes in the temporary table. Et cetera.

Does that help?

PS. Expect a shameless but informative plug by Joe Celko for one of his books. :-)

  • Jan Hidders
Received on Tue Jan 22 2008 - 15:52:27 CET

Original text of this message