# Re: how to suppress carefully a recursive tree

Date: Tue, 22 Jan 2008 08:10:35 -0800 (PST)

Message-ID: <dbbf044e-fdbb-4a96-9a52-7c1682da2bf9_at_v4g2000hsf.googlegroups.com>

On 22 jan, 15:52, Jan Hidders <hidd..._at_gmail.com> wrote:

> On 22 jan, 12:04, fj <francois.j..._at_irsn.fr> wrote:

*>
**> > I know how to suppress a normal tree but I meet the following kind of
**> > situation :
**>
**> I'm guessing that when you say "suppress" you mean "represent in a
**> database". Correct?
*

No : I want to destroy, remove, kill ... a part of the data (a complete tree or just a branch), but without destroying data shared by other trees or branches.

*>
**>
**>
**> > r1
**> > -> b1
*

> > -> b2 -> ...

*> > -> b3 -> ...
**> > -> b2
**> > -> b1 -> ...
**> > -> b1 -> ...
**> > -> b3
**> > -> b4
**>
**> > r2
**> > -> b3 -> ...
**>
**> So, a directed graph with multiple roots. Correct? So basically you
**> want to store arbitrary directed graphs.
*

Yes and no. The two "roots" r1 r2 could perhaps belong to a bigger tree.

*>
*

> > A same node can be referenced at several places. Each node is

*> > associated to a storage count :
**>
**> > r1(0) b1(3) b2(2) b3(3) b4(1) r2(0)
**>
**> Which would correspond to the number of incoming edges, yes?
*

Yes

*>
*

> > In a normal tree without recursion (in the example above, recursion

*> > occurs because b1 contains b2 and vice versa), a node is destroyed
**> > when its count storage is equal to zero else its count storage is
**> > simply decremented.
**>
**> > What algorithm should be applied ? I want for instance to cleanup r1
**> > but, of course, r2 must remain valid (=> b3 and b4 are not destroyed
**> > during the process and their storage count must be b3(1) b4(1)).
**>
**> > Notice that the deletion of of a tree must be possible even if the
**> > count storage of the root is not equal to zero :
**> > r1 -> b1 -> b2 -> r1 -> ...
**>
**> It all depends a bit on how large your typical graphs are, how long on
**> average the simple paths, what type of operations and queries you want
**> to do on it and how often. My first guess for the representation
**> would a simple straightforward adjacency list representation, (a
**> binary relation that contains all the edges) and if it's not too big
**> and your paths are often long it might be interesting to maintain an
**> extra table with the transitive closure of the graph.
*

The graph may be quite large. Number of vertices (nodes) : usually 100000, sometimes much more (a very big computation may lead to about 100 millions). This corresponds to a 3D meshing, each mesh (a particular node) containing information about chemical composition (a sub-node), temperature, fluid characteristics (another sub-node) ...

Let us precise that simple paths are always short when one excludes recursive points (a maximum of 10 nodes).

> If you go for the adjacency list approach, make sure that you do as

*> much as possible in one SQL statement when you start following the
**> edges. So look up all nodes that are reachable in one step in one
**> statement, update those, and store the ones that have to be deleted in
**> a temporary table. Then again with one statement look up those that
**> can be reached from in one step from the nodes in the temporary table.
**> Et cetera.
*

A short cut would be to store the external counts in the nodes themselves but, unfortunately, this is (more or less) forbidden : the data base needs to work in a // environment (openMP) and two simultaneous calls to the deletion routine with two trees sharing data could lead to crashing the application. Anyway, this solution will be adopted if I don't find a better algorithm (the deletion routine will be protected by a semaphore blocking other threads wanting to destroy something).

I just wanted to know if a better algorithm was available. The problem is more or less connected to garbage collector techniques. Python language should have the same trouble when trying to destroy a dictionary (variables of a dictionary may belong to another dictionary).

*>
*

> Does that help?

*>
**> PS. Expect a shameless but informative plug by Joe Celko for one of
**> his books. :-)
**>
**> -- Jan Hidders
*

Received on Tue Jan 22 2008 - 17:10:35 CET