Re: pointers on representing tree in db?

From: Philip Lijnzaad <lijnzaad_at_ebi.ac.uk>
Date: 20 Apr 2001 12:10:40 +0100
Message-ID: <u7zodbrfm7.fsf_at_sol6.ebi.ac.uk>


On Thu, 19 Apr 2001 22:06:16 +0200,
"Lennart" == Lennart Jonsson <lennart_at_kommunicera.umea.se> wrote:

>>>Table Data (id integer not null, parent_id references...) id is p.k
>>>Table Ancestor (id, ancestor_id) both id and ancestor_id references id in
>>>Data

>>
>> what is Data.parent_id for ?

Lennart> Each row in the Datatable represent a treenode. parent_id is a
Lennart> reference to another row which acts as a parent for the current
Lennart> one. 

why ? this information is already in the Ancestor table. So what is the difference between Data.parent_id and Ancestor.ancestor_id ? Is this a deliberate denormalization or a typo ?

The way you present it, make you think that Data is just data on the nodes per se, not on the tree topology.

Lennart> Thus the ancestor table can be asked for the subtree or the path Lennart> from a node to root

this depends a bit on how you define tree. As mentioned earlier, the above approach (termed 'adjacency list' by Celko) is very bad at dealing with trees as sets of nodes; all you get is one tree node, or perhaps a few nodes that have a fixed number (usually 1, and at great expense more) of levels between the parent and the children. Displaying stuff with this approach requires several roundtrips to the database, which is hardly elegant, and in the case of huge trees (we're dealing with trees of > 100,000 nodes, 10 or 12 levels deep), simply too slow.

Lennart> Major drawback is that the ancestor table Lennart> grows quite fast for large trees.

why ? How else can you store the tree topology in minimal space ? I don't think you can minimize this any more (even Celko's nested set approach takes an extra id)

Lennart> I havent been able to look at the solution bye Joe Celko yet, so I cant Lennart> comment on that

Please do; it's very nifty, and the news posting I reposted should be enough to get the thing working. (The detailed description in his book tells you all the delete and update statements using standard SQL queries that seem not very practical to me). Cheers,

                                                                      Philip
-- 
Why don't Alice and Bob simply get married?
-----------------------------------------------------------------------------
Philip Lijnzaad, lijnzaad_at_ebi.ac.uk \ European Bioinformatics Institute,rm A2-08
+44 (0)1223 49 4639                 / Wellcome Trust Genome Campus, Hinxton
+44 (0)1223 49 4468 (fax)           \ Cambridgeshire CB10 1SD,  GREAT BRITAIN
PGP fingerprint: E1 03 BF 80 94 61 B6 FC  50 3D 1F 64 40 75 FB 53
Received on Fri Apr 20 2001 - 13:10:40 CEST

Original text of this message