Re: pointers on representing tree in db?
Date: 23 Apr 2001 11:00:20 +0100
Message-ID: <u7vgnwq6kr.fsf_at_sol6.ebi.ac.uk>
Lennart> Each row in the Datatable represent a treenode. parent_id is a Lennart> reference to another row which acts as a parent for the current Lennart> Lennart> one.
>>
>> why ? this information is already in the Ancestor table. So what is the
>> difference between Data.parent_id and Ancestor.ancestor_id ? Is this a
>> deliberate denormalization or a typo ?
>>
Lennart> Another aproach would be to remove the id - parentid row from Lennart> Ancestor, thus keeping all ancestors but parent there.
Lennart> This would however make queries like "look for all Data that Lennart> fulfills condition x in my subtree" more complicated.
yes, certainly.
Lennart> Major drawback is that the ancestor table grows quite fast for large Lennart> trees. I grows big since adding a node at level x means adding Lennart> (x-1) rows in the ancestor table.
yes, now I see.
Lennart> Whether one can minimize it or not is an open issue.
well, not so open: Celko's nesting set representation does really minimize
this. The trick is that you use the ordering of the natural numbers to
describe subtrees implicitly, rather than explicitly using a complete
explicit description using the ancestors tables.
To me it is now clear that the solution you presented is really very wrong
(sorry to be so adamant ...). Celko's nested set representation (with the
addition of direct parent_id, which in a sense is a denormalization) gives
you this and more (ordering of the direct children, easy detection of leaf
nodes) for far less.
Cheers,
Philip
-- Why don't Alice and Bob simply get married? ----------------------------------------------------------------------------- Philip Lijnzaad, lijnzaad_at_ebi.ac.uk \ European Bioinformatics Institute,rm A2-08 +44 (0)1223 49 4639 / Wellcome Trust Genome Campus, Hinxton +44 (0)1223 49 4468 (fax) \ Cambridgeshire CB10 1SD, GREAT BRITAIN PGP fingerprint: E1 03 BF 80 94 61 B6 FC 50 3D 1F 64 40 75 FB 53Received on Mon Apr 23 2001 - 12:00:20 CEST