Re: B+ Trees

From: Clifford Heath <cjh_nospam_at_osa.com.au>
Date: Wed, 28 Nov 2001 13:31:17 +1100
Message-ID: <3C044C75.9228155A_at_osa.com.au>


A Q wrote:
> I need to know how to maintain the linked list of the leaves in a B+ Tree.

You've described a threaded B+tree. Threading is supported by most of the B+Tree implementations I've come across. In fact it's the main reason for using B+Trees instead of BTrees, because in a B+Tree all key values in internal nodes are duplicated in the leaf node, so traversing the leaves is enough to visit all keys.

> But updating such a field, as new keys are added into the B+ Tree file, is
> almost impossible (at least seems to me). Because each time new record(s)
> is/are created, I may need to update (it appears) a random number of such
> fields in different records.

No. A new leaf is always created by splitting exactly one other leaf node. You simply need to make the new pair point to each other, and ensure that the old sibling nodes with the forward and backward pointers to the old node now point to the two new nodes. These nodes might be in completely different subtrees, but you have their addresses because you had threads pointing to them, right? You do need to thread the tree in both directions to be able to do this though.

It's preferable if you do allocate two new nodes on a split, rather than moving half the keys out of an old node and re-ordering it, because then you can write the new nodes before you write the parent node, and the update is atomic (wrt node-writes). That is, the new nodes aren't part of the tree on disk until the parent node gets written. Of course, the threads in the two sibling nodes spoil all that, but at least they're easily recoverable after a crash.

--
Clifford Heath, ManageSoft Corporation
Received on Wed Nov 28 2001 - 03:31:17 CET

Original text of this message