Re: B+ Tree and its concept

From: paul c <toledobythesea_at_oohay.ac>
Date: Mon, 09 Oct 2006 03:20:57 GMT
Message-ID: <tejWg.110470$1T2.41089_at_pd7urf2no>


accpactec_at_hotmail.com wrote:
> Hi Everyone,
> Here is a question that I came up while reading up on b+ trees. For
> some reason I am having a problem understanding some of the terms and
> the mechanical part of the tree.
>
> So from what I understand, a b+ tree is a representation of the index
> that was created from the data (data records). The tree has a root
> followed by several levels, each made up of NODES, and then the leaves
> (are leaves also called nodes?). The data stored in a leaf is known as
> a DATA ENTRY, which includes a search key and a RID (memory page #,
> slot #). The leaves are basically the last level of the tree, and they
> point to the actual data. An INDEX ENTRY is a combination <data, pointer>.
> A NODE is a combination of one or more index entries.
>
> If the data records are in the same sorted order as the index entries,
> the b+ tree is clustered, which means fewer I/Os to find and retrieve
> data records.
>
> Each node on the b+ tree including the root is at least 50% full.
> (order of the tree)
>
> The b+ tree representation of the index is stored in a file.
>
> The internal nodes of a b+tree have n data and n+1 pointers.
>
> So in terms of I/O, since the b+ tree is stored in a file, if we want
> to access multiple data records, are we going to have multiple reads of
> the b+ tree file, or is it that once you read the file, everything is
> held in memory, thereby reducing the number of I/Os?
>
> also how is the order determined?
>

At the risk of not understanding the sophistication of your question, let me give the layman's view. You may already know this, but maybe some other reader does not. The point of a btree-style index is basically divide and conquer. When a book has an index, one goes to the back pages and guesses which index page has words that start with the same letter as the desired word. Usually one does not find the word straight away and does some visual linear searching to find one or more page numbers, either because many words that start with that letter happen to be on the same page or because there are many pages with that word in them. If no word appears twice in the book, the index would be called a unique index. (If the same word appeared twice on the same page, the index would be called sparse if it mentioned that page number only once; otherwise it would be called dense.)

An analogy to a btree might be a two-level index. Really, that would be two indexes, one of which would contain only twenty-six entries, one for each letter of the alphabet. Obviously, this index could fit in one page. (If we were talking about a real btree, the order would be pre-ordained at 26; if that page were mapped to computer memory, it would not take up much space, and the db could decide to always keep it in memory and never let the OS page it out, making it a so-called hotspot.) Beside each letter, it would contain the starting page of a second index that might be like the one in the previous paragraph. One could apply this til the cows come home, e.g., with lower levels, i.e., other indexes that might contain the first two letters of each word, and so forth.
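To make the two-level analogy concrete, here is a minimal Python sketch of it. Everything here is made up for illustration (the dict names, the words, the page numbers); a real btree would store slots in fixed pages on disk, not in dicts.

```python
# Hypothetical two-level "book index": a tiny top-level index of first
# letters (the in-memory "hotspot" page), each entry pointing to a
# second-level index page that maps full words to book page numbers.

top_level = {            # one entry per letter; small enough to pin in memory
    "a": "page_a", "b": "page_b", "c": "page_c",   # ... and so on to "z"
}

second_level = {         # each "page" maps words to the pages they appear on
    "page_a": {"apple": [12, 47], "axiom": [3]},
    "page_b": {"btree": [1, 99]},
    "page_c": {"cache": [55]},
}

def lookup(word):
    """Two probes: one in the top index, one in a second-level page."""
    page_id = top_level.get(word[0])
    if page_id is None:
        return []
    return second_level.get(page_id, {}).get(word, [])

print(lookup("btree"))   # [1, 99]
```

The point of the shape is that only the second probe ever needs to touch disk; the 26-entry top level stays resident.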

Also, a btree does not have to be 50% full in each page. The 50% figure only comes about because it is easy to program a check of the two adjacent pages when a deletion takes an index page below 50% full. Some implementations look at pages that are adjacent to the immediately adjacent pages to get storage efficiencies of 67% or 75% or more.
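That delete-time check can be sketched in a few lines of Python. This is not any particular implementation, just the idea: when a page drops below half full, see whether its slots plus a sibling's slots would fit in one page. The ORDER constant is an assumed maximum slot count.

```python
ORDER = 8  # hypothetical maximum number of slots per index page

def try_merge(page, sibling):
    """Merge two sibling pages into one if their combined slots fit."""
    if len(page) + len(sibling) <= ORDER:
        return sorted(page + sibling)   # one page now replaces two
    return None                         # cannot merge; redistribute instead

left = [10, 20, 30]            # below 50% of ORDER after deletions
right = [40, 50]
print(try_merge(left, right))  # [10, 20, 30, 40, 50]
```

An implementation chasing 67% or 75% occupancy would simply widen this check to more than two candidate pages before giving up.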

The storage angle with btrees is that usually even a simple cache manager will know enough to keep the frequently-used index pages in storage. This reduces IO for the same reason that the index of a printed book is usually a small fraction of the size of the rest of the book.

I believe that, in general, order refers to the maximum number of slots in an index page, although it would seem reasonable to refer to an average, such as 50% (though this would vary between 50 and 100% depending on what choices were made when the data was originally populated, as well as the update pattern, if any).

If the btree index pages have fixed-size slots and a high order, say more than 30 or so, a binary search might be applied to search them quickly. But lots of disk-based indexes allow variable-sized slots and do a linear search on each page. If you think about a book index, this may still be orders of magnitude faster than reading the whole book to find what you want.
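The two within-page search strategies mentioned above look like this in a small Python sketch (the page contents are invented; `bisect` is just the standard-library binary search):

```python
import bisect

# Searching within one index page. With fixed-size slots the keys are
# directly addressable, so binary search works; with variable-sized
# slots one typically just scans linearly.

page_keys = ["apple", "btree", "cache", "index", "query", "table"]

def binary_find(keys, target):
    """Binary search; requires keys sorted and directly addressable."""
    i = bisect.bisect_left(keys, target)
    return i if i < len(keys) and keys[i] == target else -1

def linear_find(keys, target):
    """Linear scan; works even when slots are variable-sized."""
    for i, k in enumerate(keys):
        if k == target:
            return i
    return -1

print(binary_find(page_keys, "index"))  # 3
print(linear_find(page_keys, "index"))  # 3
```

Either way, the per-page cost is tiny next to the I/O saved by not reading the whole "book".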

I believe that most people think of the b as meaning balanced, even though one of the inventors happened to be named Bayer. Either way, when it comes to cutting out expensive IOs, the btree is a darned good aspirin, and he and McCreight did us all a favour by not patenting it. One of its big advantages is the update step where it checks adjacent pages to see if two of them can be combined into one index page.

Slots do not necessarily need to contain so-called key values. Some higher-level index pages contain just fixed-length pointers to lower-level index pages. Sometimes people take the + sign to mean that the lowest-level index pages hold not just fixed-length pointers but index values as well as pointers to actual data rows. I think this is what the IBM VSAM index method did; it was one of the early adopters of the btree idea, as was a Wang access method, which I thought was superior to VSAM because it had transaction support (sorry, I forget the name).
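The internal-versus-leaf distinction (n keys with n+1 child pointers inside the tree, key-plus-RID entries at the bottom) can be sketched as two tiny Python classes. The class and field names are my own invention, not any product's:

```python
# Hypothetical node shapes: internal pages route searches downward;
# leaf ("+") pages pair each key with a pointer (RID) to a data row.

class InternalNode:
    def __init__(self, keys, children):
        assert len(children) == len(keys) + 1   # n keys, n+1 pointers
        self.keys, self.children = keys, children

    def child_for(self, key):
        """Pick the child whose key range covers `key`."""
        for i, k in enumerate(self.keys):
            if key < k:
                return self.children[i]
        return self.children[-1]

class LeafNode:
    def __init__(self, entries):
        self.entries = entries   # list of (key, rid) pairs

leaf = LeafNode([(5, ("page7", 2)), (9, ("page3", 0))])
root = InternalNode([10], ["leaf0", "leaf1"])
print(root.child_for(5))    # 'leaf0'
print(root.child_for(10))   # 'leaf1'
```

Note that only the leaf entries carry RIDs; the internal slots exist purely to steer the search.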

That is all I can think of for now. Sorry if this does not help answer your question, but it was good exercise for me!

magoo Received on Mon Oct 09 2006 - 05:20:57 CEST
