Extendible Hashing question

From: David Stevenson <poboxdavid_at_hotmail.com>
Date: Mon, 03 Jun 2002 12:45:04 +1200
Message-ID: <3CFABC0F.46A4389D_at_hotmail.com>



I'm not sure whether this is the right newsgroup to be asking this question, but I hope it is close enough...

Recently I implemented an extendible hashing file, and my testing indicates that it works perfectly well.

However when I checked back to an old university text book ("File Structures: An Object-Oriented Approach with C++", Folk, Zoellick, Riccardi), I noticed that they were using a technique where by they reverse the bits of the hash value when making an address to index into the directory.

For example, if you have a 1 byte hash value containing 0100 1011, and you want to index into a directory with depth of 3 bits, you would use 110 (the least significant three bits in reverse order).

However, I don't see why reversing the bits should make a difference - it seems to me to be an unnecessary complication. In my implementation I simply start by using the least significant bit to represent two buckets addresses, one containing hash values with 0 in the least significant bit, and another containing hash values with 1 in the least significant bit. As I need more address space, I use a the next higher order bit.

So if I am using 0100 1011 to index into a directory with depth of 3 bits, I simply throw away the unneeded higher bits (0100 1), leaving me with 011. When I need to use an extra bit, I split the bucket containing the 011's into two buckets: one containing 0011's, and the other containing 1011's. As I know all the hash values in the original bucket contained 011 in the lowest significant bits, I can be sure that each record will go into either the 0011 bucket or the 1011 bucket.

Essentially the only difference I can see by reversing the bits is that instead of adding extra bits to the left (as I do in my implementation), they need to add bits on to the right. However reversing the bits to make addresses seems to me to be overkill - is there any reason why you can not just throw away the unneeded higher bits?

Please tell me if I seem to be missing something!

Thanks,

David Received on Mon Jun 03 2002 - 02:45:04 CEST

Original text of this message