Re: Hashing for DISTINCT or GROUP BY in SQL

From: Razvan Socol <rsocol_at_gmail.com>
Date: Sun, 17 Oct 2010 00:38:59 -0700 (PDT)
Message-ID: <01e627fb-29a0-48a5-b50d-d2858134b5aa_at_p26g2000yqb.googlegroups.com>


Hello, Joe

After reading http://en.wikipedia.org/wiki/Dynamic_perfect_hashing, I think I understand what you mean. The first idea is that not all final hashes need to have the same size. More precisely, we use the second hash function only if the first hash function provides the same result for two values. The second hash function needs to be a perfect hash function which considers the entire set of possible values for the input data, which means that the hash value may be very large. However, the second hash function won't be used very often, so the hash table is still smaller than the entire input.

Razvan Received on Sun Oct 17 2010 - 09:38:59 CEST

Original text of this message