Re: Hashing for DISTINCT or GROUP BY in SQL
From: Razvan Socol <rsocol_at_gmail.com>
Date: Sun, 17 Oct 2010 00:38:59 -0700 (PDT)
Message-ID: <01e627fb-29a0-48a5-b50d-d2858134b5aa_at_p26g2000yqb.googlegroups.com>
Date: Sun, 17 Oct 2010 00:38:59 -0700 (PDT)
Message-ID: <01e627fb-29a0-48a5-b50d-d2858134b5aa_at_p26g2000yqb.googlegroups.com>
Hello, Joe
After reading http://en.wikipedia.org/wiki/Dynamic_perfect_hashing, I think I understand what you mean. The first idea is that not all final hashes need to have the same size. More precisely, we use the second hash function only if the first hash function provides the same result for two values. The second hash function needs to be a perfect hash function which considers the entire set of possible values for the input data, which means that the hash value may be very large. However, the second hash function won't be used very often, so the hash table is still smaller than the entire input.
Razvan Received on Sun Oct 17 2010 - 09:38:59 CEST