Re: Hashing for DISTINCT or GROUP BY in SQL

From: Clifford Heath <no_at_spam.please.net>
Date: Wed, 13 Oct 2010 18:07:53 +1000
Message-ID: <4cb4dc1b$0$316$afc38c87_at_news.optusnet.com.au>

-CELKO- wrote:
> But two different values can have the same
> hash for any one hashing function.

So use the hash to reduce the number of key values you have to compare normally.

> Does there exist a set of hashing functions, H1(), H2(), .., Hn()
> which will produce at least one different result for any pair of data
> values?

A set of such hash functions, if you concatenate their values, would be just one hash function. Unless the result contains all the entropy that was in the parameters, you'll get collisions. If it does contain all the entropy, there's no advantage in hashing.

I'm sure I've seen query plans back as far as MS SQL Server 2000 that use a hash for grouping/distinct, but my memory may have failed me.

Clifford Heath. Received on Wed Oct 13 2010 - 10:07:53 CEST

This message: [ Message body ]
Next message: -CELKO-: "Re: Hashing for DISTINCT or GROUP BY in SQL"
Previous message: Joe Thurbon: "Re: The Relational Model & Queries That Naturally Return Duplicate Rows"
In reply to: -CELKO-: "Hashing for DISTINCT or GROUP BY in SQL"
In reply to -CELKO-: "Hashing for DISTINCT or GROUP BY in SQL"
Next in thread: -CELKO-: "Re: Hashing for DISTINCT or GROUP BY in SQL"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message