Re: cyclical redundancy checksum algorithm(s)?

From: Marshall <marshall.spight_at_gmail.com>
Date: 27 Sep 2006 16:40:11 -0700
Message-ID: <1159400411.475689.122570_at_i42g2000cwa.googlegroups.com>


Karen Hill wrote:
> Gene Wirchenko wrote:
>
> > >Yet what happens if there is a collision of the checksum for a row?
> >
> > Then you get told that no change has occurred when one has. I
> > would call this an error.
>
> That's exactly what I thought when I read that in his book. I was
> thinking back to the sha1 and md5 algorithms, maybe a special crc
> algorithm is safe from this.

Because of the Pigeonhole Principle, it is not possible that some careful selection of hash algorithm will be "safe" from collisions. Collisions are always possible (assuming the input size is larger than the output size, which in practice it always is, because hashing would be pointless otherwise.) One addresses the risk of collisions probabalistically.

Although sha1 and md5 are currently a la mode, they are specifically cryptographic hashes, and as a result they have been chosen for properties that you don't care about if all you want is a regular hash. CRC is probably a fine choice; it's popular and well-understood.

The way one deals with the possibility of hash collisions is to choose an appropriate hash size for the task at hand.

In general, though, I have to agree with Gene; I don't see that this is a fruitful avenue to pursue unless there are some *very* specific circumstances that warrant it.

Marshall Received on Thu Sep 28 2006 - 01:40:11 CEST

Original text of this message