Re: cyclical redundancy checksum algorithm(s)?

From: Gene Wirchenko <genew_at_ocis.net>
Date: Wed, 27 Sep 2006 13:25:27 -0700
Message-ID: <s9nlh25a9n2ik09c666tq9mt1u85vsnmov_at_4ax.com>


"Karen Hill" <karen_hill22_at_yahoo.com> wrote:

>I just finished reading one of Ralph Kimball's books. In it he
>mentions something called a cyclical redundancy checksum (crc)
>function. A crc function is a hash function that generates a checksum.
>
>I am wondering a few things. A crc function would be extremely useful
>and time saving in determining if a row needs to be updated or not (are
>the values the same, if yes don't update, if not update). In fact
>Ralph Kimball states that this is a way to check for changes. You just
>have an extra column for the crc checksum. When you go to update data,
>generate a crc checksum and compare it to the one in the crc column.
>If they are same, your data has not changed.
>
>Yet what happens if there is a collision of the checksum for a row?

     Then you get told that no change has occurred when one has. I would call this an error.
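
     To make that failure concrete, here is a rough sketch, not anything from Kimball's book. It assumes Python and its standard zlib.crc32 as the checksum; the names row_crc, looks_unchanged and find_colliding_rows, and the choice of separator, are made up for illustration. It searches random one-column rows until two different ones share a CRC-32, then shows the checksum-only test reporting "no change" for a row that did change.

```python
import random
import zlib


def row_crc(values):
    """Return the CRC-32 of a row's values.

    Assumption: the unit-separator character never occurs in the data,
    so joining on it keeps column boundaries unambiguous.
    """
    payload = "\x1f".join("" if v is None else str(v) for v in values)
    return zlib.crc32(payload.encode("utf-8"))


def looks_unchanged(stored_crc, new_values):
    """Checksum-only test: True means "skip the update"."""
    return row_crc(new_values) == stored_crc


def find_colliding_rows(seed=0):
    """Generate random one-column rows until two distinct rows share a CRC."""
    rng = random.Random(seed)
    seen = {}  # crc -> first row that produced it
    while True:
        row = (format(rng.getrandbits(64), "016x"),)
        crc = row_crc(row)
        other = seen.get(crc)
        if other is not None and other != row:
            return other, row
        seen[crc] = row


old_row, new_row = find_colliding_rows()
stored = row_crc(old_row)

# The row changed, yet the checksum-only test says it did not.
print(old_row != new_row, looks_unchanged(stored, new_row))  # True True
```

     With only 32 bits of checksum, a search like this typically needs on the order of a hundred thousand random rows before it finds a pair. Comparing the actual column values whenever the checksums match would close the gap, at the price of the very comparison the checksum was meant to avoid.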

>Ralph Kimball did not mention which algorithm to use, nor how to create
>a crc function that would not have collisions. He does have a PhD,
>and is a leader in the OLAP data warehouse world, so I assume there is a
>good solution.

     Your error. Having a Ph.D. does not stop someone from being wrong.

>Is there a crc function in postgresql? If not what algorithm would I
>need to use to create one in pl/pgsql?

     I think you are focusing on irrelevant minutiae. Is the performance really so bad that you have to go to odd lengths to improve it? If you think so, is this because you have actually tested it, or is it just a feeling? You could be setting yourself up for a lot of work that may be error-prone, or may just plain not work, all for very little gain.

Sincerely,

Gene Wirchenko

Computerese Irregular Verb Conjugation:

     I have preferences.
     You have biases.
     He/She has prejudices.
Received on Wed Sep 27 2006 - 22:25:27 CEST
