Re: Hash Code Based Identity of a Relational Table

From: Andy Sayer <andysayer_at_gmail.com>
Date: Tue, 5 Jun 2018 22:48:33 +0100
Message-ID: <CACj1VR5u7qAjW=B0ZRC=g7WhTYDU-L-tOaguZBZhNp5saUWF4w_at_mail.gmail.com>



Hi Jaromir,

There is certainly a benefit to being able to compare things in two different locations without sending the entire thing over the wire again. You might be very annoyed though, Oracle has already written something for this, it's not very well documented but it's there: dbms_sqlhash https://docs.oracle.com/cd/B19306_01/network.102/b14266/appendixb.htm#CHDGHCAE

You just give it a query which can just be a select * statement. It computes the hash of each row (by concatenating the columns together as a string with a space between each column) and then combines that for a hash of the complete row set at the very end. You need to be careful about row ordering and any NLS style parameters that will change the implicit conversion of any data type you have in your table.

Thanks,
Andrew

On 5 June 2018 at 22:34, Jaromir D.B.Nemec <jaromir_at_db-nemec.com> wrote:

> Hi All,
>
>
>
> I worked once on a project where a file had to be spooled from a database
> table and transferred on a remote host. I wanted to introduce a possibility
> of validating that the produced file has the same identical content as the
> source table and came across the idea of using hash codes (MD5) to achieve
> it.
>
> On a next occasion when there were two tables to be verified being
> identical I came back to this idea and I wrote a simple user defined
> aggregate function based on a MD5 hash code for that task.
>
> So basically the aggregate function MD5_XOR represented a state of the
> table that could be compared with other table.
>
>
>
> Example
>
> select MD5_XOR(to_char(ID)||COL_TXT|| to_char(COL_DATE,'dd.mm.yyyy
> hh24:mi:ss')) md5 from tab1
>
> MD5_XOR
>
> -------------------------------
>
> 173f1f8f85f1a154044b7629a23e949c
>
>
>
> The details can be found in http://www.db-nemec.com/MD5/
> CompareTablesUsingMD5Hash.html
>
>
>
> I found that approach interesting and shared the idea with some
> colleagues, but the response was ranging from “may I ask what is it good
> for?” to “this is very unwise idea” which may be trace of the Oracle
> “Database Ideas” page under https://community.oracle.com/ideas/20275
>
> This motivated me to write a white paper, where I tried to explain the
> idea in more detail and present some possible use cases:
>
> http://www.db-nemec.com/MD5/HashCodeBasedIndentityintheDatabase.html
>
> So my main point is to receive this way a more technical grounded feedback
> about this approach of using a hash code for representing a state of a
> whole table or a part of it.
>
> Any response is highly appreciated.
>
>
>
> Kind Regards,
>
>
>
> Jaromir D.B. Nemec
>
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Jun 05 2018 - 23:48:33 CEST

Original text of this message