Hash Code Based Identity of a Relational Table

From: Jaromir D.B.Nemec <jaromir_at_db-nemec.com>
Date: Tue, 5 Jun 2018 23:34:20 +0200
Message-ID: <092301d3fd14$f80c3840$e824a8c0$_at_db-nemec.com>



Hi All,  

I worked once on a project where a file had to be spooled from a database table and transferred on a remote host. I wanted to introduce a possibility of validating that the produced file has the same identical content as the source table and came across the idea of using hash codes (MD5) to achieve it.

On a next occasion when there were two tables to be verified being identical I came back to this idea and I wrote a simple user defined aggregate function based on a MD5 hash code for that task.

So basically the aggregate function MD5_XOR represented a state of the table that could be compared with other table.  

Example

select MD5_XOR(to_char(ID)||COL_TXT|| to_char(COL_DATE,'dd.mm.yyyy hh24:mi:ss')) md5 from tab1

MD5_XOR


173f1f8f85f1a154044b7629a23e949c  

The details can be found in
http://www.db-nemec.com/MD5/CompareTablesUsingMD5Hash.html  

I found that approach interesting and shared the idea with some colleagues, but the response was ranging from "may I ask what is it good for?" to "this is very unwise idea" which may be trace of the Oracle "Database Ideas" page under https://community.oracle.com/ideas/20275

This motivated me to write a white paper, where I tried to explain the idea in more detail and present some possible use cases:

http://www.db-nemec.com/MD5/HashCodeBasedIndentityintheDatabase.html

So my main point is to receive this way a more technical grounded feedback about this approach of using a hash code for representing a state of a whole table or a part of it.

Any response is highly appreciated.  

Kind Regards,  

Jaromir D.B. Nemec  

--

http://www.freelists.org/webpage/oracle-l Received on Tue Jun 05 2018 - 23:34:20 CEST

Original text of this message