Home » SQL & PL/SQL » SQL & PL/SQL » calculate a function on whole table (oracle 11g)
calculate a function on whole table [message #673189] Sat, 10 November 2018 14:11 Go to next message
ramkumar10
Messages: 12
Registered: March 2017
Junior Member
Hello,

I have a sql related question just to know if this is possible or not.

I have a table with many number of columns and different datatypes namely varchar, date, number etc.

is it possible to apply hash on whole table

what i mean is this for example if i have a file i can use md5sum in unix to get the whole file hash value in the same way in oracle i can get hash of each row by using dbms.obfuscation package and passing the columns inside

but what if i want the hash for a set of rows is it possible.

Thanks for time.
Re: calculate a function on whole table [message #673190 is a reply to message #673189] Sat, 10 November 2018 14:23 Go to previous messageGo to next message
BlackSwan
Messages: 26274
Registered: January 2009
Location: SoCal
Senior Member
ramkumar10 wrote on Sat, 10 November 2018 12:11
Hello,

I have a sql related question just to know if this is possible or not.

I have a table with many number of columns and different datatypes namely varchar, date, number etc.

is it possible to apply hash on whole table

what i mean is this for example if i have a file i can use md5sum in unix to get the whole file hash value in the same way in oracle i can get hash of each row by using dbms.obfuscation package and passing the columns inside

but what if i want the hash for a set of rows is it possible.

Thanks for time.

Possible? Yes.
It is possible to establish a Virtual Column that contains a hash of that row columns.
Doing so will be VERY resource intensive since EVERY subsequent DML requires new hash to be computed.
Then you could compute new hash containing as many rows as you desire.
Should you? Possibly NOT.

What problem are you really trying to solve?
Re: calculate a function on whole table [message #673193 is a reply to message #673190] Sat, 10 November 2018 15:14 Go to previous messageGo to next message
Michel Cadot
Messages: 65946
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Quote:
Doing so will be VERY resource intensive since EVERY subsequent DML requires new hash to be computed.
Well, if it is a virtual column only queries (or WHERE clauses) using this column will.

Anyway, this may indeed not be the best way to fulfill the requirements.

Re: calculate a function on whole table [message #673194 is a reply to message #673190] Sat, 10 November 2018 15:40 Go to previous messageGo to next message
ramkumar10
Messages: 12
Registered: March 2017
Junior Member
Thanks for the reply well I am extracting data from an oracle table and loading the data into flatfiles using different etl tools. here I am tasked with the burden of proving that there has not been any content change while extracting and loading into flat files so my initial thought was to hash the each row and store it as a virtual column i know this is very intensive so i thought why not hash the whole table as I do with flat files on unix. my little understanding of database tables is effectively backend everything is a file so thought cross to me why not hash the whole table to show that there is no data changed while etl.

Thanks hopefully My explanation is clear.
Re: calculate a function on whole table [message #673197 is a reply to message #673194] Sun, 11 November 2018 01:09 Go to previous messageGo to next message
Michel Cadot
Messages: 65946
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Just make hashes on each row or globally on all rows won't work as hash functions work on bytes not on external (human) values.
So if you have not the same character set in the file and in the database then there will have changes in bytes; numbers and dates are stored in an internal format not on human one.
So, if you want to compare hashes, you have to convert each column value in the EXACT same format that it was in the file before hashing. Are you sure you are able to do that? Are you sure you know the EXACT format of each value in the file?

[Edit: English]

[Updated on: Sun, 11 November 2018 01:48]

Report message to a moderator

Re: calculate a function on whole table [message #673198 is a reply to message #673194] Sun, 11 November 2018 01:38 Go to previous messageGo to next message
John Watson
Messages: 7659
Registered: January 2010
Location: Global Village
Senior Member
Quote:
I am tasked with the burden of proving that there has not been any content change while extracting and loading into flat files
I think that you have been asked to do something that is impossible, because there will be content change as a part of this process. Indeed, the whole process is designed to do content change: converting data from one format to another. Many sites rely on a simple row count check which will survive the process (read consistency permitting) would that not be adequate?
Re: calculate a function on whole table [message #673200 is a reply to message #673198] Sun, 11 November 2018 11:26 Go to previous messageGo to next message
ramkumar10
Messages: 12
Registered: March 2017
Junior Member
John Watson wrote on Sun, 11 November 2018 01:38
Quote:
I am tasked with the burden of proving that there has not been any content change while extracting and loading into flat files
I think that you have been asked to do something that is impossible, because there will be content change as a part of this process. Indeed, the whole process is designed to do content change: converting data from one format to another. Many sites rely on a simple row count check which will survive the process (read consistency permitting) would that not be adequate?
Technically yes you are right and that was my initial solution but now I have been asked for content validation as well which in my particular case is very hard as I am using a single job to extract from many tables dynamically.

But I agree with you there must be more discussion with my team on this requirement.
Re: calculate a function on whole table [message #673201 is a reply to message #673197] Sun, 11 November 2018 11:29 Go to previous messageGo to next message
ramkumar10
Messages: 12
Registered: March 2017
Junior Member
Michel Cadot wrote on Sun, 11 November 2018 01:09

Just make hashes on each row or globally on all rows won't work as hash functions work on bytes not on external (human) values.
So if you have not the same character set in the file and in the database then there will have changes in bytes; numbers and dates are stored in an internal format not on human one.
So, if you want to compare hashes, you have to convert each column value in the EXACT same format that it was in the file before hashing. Are you sure you are able to do that? Are you sure you know the EXACT format of each value in the file?

[Edit: English]
I am planning to hash each row and from table and make hashes from each row of the file and try to compare them and see if i can get a match which i highly doubt as you pointed out I should know the exact datatype of each column but it is worth a shot and I will follow up here with the results i get.
Re: calculate a function on whole table [message #673202 is a reply to message #673201] Sun, 11 November 2018 11:58 Go to previous messageGo to next message
Michel Cadot
Messages: 65946
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Not only datatype but also format.
For example, when you have 10 (number) in a field, was it "10", "10.0", "10.00", "10,00" in the file?

Not only datatype but also character set.
For example, my Windows is using code page 1252 (which is WE8MSWIN1252 in Oracle), if I have a "é" in a file in the Windows character set, its code point (its binary data) is 233 = xE9, but my database uses AL32UTF8 character set, so the code point for the same character is 195,169 (2 bytes) = xC3A9. So when when an insert a "é" in the database, Oracle inserts 2 bytes 195 and 169, then you see you cannot compare the MD5 values between the 2 binary values which are for the same character.

[Updated on: Sun, 11 November 2018 12:00]

Report message to a moderator

Re: calculate a function on whole table [message #673233 is a reply to message #673202] Mon, 12 November 2018 10:56 Go to previous messageGo to next message
ramkumar10
Messages: 12
Registered: March 2017
Junior Member
That's a great point did not notice it until you pointed it out. When a file is generated number 10 inside the number file is 000000000000000010 some thing like this based on the precision defined in the database so hash values of the table and the script will never be equal.

Thanks for the insights.
Re: calculate a function on whole table [message #673257 is a reply to message #673189] Tue, 13 November 2018 13:50 Go to previous messageGo to next message
JPBoileau
Messages: 50
Registered: September 2017
Member
While not the most efficient way, I would possibly consider this:

Extract the current data to a flatfile.
Manipulate the data.
Load the flatfile back into a staging table and verify that the data is consistent with the original table after manipulation.

JP
Re: calculate a function on whole table [message #673259 is a reply to message #673257] Tue, 13 November 2018 18:41 Go to previous message
ramkumar10
Messages: 12
Registered: March 2017
Junior Member
JPBoileau wrote on Tue, 13 November 2018 13:50
While not the most efficient way, I would possibly consider this:

Extract the current data to a flatfile.
Manipulate the data.
Load the flatfile back into a staging table and verify that the data is consistent with the original table after manipulation.

JP
1.I have only read access on database so writing back to a table is out of question
2.I would not like to manipulate the data in flat file as it defeats the point of content validation.
3.I know I am contradicting myself but I think it is neither wise nor possible to compare content between a table and file without manipulation so that answers my question.

Thanks for the suggestion's cheers.
Previous Topic: PL/SQL error Ora-06502
Next Topic: resource busy and acquire with NOWAIT specified or timeout expired
Goto Forum:
  


Current Time: Tue Nov 13 23:24:16 CST 2018