Home » SQL & PL/SQL » SQL & PL/SQL » scrambling production data
scrambling production data [message #360048] Wed, 19 November 2008 06:09 Go to next message
gulshangk
Messages: 3
Registered: November 2008
Junior Member
hi,

i have to scramble my production data.

like NAME , Address , pincode, emails

please can anyone help me in this.


thanks in advance
Re: scrambling production data [message #360052 is a reply to message #360048] Wed, 19 November 2008 06:21 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

Please explain your requirement in brief .

Smile
Rajuvan.
Re: scrambling production data [message #360053 is a reply to message #360048] Wed, 19 November 2008 06:28 Go to previous messageGo to next message
tahpush
Messages: 961
Registered: August 2006
Location: Stockholm/Sweden
Senior Member

gulshangk wrote on Wed, 19 November 2008 13:09
hi,

i have to scramble my production data.

like NAME , Address , pincode, emails

please can anyone help me in this.


thanks in advance

Scrambling is the a theory of the order of words in a sentence in a language in which word order is not fixed
Re: scrambling production data [message #360214 is a reply to message #360048] Thu, 20 November 2008 01:36 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

If I understand your question , If the input is 'RAJUVAN' , it should display differently for different execution of same query.

For that I tried with DBMS_RANDOM.VALUE as below .

SQL> WITH REC1 AS
  2  (  SELECT X.NM1, ROW_NUMBER() OVER (ORDER BY ROWNUM )  R1  FROM (
  3     SELECT SUBSTR('RAJUVAN',ROWNUM,1)NM1,
  4             ROWNUM
  5      FROM DUAL
  6      CONNECT BY LEVEL <=LENGTH('RAJUVAN')
  7      ORDER BY DBMS_RANDOM.VALUE(1,2))X )
  8  SELECT * FROM REC1;

N         R1
- ----------
A          1
N          2
R          3
U          4
V          5
A          6
J          7

7 rows selected.

SQL> /

N         R1
- ----------
R          1
A          2
A          3
J          4
U          5
N          6
V          7

7 rows selected.

SQL> /

N         R1
- ----------
U          1
R          2
A          3
N          4
J          5
V          6
A          7

7 rows selected.

SQL> /

N         R1
- ----------
J          1
V          2
R          3
A          4
U          5
A          6
N          7

7 rows selected.

SQL>


Till here its fine , But failed to advance from there to make it a part of hierarchical query.

SQL> SELECT * FROM
  2  (SELECT X.NM1, ROW_NUMBER() OVER (ORDER BY ROWNUM )  R1  FROM (
  3  WITH REC AS (SELECT 'ABCDEFG'NM  FROM DUAL)
  4      SELECT SUBSTR(NM,ROWNUM,1)NM1,
  5             ROWNUM
  6      FROM REC
  7      CONNECT BY LEVEL <=LENGTH(NM)
  8      ORDER BY DBMS_RANDOM.VALUE(1,2))X)
  9  START WITH R1=1
 10  CONNECT BY  R1 = PRIOR R1 +1;

N         R1
- ----------
C          1
B          2
A          3
E          4
C          5
G          6
F          7

7 rows selected.

SQL> /

N         R1
- ----------
C          1
G          2
A          3
B          4
E          5
C          6
D          7

7 rows selected.

SQL> SELECT MAX(SYS_CONNECT_BY_PATH(nm1,'/')) FROM
  2  (SELECT X.NM1, ROW_NUMBER() OVER (ORDER BY ROWNUM )  R1  FROM (
  3  WITH REC AS (SELECT 'ABCDEFG'NM  FROM DUAL)
  4      SELECT SUBSTR(NM,ROWNUM,1)NM1,
  5             ROWNUM
  6      FROM REC
  7      CONNECT BY LEVEL <=LENGTH(NM)
  8      ORDER BY DBMS_RANDOM.VALUE(1,2))X)
  9  START WITH R1=1
 10  CONNECT BY  R1 = PRIOR R1 +1;

MAX(SYS_CONNECT_BY_PATH(NM1,'/'))
----------------------------------------------------------------------

/F/C/G/F/A/B/E

SQL> /

MAX(SYS_CONNECT_BY_PATH(NM1,'/'))
----------------------------------------------------------------------

/G/C/B/A/E/G/D

SQL> /

MAX(SYS_CONNECT_BY_PATH(NM1,'/'))
----------------------------------------------------------------------

/B/C/G/E/B/A/F

SQL>


Here some value are repeated ...

Is it some thing to do with DBMS_RANDOM.VALUE ?
Or am I going wrong somewhere?

Any Guidance please .

Sad
Rajuvan.
Re: scrambling production data [message #360217 is a reply to message #360052] Thu, 20 November 2008 01:45 Go to previous messageGo to next message
gulshangk
Messages: 3
Registered: November 2008
Junior Member
Hi,

I have a production table having column NAME , Addess , Pincode , telephone number etc.
e.g.
NAME - gulshan kumar

Address - v3 ,mumbai india

pincode - 142140

Phone number - 022-24224212




We want to scramble the content of this table so that we can use this data for testing in UT enviornment.

details are as follows :

characters to be scrambled: a-z, A-Z, 0-9 (so characters like @ . / etc. to be excluded)
Lowercase character to be scrambled to another lowercase character
Uppercase character to be scrambled to another uppercase character
Numeric character to be scrambled to another numeric character.
Re: scrambling production data [message #360220 is a reply to message #360217] Thu, 20 November 2008 01:52 Go to previous messageGo to next message
Michel Cadot
Messages: 64109
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
TRANSLATE

Regards
Michel
Re: scrambling production data [message #360225 is a reply to message #360048] Thu, 20 November 2008 02:04 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

I think you dont need to recover the data once it is scrumbled.

I suggest yo to use your own function for your requirement as there is no direct method for that.

TRANSLATE is a good method but you have to use about 65 translation for writing the query.

Smile
Rajuvan.
Re: scrambling production data [message #360227 is a reply to message #360225] Thu, 20 November 2008 02:08 Go to previous messageGo to next message
Michel Cadot
Messages: 64109
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Quote:
TRANSLATE is a good method but you have to use about 65 translation for writing the query.

Which can be done with 1 translate call.

Regards
Michel
Re: scrambling production data [message #360244 is a reply to message #360048] Thu, 20 November 2008 03:21 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

Yes. I know. .

But wont t be better to produce the Corresponding type of literal for each type rather than hard coding value for each and every possible values ? In tat case user defined function will give an edge over Translate.Anyway it is upto OP.

As I mentioned already, the actual information may not be retrievable if you opt for a function .

Smile
Rajuvan.
Re: scrambling production data [message #360252 is a reply to message #360244] Thu, 20 November 2008 03:47 Go to previous messageGo to next message
JRowbottom
Messages: 5933
Registered: June 2006
Location: Sunny North Yorkshire, ho...
Senior Member
You can make the data irretrievable using translate if you map multiple chrs onto the same chr.

I think this Translate based solution should do the jub:
CREATE OR REPLACE PACKAGE PKG_SCRAMBLE AS
  v_from            varchar2(100);
  v_to              varchar2(100);
 
  function scramble  (p_string  in varchar2) return varchar2;
  /* TODO enter package declarations (types, exceptions, methods etc) here */

END PKG_SCRAMBLE;
/

CREATE OR REPLACE PACKAGE BODY PKG_SCRAMBLE AS

  function scramble  (p_string  in varchar2) return varchar2 AS
  BEGIN
    RETURN translate(p_string,v_from,v_to);
  END scramble;
BEGIN
  dbms_random.seed(dbms_utility.get_time);
  for i in 1..26 loop
    v_from := v_from ||chr(64+i);
    v_to   := v_to || chr(64 + floor(dbms_random.value(1,27)));
  end loop;
  
  for i in 1..26 loop
    v_from := v_from ||chr(96+i);
    v_to   := v_to ||chr(96 + floor(dbms_random.value(1,27)));
  end loop;
  
  for i in 1..10 loop
    v_from := v_from ||chr(47+i);
    v_to   := v_to ||chr(47 + floor(dbms_random.value(1,11)));
  end loop;  
END PKG_SCRAMBLE;
/
Re: scrambling production data [message #360427 is a reply to message #360048] Thu, 20 November 2008 23:13 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

Nice solution JRowBottom ..

Smile
Rajuvan.
Re: scrambling production data [message #364440 is a reply to message #360427] Mon, 08 December 2008 09:07 Go to previous messageGo to next message
gulshangk
Messages: 3
Registered: November 2008
Junior Member
Hi,

thanks for your solution ,

but i cannot use this when same output is required on giving the same input .

is there something else we can use except Random or Translate functions ?

the detailed requirement is given below :

• data to be scrambled: name, address, zipcode, customer email and phonenumber
• srambling mechanism will deliver exactly the same output when the same input is given

• scrambling result is not allowed to be reversible (that is: on the basis of the output the input can not be derived)
• the position of the character must be included in the scrambling algoritm (for example: scrambling of "A" on position 1 will give another result then scrambling of "A" on position 2).
• characters to be scrambled: a-z, A-Z, 0-9 (so characters like @ . / etc. to be excluded)
• lowercase character to be scrambled to another lowercase character
• uppercase character to be scrambled to another uppercase character
• numeric character to be scrambled to another numeric character
Re: scrambling production data [message #364442 is a reply to message #364440] Mon, 08 December 2008 10:24 Go to previous messageGo to next message
annagel
Messages: 220
Registered: April 2006
Senior Member
If you take JRprBottom's solution and change:

dbms_random.seed(dbms_utility.get_time);


to

dbms_random.seed(p_string);


You should get reproducibility. Though you should keep in mind that ireversability (spelling?) is not possible with the requirement that the same input string produce the same output. All you can hope to do is make reversibility sufficiently difficult.

EDIT: Took a closer look at JR's function, it would require moving all the seed code into the main function so v_from and v_to get refreshed every time you call the scramble function. Coincidentally JR's function DOES provide reproducibility for a single session if this is all you need.

[Updated on: Mon, 08 December 2008 10:27]

Report message to a moderator

Re: scrambling production data [message #364443 is a reply to message #364440] Mon, 08 December 2008 10:27 Go to previous messageGo to next message
Michel Cadot
Messages: 64109
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
You have the requirements.
You have the tools (dbms_random, translate).

Good luck to write the package, hope you will share it with us once you're gone.

Regards
Michel
Re: scrambling production data [message #364446 is a reply to message #364440] Mon, 08 December 2008 10:43 Go to previous messageGo to next message
S.Rajaram
Messages: 1027
Registered: October 2006
Location: United Kingdom
Senior Member
O.K the way I have understood so far is basically you want to scramble the data ( Can I call it encrypt the data). If my assumption is correct you could potentially take advantage of the builtin packages like dbms_crypto. You could use the encryption method of your choice based on your requirement.

Hope this helps.

Regards

Raj
Re: scrambling production data [message #364453 is a reply to message #364446] Mon, 08 December 2008 10:55 Go to previous message
Michel Cadot
Messages: 64109
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Unfortunatly dbms_crypto returns a binary value so hard to fulfill requirements such as lower case characters give lower case characters.

Regards
Michel
Previous Topic: cartesian join with correlated queries
Next Topic: Creating Indexes
Goto Forum:
  


Current Time: Mon Dec 05 06:46:08 CST 2016

Total time taken to generate the page: 0.09995 seconds