Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Mailing Lists -> Oracle-L -> RE: RE: RE: Find an unprintable character inside a column....
I played with this a bit.
First, I created some test data with one column corrupted with a single
random character
of 0-31 replacing a random char in that column 20% of the rows of the
table.
Peter's function correctly found all of the rows in 7.5 seconds.
Stephane's function ran in 3.5 seconds, but didn't find any of the rows. I didn't attempt to correct the code.
Then I tried a function based on owa_pattern.regex. My initial attempts didn't return the correct rows, as the regex pattern needed some tuning.
I didn't attempt to fix it, as it was woefully slow, about 30 seconds.
Regex in the WHERE clause in 10g will be nice.
Jared
"Stephane Faroult" <sfaroult_at_oriolecorp.com>
Sent by: ml-errors_at_fatcity.com
10/10/2003 07:09 AM
Please respond to ORACLE-L
To: Multiple recipients of list ORACLE-L <ORACLE-L_at_fatcity.com> cc: Subject: RE: RE: RE: Find an unprintable character inside a column....
>Some people have requested this code, so I thought
>you might as well all
>have the chance to pick it to bits... Its a
>function called BAD_ASCII, and
>it hunts out for any ascii characters with an ascii
>value of less than 32 in
>a specified field. (Acknowledgments to my colleague
>Keith Holmes for help
>with this code.)
>
>Use it as follows:
>
>Where a field called DATA in a table TABLE_1 may
>contain an ascci character
>with a value less than 32 (ie a non-printing
>character), the following SQL
>will find the row in question:
>
>select rowid,DATA,dump(DATA) from TABLE_1
>where BAD_ASCII(DATA) > 0;
>
>You could use the PK of the table instead of rowid,
>of course. You will also
>note that I select the DATA field in both normal
>and ascii 'dump' mode, the
>better to locate where the corruption is located.
>
>peter
>edinburgh
>...................................
>
>Source as follows:
>
>
>Function BAD_ASCII
> (V_Text in char)
> return number
>is
> V_Int number;
> V_Count number;
>begin
>--
>V_Int := 0;
>V_Count := 1;
>while V_Count<=length(rtrim(V_Text)) and V_Int=0
> loop
> if ascii(substr(V_Text, V_Count, 1))<32 then
> V_Int := V_Count;
> end if;
> V_Count := V_Count + 1;
>end loop;
>return V_Int;
>--
>exception
> when others then
> return -1;
>end BAD_ASCII;
>/
>
Peter,
I think that you can make this code 25% faster when the data is clean (which hopefully is the general case) by using 'replace', more efficient than a PL/SQL loop, to check whether you have some rubbish (sort of). It will not tell you where the bad character is, however - which means that then you can loop to look for it.
Here is what I would suggest :
create or replace Function BAD_ASCII (V_Text in char)
return number
is
V_Int number;
V_Count number;
begin
if (replace(V_text, chr(0)||chr(1)||chr(2)||chr(3)|| chr(4)||chr(5)||chr(6)||chr(7)|| chr(8)||chr(9)||chr(10)||chr(11)|| chr(12)||chr(13)||chr(14)||chr(15)|| chr(16)||chr(17)||chr(18)||chr(19)|| chr(20)||chr(21)||chr(22)||chr(23)|| chr(24)||chr(25)||chr(26)||chr(27)|| chr(28)||chr(29)||chr(30)||chr(31), '--------------------------------') = V_text)
if ascii(substr(V_Text, V_Count, 1))<32 then V_Int := V_Count; end if; V_Count := V_Count + 1;
-- exception when others then return -1; end BAD_ASCII; / Regards, Stephane Faroult Oriole -- Please see the official ORACLE-L FAQ: http://www.orafaq.net -- Author: Stephane Faroult INET: sfaroult_at_oriolecorp.com Fat City Network Services -- 858-538-5051 http://www.fatcity.com San Diego, California -- Mailing list and web hosting services --------------------------------------------------------------------- To REMOVE yourself from this mailing list, send an E-Mail message to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing). -- Please see the official ORACLE-L FAQ: http://www.orafaq.net -- Author: INET: Jared.Still_at_radisys.com Fat City Network Services -- 858-538-5051 http://www.fatcity.com San Diego, California -- Mailing list and web hosting services --------------------------------------------------------------------- To REMOVE yourself from this mailing list, send an E-Mail message to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).Received on Fri Oct 10 2003 - 14:39:24 CDT