Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> RE: RE: RE: Find an unprintable character inside a column....

RE: RE: RE: Find an unprintable character inside a column....

From: Mladen Gogala <mladen_at_wangtrading.com>
Date: Fri, 10 Oct 2003 12:44:26 -0800
Message-ID: <F001.005D2B8D.20031010124426@fatcity.com>


Actually, I was toying with the idea of writing an external procedure that would allow me to call pcre library (PCRE=Perl Compatible Regular Expressions) which would be nice, but then again, the whole perl is available through the set of external procedures, so it wouldn't be very useful. External procedures can be used in the where clause, provided they're declared as deterministic. Actually, it wouldn't be that hard to extend 9.2 database with regular expressions.

On Fri, 2003-10-10 at 15:39, Jared.Still_at_radisys.com wrote:
> I played with this a bit.
>
> First, I created some test data with one column corrupted with a
> single random character
> of 0-31 replacing a random char in that column 20% of the rows of the
> table.
>
> Peter's function correctly found all of the rows in 7.5 seconds.
>
> Stephane's function ran in 3.5 seconds, but didn't find any of
> the rows. I didn't attempt to correct the code.
>
> Then I tried a function based on owa_pattern.regex. My initial
> attempts
> didn't return the correct rows, as the regex pattern needed some
> tuning.
>
> I didn't attempt to fix it, as it was woefully slow, about 30 seconds.
>
> Regex in the WHERE clause in 10g will be nice.
>
> Jared
>
>
>
>
> "Stephane Faroult"
> <sfaroult_at_oriolecorp.com>
> Sent by:
> ml-errors_at_fatcity.com
>
> 10/10/2003 07:09 AM
> Please respond to
> ORACLE-L
>
> To:
> Multiple recipients of
> list ORACLE-L
> <ORACLE-L_at_fatcity.com>
> cc:
> Subject:
> RE: RE: RE: Find an
> unprintable character
> inside a column....
>
>
> >Some people have requested this code, so I thought
> >you might as well all
> >have the chance to pick it to bits... Its a
> >function called BAD_ASCII, and
> >it hunts out for any ascii characters with an ascii
> >value of less than 32 in
> >a specified field. (Acknowledgments to my colleague
> >Keith Holmes for help
> >with this code.)
> >
> >Use it as follows:
> >
> >Where a field called DATA in a table TABLE_1 may
> >contain an ascci character
> >with a value less than 32 (ie a non-printing
> >character), the following SQL
> >will find the row in question:
> >
> >select rowid,DATA,dump(DATA) from TABLE_1
> >where BAD_ASCII(DATA) > 0;
> >
> >You could use the PK of the table instead of rowid,
> >of course. You will also
> >note that I select the DATA field in both normal
> >and ascii 'dump' mode, the
> >better to locate where the corruption is located.
> >
> >peter
> >edinburgh
> >...................................
> >
> >Source as follows:
> >
> >
> >Function BAD_ASCII
> > (V_Text in char)
> > return number
> >is
> > V_Int number;
> > V_Count number;
> >begin
> >--
> >V_Int := 0;
> >V_Count := 1;
> >while V_Count<=length(rtrim(V_Text)) and V_Int=0
> > loop
> > if ascii(substr(V_Text, V_Count, 1))<32 then
> > V_Int := V_Count;
> > end if;
> > V_Count := V_Count + 1;
> >end loop;
> >return V_Int;
> >--
> >exception
> > when others then
> > return -1;
> >end BAD_ASCII;
> >/
> >
>
> Peter,
>
> I think that you can make this code 25% faster when the data is
> clean (which hopefully is the general case) by using 'replace', more
> efficient than a PL/SQL loop, to check whether you have some rubbish
> (sort of). It will not tell you where the bad character is, however -
> which means that then you can loop to look for it.
>
> Here is what I would suggest :
>
> create or replace Function BAD_ASCII (V_Text in char)
> return number
> is
> V_Int number;
> V_Count number;
> begin
> if (replace(V_text, chr(0)||chr(1)||chr(2)||chr(3)||
> chr(4)||chr(5)||chr(6)||chr(7)||
> chr(8)||chr(9)||chr(10)||chr(11)||
> chr(12)||chr(13)||chr(14)||chr(15)||
> chr(16)||chr(17)||chr(18)||chr(19)||
> chr(20)||chr(21)||chr(22)||chr(23)||
> chr(24)||chr(25)||chr(26)||chr(27)||
> chr(28)||chr(29)||chr(30)||chr(31),
> '--------------------------------')
> = V_text)
> then
> return 0;
> else
> V_Int := 0;
> V_Count := 1;
> while V_Count<=length(rtrim(V_Text)) and V_Int=0
> loop
> if ascii(substr(V_Text, V_Count, 1))<32 then
> V_Int := V_Count;
> end if;
> V_Count := V_Count + 1;
> end loop;
> return V_Int;
> end if;
> --
> exception
> when others then
> return -1;
> end BAD_ASCII;
> /
>
> Regards,
>
> Stephane Faroult
> Oriole
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.net
> --
> Author: Stephane Faroult
> INET: sfaroult_at_oriolecorp.com
>
> Fat City Network Services -- 858-538-5051 http://www.fatcity.com
> San Diego, California -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from). You may
> also send the HELP command for other information (like subscribing).

-- 
Mladen Gogala
Oracle DBA




Note:
This message is for the named person's use only.  It may contain confidential, proprietary or legally privileged information.  No confidentiality or privilege is waived or lost by any mistransmission.  If you receive this message in error, please immediately delete it and all copies of it from your system, destroy any hard copies of it and notify the sender.  You must not, directly or indirectly, use, disclose, distribute, print, or copy any part of this message if you are not the intended recipient. Wang Trading LLC and any of its subsidiaries each reserve the right to monitor all e-mail communications through its networks.
Any views expressed in this message are those of the individual sender, except where the message states otherwise and the sender is authorized to state them to be the views of any such entity.

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Mladen Gogala
  INET: mladen_at_wangtrading.com

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
Received on Fri Oct 10 2003 - 15:44:26 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US