Home » SQL & PL/SQL » SQL & PL/SQL » Parsing BLOB Contents (11g)
icon5.gif  Parsing BLOB Contents [message #580845] Thu, 28 March 2013 10:27 Go to next message
SerenityNetworks
Messages: 8
Registered: February 2013
Location: Richardson, TX
Junior Member
I have some BLOB contents in the format of...
DK99F17,AA,032820130840,Other
ABCD,AA,032820130840,OV
AAZ123,BC,032820130932,DWL
CBA12345,ZA,032820130939,Other

Each BLOB is associated to a file name in the format...
03282013100002_thisfile.txt

The blob for each file may be zero rows to n rows in size, but typically there are 2 to 5 rows (four rows were shown in the rows above).

The following kind of gets me there, but not quite as it splits up the BLOB rows at the comma and not the line break (HEX=0D0A / CRLF).
with rec as 
   (select fs.file_name, utl_raw.cast_to_varchar2(fs.file_data) file_data
     from tada.files_store fs
    where fs.file_name like '%citations.txt'
      and trunc(fs.date_created) = to_date('26-MAR-2013','DD-MON-YYYY'))
select rec.file_name, REGEXP_SUBSTR(file_data, '[^,]+', 1, level)
from rec
connect by level <= length(regexp_replace(file_data,'[^,]*'))+1;

The above code returns...
03282013100002_thisfile.txt	DK99F17
03282013100002_thisfile.txt	AA
03282013100002_thisfile.txt	032820130840
03282013100002_thisfile.txt	Other
03282013100002_thisfile.txt	ABCD
03282013100002_thisfile.txt	AA
...and so on...

What I'd like is...
03282013100002_thisfile.txt	DK99F17,AA,032820130840,Other
03282013100002_thisfile.txt	ABCD,AA,032820130840,OV
03282013100002_thisfile.txt	AAZ123,BC,032820130932,DWL
03282013100002_thisfile.txt	CBA12345,ZA,032820130939,Other
...and then on to the next file...

I've started down the path below, but I can't get it right.
For rec in (
           select fs.file_name, utl_raw.cast_to_varchar2(fs.file_data) file_data
           from tada.files_store fs
           where 1=1
           AND   fs.file_name LIKE '%citations.txt'
           AND   trunc(fs.date_created) = to_date('26-MAR-2013','DD-MON-YYYY')   
           --order by fs.file_name
         ) loop
  While file_data not end loop
    Parse file_data first item
    Put_line ( utl_raw.cast_to_varchar2(fs.file_data) || ',' || item)
    Reposition contents to next item
  End loop
End (cursor);

I'd appreciate whatever guidance can be provided.

Thanks,
Andrew
Re: Parsing BLOB Contents [message #580847 is a reply to message #580845] Thu, 28 March 2013 10:48 Go to previous messageGo to next message
_jum
Messages: 490
Registered: February 2008
Senior Member
To solve the problem to split the string at CR/LF you can start with:
WITH rec AS
  (SELECT 
'DK99F17,AA,032820130840,Other
ABCD,AA,032820130840,OV
AAZ123,BC,032820130932,DWL
CBA12345,ZA,032820130939,Other' file_data 
     FROM dual)
 SELECT REGEXP_SUBSTR(file_data, '.+', 1, level,'m') rstr
    FROM rec
CONNECT BY level <= length(regexp_replace(file_data,'.+'))+1;      

RSTR
--------------------------
DK99F17,AA,032820130840,Other
ABCD,AA,032820130840,OV
AAZ123,BC,032820130932,DWL
CBA12345,ZA,032820130939,Other

In your REGEXP_SUBSTR you merge '[^,]+', q.e. all chars except of ','.
So you split at ','.
Re: Parsing BLOB Contents [message #580854 is a reply to message #580847] Thu, 28 March 2013 11:42 Go to previous messageGo to next message
SerenityNetworks
Messages: 8
Registered: February 2013
Location: Richardson, TX
Junior Member
Thank you! I'm beginning to understand, but still a bit lost (this is beyond my level of ability).

When I run...
select fs.file_name, utl_raw.cast_to_varchar2(fs.file_data) file_data
     from tada.files_store fs
    where fs.file_name like '%thisfile.txt'
      and trunc(fs.date_created) = to_date('26-MAR-2013','DD-MON-YYYY')

It gets me two BLOBS...
03262013100003_thisfile.txt	"GHI9180,TT,032620130853,OV  CRLF  OPQR4321,GA,032620130859,Other  CRLF"
03262013090155_thisfile.txt	"DK99F17,AA,032820130840,Other  CRLF  ABCD,AA,032820130840,OV  CRLF  AAZ123,BC,032820130932,DWL  CRLF"

With the following...
with rec as 
   (select fs.file_name, utl_raw.cast_to_varchar2(fs.file_data) file_data
     from tada.files_store fs
    where fs.file_name like '%thisfile.txt'
      and trunc(fs.date_created) = to_date('26-MAR-2013','DD-MON-YYYY'))
SELECT REGEXP_SUBSTR(file_data, '.+', 1, level,'m') rstr
    FROM rec
CONNECT BY level <= length(regexp_replace(file_data,'.+'))+1;


It gets me...
RSTR
"GHI9180,TT,032620130853,OV"
"OPQR4321,GA,032620130859,Other"


"AAZ123,BC,032820130932,DWL"

"ABCD,AA,032820130840,OV"


"AAZ123,BC,032820130932,DWL"

"DK99F17,AA,032820130840,Other"
"OPQR4321,GA,032620130859,Other"


"AAZ123,BC,032820130932,DWL"

"ABCD,AA,032820130840,OV"


"AAZ123,BC,032820130932,DWL"

Where I still need to return just...
03262013100003_thisfile.txt	GHI9180,TT,032620130853,OV
03262013100003_thisfile.txt	OPQR4321,GA,032620130859,Other
03262013090155_thisfile.txt	DK99F17,AA,032820130840,Other
03262013090155_thisfile.txt	ABCD,AA,032820130840,OV
03262013090155_thisfile.txt	AAZ123,BC,032820130932,DWL

I'm pretty clueless as how to get there.

Thanks again,
Andrew
Re: Parsing BLOB Contents [message #580860 is a reply to message #580854] Thu, 28 March 2013 14:16 Go to previous messageGo to next message
SerenityNetworks
Messages: 8
Registered: February 2013
Location: Richardson, TX
Junior Member
Got it, kind of...

This gets it, but I don't know how to get rid of the extra line feeds that I showed in the example just above.
WITH rec AS 
   (SELECT fs.file_name, utl_raw.cast_to_varchar2(fs.file_data) file_data
    FROM tada.files_store fs
    WHERE fs.file_name LIKE '%thisfile.txt'
    AND TRUNC(fs.date_created) = to_date('26-MAR-2013','DD-MON-YYYY'))
SELECT rec.file_name, REGEXP_SUBSTR(file_data, '.+', 1, LEVEL)
FROM rec
CONNECT BY LEVEL <= LENGTH(regexp_replace(file_data,'.+'))+1;
Re: Parsing BLOB Contents [message #580866 is a reply to message #580860] Thu, 28 March 2013 15:41 Go to previous message
SerenityNetworks
Messages: 8
Registered: February 2013
Location: Richardson, TX
Junior Member
A colleage was able to wrap it up...
SELECT DISTINCT a.file_name, a.file_data
FROM
  (
  WITH rec AS 
     (SELECT fs.file_name, utl_raw.cast_to_varchar2(fs.file_data) file_data
      FROM tada.files_store fs
      WHERE fs.file_name LIKE 'myfile.txt'
      AND TRUNC(fs.date_created) = to_date('26-MAR-2013','DD-MON-YYYY')
    )
SELECT rec.file_name, 
       REGEXP_SUBSTR(file_data, '.+', 1, LEVEL) file_data
FROM rec
CONNECT BY LEVEL < LENGTH(regexp_replace(file_data,'.+'))+1
ORDER BY rec.file_name, level) a;


Thanks for the heads up on the regexp_substr syntax.

Regards,
Andrew
Previous Topic: row to column
Next Topic: Create tabled of type Varray
Goto Forum:
  


Current Time: Fri Aug 29 13:12:47 CDT 2014

Total time taken to generate the page: 0.12088 seconds