Re: Particular file (PDF) cannot be loaded into Oracle BLOB. Makes sense ?

From: BicycleRepairman <engel.kevin_at_gmail.com>
Date: Sat, 19 Mar 2011 08:21:13 -0700 (PDT)
Message-ID: <fdb33f98-aa55-4ee0-a648-0d27e3d2e787_at_t13g2000vbo.googlegroups.com>



On Mar 17, 8:34 am, "Syltrem" <syltremz..._at_videotron.ca> wrote:
> "Steve Howard" <stevedhow..._at_gmail.com> wrote in message
>
> news:8c80cc88-e48f-4c84-8023-856244e660b6_at_r19g2000prm.googlegroups.com...
> On Mar 15, 1:36 pm, "Syltrem" <syltremz..._at_videotron.ca> wrote:
>
> > Hello
>
> > I have a particular set of PDF files that cannot be loaded successfully
> > into
> > an Oracle BLOB column.
>
> >Do you have an actual error message?
>
> No error. It's just that some extra characters are added into the BLOB.
> For instance, I see that a 0A is added following each 0C (LineFeed added
> following each FormFeed).
> So if the PDF contains 50 FormFeeds, then 50 LineFeeds are added, making 50
> characters to be truncated from the end of the file (the BLOB length always
> matches the file's length).
>
> And the problem is with INSERTING the file into the BLOB. The extraction
> from Oracle is not the problem, which I can prove by loading the same PDF
> with a dot net program, and extracting it successfully with the same
> extraction procedure. dbms_lob.loadBLOBfromfile does the damage.
>
> This is with 10.2.0.4
>
> Yesterday, I downloaded Acrobat Pro trial version. I created a new PDF file
> with it (from a MS-Word document), loaded it into Oracle, and could not
> retrieve it in good condition. I have not gone into the details yet but I
> suspect the same problem.
> Strangely, I have other PDF coming from different sources, that work without
> problem. Actually, I was surprised that I could not use a PLSQL program that
> I used for a couple of years to load other PDFs, to load these new ones.
> Can't tell what Oracle does not like about these new ones.
>
> And yes I see there are many examples out there showing how to load and
> extract PDF with Oracle, but none mention that some PDF (nor any other file
> types) are not "supported".
>
> Thanks
> Syltrem
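
One way to confirm the symptom described above (a 0A appended after each 0C, with the tail cut off so the length stays the same) is to compare the original file with the bytes pulled back out of the BLOB. Below is a minimal Python sketch of that comparison. It does not talk to Oracle at all: it assumes you have already written the BLOB content to a second file by some means you trust, and the function names are made up for illustration.

```python
import sys


def simulate_lf_after_ff(original: bytes) -> bytes:
    """Insert 0x0A after every 0x0C, then truncate to the original length."""
    expanded = original.replace(b"\x0c", b"\x0c\x0a")
    return expanded[:len(original)]


def first_difference(a: bytes, b: bytes) -> int:
    """Return the offset of the first differing byte, or -1 if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # No mismatch in the common prefix: identical, or one is a prefix of the other.
    return -1 if len(a) == len(b) else min(len(a), len(b))


def matches_reported_symptom(original: bytes, stored: bytes) -> bool:
    """True if 'stored' is what the described corruption would produce."""
    return stored != original and stored == simulate_lf_after_ff(original)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare.py original.pdf extracted_from_blob.pdf
    with open(sys.argv[1], "rb") as f:
        original = f.read()
    with open(sys.argv[2], "rb") as f:
        stored = f.read()
    print("form feeds in original:", original.count(b"\x0c"))
    print("first differing offset:", first_difference(original, stored))
    print("matches LF-after-FF pattern:", matches_reported_symptom(original, stored))
```

If the last line prints True, the damage is exactly the one described, and the number of form feeds in the original is the number of bytes lost from the end. If it prints False but the files still differ, the first differing offset is a good place to start looking with a hex viewer.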

I had this happen a couple of years ago on one system, and it couldn't be reproduced in our dev and test environments. The cause turned out to be a character set conversion issue: the (thick client) software we were using instructed OCI to set the character set to UTF-8 (AL32UTF8), but SQL*Net was not making the character set conversion properly, so the BLOB was being stored in a different character set than the one we all "knew" we were using. It took *months* to track down, because no one believed that the thick client executable (commercial software, in use for a decade) could possibly be at fault.
So, as a test, use something like SQL Developer to load the PDF and see in which environments the resulting file can be viewed. In our case, a .NET application could read the file inserted with SQL Developer, but our thick client app could not.

Received on Sat Mar 19 2011 - 10:21:13 CDT
