Re: Oracle Text / Office 2007 question / 10gR2

From: hpuxrac <johnbhurley_at_sbcglobal.net>
Date: Sat, 15 Mar 2008 07:11:25 -0700 (PDT)
Message-ID: <f2a65b31-2050-49cc-b77c-d6ad0125efb8@d62g2000hsf.googlegroups.com>


On Mar 15, 9:56 am, jeremy <jeremy0..._at_gmail.com> wrote:
> In article <cd783b9f-7cf9-4d60-9b5d-d616eb0637d4
> @x30g2000hsd.googlegroups.com>, hpuxrac says...
>
> > On Mar 15, 5:50 am, jeremy <jeremy0..._at_gmail.com> wrote:
> > > Hi
>
> > > As I understand it Oracle Text in 10gR2 is not able to text index .docx
> > > files generated by MS Office 2007. As the use of this format is only
> > > going to increase (and we have to allow for this type of file) have any
> > > of you come across this problem and did you devise a workaround for it?
>
> > > Our application accepts CVs from candidates each of which has to be
> > > indexed.
>
> > > We are running on RHEL4 / 10gR2.
>
> > > cheers
>
> > > --
> > > jeremy
>
> > Save as xml?
>
> If this can be done within Oracle then yes - how do we do it? If not
> then it's a bit impractical - candidates will attach their CV document
> and won't want to faff around with saving in different versions.

Why not design your app so that, whatever format the input is submitted in, you convert it to a standard format before trying to load and index it in Oracle?
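
For what it's worth, here is a rough sketch of that kind of conversion step, done in the application tier before the document ever reaches the database. It relies on the fact that a .docx file is a ZIP archive whose main body lives in word/document.xml. The function names docx_to_text and normalize_upload are made up for illustration, the UTF-8 fallback for other formats is an assumption, and nothing here is specific to Oracle; the resulting plain text is what you would then store and index.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(data: bytes) -> str:
    """Return the plain text of a .docx file given its raw bytes."""
    # A .docx is a ZIP archive; the main body is word/document.xml.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each <w:p> is a paragraph; its text is split across <w:t> runs.
    for para in root.iter(W_NS + "p"):
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def normalize_upload(filename: str, data: bytes) -> str:
    """Hypothetical dispatcher: turn an uploaded CV into plain text.

    Only .docx gets special handling here; anything else is assumed
    to be text already and is decoded as UTF-8 (an assumption).
    """
    if filename.lower().endswith(".docx"):
        return docx_to_text(data)
    return data.decode("utf-8", errors="replace")
```

This only pulls paragraph text from the main document part; headers, footers, footnotes and text boxes would need extra handling, so treat it as a starting point rather than a complete converter.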

>
> I understand that docx is actually "zipped" xml - so also wondering if
> there is anything in Oracle 10g that might support "unzipping" a BLOB?
>

Have you looked at the documentation?

Received on Sat Mar 15 2008 - 09:11:25 CDT