Re: Oracle Text Indexes on WORD and pdf files

From: zigzagdna <>
Date: Sun, 9 May 2010 13:42:18 -0700 (PDT)

On May 9, 1:42 pm, "Vladimir M. Zakharychev" <> wrote:
> On May 9, 7:14 am, zigzagdna <> wrote:
> > I am on Oracle 11g on HP-UX 11i.
> > Can one create Oracle Text indexes on Microsoft Word and PDF files? I
> > can understand text indexes on text files, but how do text indexes
> > work on “binary” files such as Word and PDF files? I have been told
> > they work on these files as well; I am just curious how Oracle manages
> > to parse such files.
> Oracle Text uses external converters (filters) to parse them and convert
> them to HTML or XML. Extproc functionality is heavily used for that.
> Since file formats evolve over time, so do these external filters, so if
> you're on 11 you'd better use the filters native to that version
> (actually I can't see how you could use 9i filters with 11 anyway.)
> Regards,
>    Vladimir M. Zakharychev
>    N-Networks, makers of Dynamic PSP(tm)
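
To make the "convert first, then index" idea above concrete, here is a rough Python sketch. It is not Oracle's implementation: the TOY byte format, toy_filter, html_to_text and build_index are all hypothetical names invented for illustration. The only point is that the indexer never looks at the binary file itself, only at the marked-up text that a format-specific filter produces from it.

```python
import html
import re
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical toy "binary" format: a 4-byte magic header followed by
# UTF-16-LE text. Real Word/PDF files are far more complex.
MAGIC = b"TOY\x00"


def toy_filter(raw: bytes) -> str:
    """Stand-in for an external converter: binary document in, HTML out."""
    if not raw.startswith(MAGIC):
        raise ValueError("unsupported document format")
    text = raw[len(MAGIC):].decode("utf-16-le")
    return "<html><body><p>" + html.escape(text) + "</p></body></html>"


class _TextExtractor(HTMLParser):
    """Collects the character data of an HTML document, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(markup: str) -> str:
    """Strip the markup so that only indexable text remains."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.chunks)


def build_index(docs: dict) -> dict:
    """Build a simple inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, raw in docs.items():
        # Step 1: the filter turns the binary document into HTML.
        # Step 2: the indexer tokenises the text of that HTML.
        text = html_to_text(toy_filter(raw))
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(doc_id)
    return dict(index)


if __name__ == "__main__":
    docs = {
        "a.doc": MAGIC + "Oracle Text indexes".encode("utf-16-le"),
        "b.pdf": MAGIC + "PDF text & filters".encode("utf-16-le"),
    }
    for token, ids in sorted(build_index(docs).items()):
        print(token, sorted(ids))
```

In this sketch, supporting a new file format means adding another filter function; the tokenising and indexing code stays the same. That is also why the filters have to keep pace with the file formats, as noted in the quoted reply.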

Thanks a lot. Oracle 11g still provides the older Oracle Text, and that is what I installed in my Oracle 11g database instance.

