Re: Oracle Text Indexes on WORD and pdf files

From: zigzagdna <zigzagdna_at_yahoo.com>
Date: Sun, 9 May 2010 13:42:18 -0700 (PDT)
Message-ID: <2bc46705-6097-412b-b8d4-557ce5c18c80_at_37g2000yqm.googlegroups.com>



On May 9, 1:42 pm, "Vladimir M. Zakharychev" <vladimir.zakharyc..._at_gmail.com> wrote:
> On May 9, 7:14 am, zigzagdna <zigzag..._at_yahoo.com> wrote:
>
> > I am on Oracle 11g on HP-UX 11i.
> > Can one create Oracle Text indexes on Microsoft Word and PDF files? I
> > can understand text indexes on text files, but how do text indexes
> > work on “binary” files such as Word and PDF files? I have been told
> > they work on these files as well; I am just curious how Oracle manages to
> > parse such files.
>
> They use external converters to parse them and convert them to
> HTML or XML. Extproc functionality is heavily used for that. Since
> file formats evolve over time, so do these external filters, so if
> you're on 11 you'd better use filters native to this version (actually I
> can't see how you could use 9i filters with 11 anyway).
>
> Regards,
>    Vladimir M. Zakharychev
>    N-Networks, makers of Dynamic PSP(tm)
>    http://www.dynamicpsp.com
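
To make the filtering step described above concrete, here is a minimal sketch in Python that only builds the SQL text for a CONTEXT index using the automatic document filter (CTXSYS.AUTO_FILTER in 11g) and for a CONTAINS query. The table name docs, the column names id and body, and the index name docs_idx are hypothetical and used only for illustration; the sketch assumes the documents are stored in a BLOB column and does not connect to a database.

```python
def context_index_ddl(index_name: str, table: str, column: str) -> str:
    """Build DDL for an Oracle Text CONTEXT index on a column of binary documents.

    AUTO_FILTER asks Oracle Text to run its bundled document filters
    (Word, PDF, and so on) to extract indexable text at indexing time.
    Identifiers are interpolated as-is, so pass only trusted names.
    """
    return (
        f"CREATE INDEX {index_name} ON {table} ({column}) "
        "INDEXTYPE IS CTXSYS.CONTEXT "
        "PARAMETERS ('FILTER CTXSYS.AUTO_FILTER')"
    )


def contains_query(table: str, column: str) -> str:
    """Build a full-text query; :term is a bind variable for the search word.

    The id column is a hypothetical primary key used for illustration.
    """
    return f"SELECT id FROM {table} WHERE CONTAINS({column}, :term) > 0"


if __name__ == "__main__":
    # Hypothetical names: a docs table with a BLOB column named body.
    print(context_index_ddl("docs_idx", "docs", "body"))
    print(contains_query("docs", "body"))
```

The generated statements could then be run from SQL*Plus or any client; the query side looks the same whether the indexed column holds plain text or filtered Word and PDF documents.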

Vladimir:
Thanks a lot. Oracle 11g still provides the older Oracle Text; that's what I installed inside my Oracle 11g database instance.

Received on Sun May 09 2010 - 15:42:18 CDT
