Re: Oracle Text / Office 2007 question / 10gR2

From: Frank van Bortel <frank.van.bortel_at_gmail.com>
Date: Mon, 17 Mar 2008 21:24:40 +0100
Message-ID: <95$47ded389$524b5c40$5724@cache3.tilbu1.nb.home.nl>


BicycleRepairman wrote:
> Ouch! You're right there, being skewered on the horns of a dilemma.
> Oracle licenses the filters: in 10g and previous using Stellent
> (INSO), and using Autonomy (Verity) in 11g.

There was a switch of filter vendor between 10g Rel1 and Rel2.

>[snip!]

> So, if I was in your shoes, I'd either use automation to convert
> the .docx to PDF, then insert the PDF, or use automation to crack open
> the .docx file (actually, a zipped set of XML files) and insert/index
> the document.xml file. The first path is easy and not complicated, but
> you'll need Word 2007 to perform its magic. The second path is a lot
> more risky, but it would be interesting to see how difficult it is to
> handle. In terms of being able to index/search for keywords, it might
> not be that bad at all.

You do realize PDFs are indexed without problem? I fail to see the
benefit of "cracking open" the XML once you have the PDF.
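
For anyone who wants to try the second path anyway, here is a minimal
sketch in Python of opening the .docx as a zip archive and pulling the
text out of word/document.xml. The names docx_to_text and W_NS are
made up for illustration, and the approach assumes a plain document
(nested content such as text boxes may need extra handling), so treat
it as a starting point rather than a finished converter.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by Office 2007 .docx files.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(source):
    """Return the plain text of a .docx, one line per paragraph.

    `source` is a file path or a binary file-like object.
    (Hypothetical helper, not part of any Oracle or Office API.)
    """
    # A .docx is a zip archive; the main body lives in word/document.xml.
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each <w:p> is a paragraph; its visible text sits in <w:t> elements.
    for para in root.iter(W_NS + "p"):
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

The resulting string could then be stored and indexed like any other
plain-text document, which sidesteps the binary filter altogether.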
-- 

Regards,
Frank van Bortel

Top-posting in UseNet newsgroups is one way to shut me up
Received on Mon Mar 17 2008 - 15:24:40 CDT
