Oracle FAQ Your Portal to the Oracle Knowledge Grid

Home -> Community -> Usenet -> c.d.o.misc -> Re: Intermedia index

Re: Intermedia index

From: Vladimir M. Zakharychev <bob_at_dpsp-yes.com>
Date: Thu, 28 Feb 2002 20:02:04 +0300
Message-ID: <a5lnlc$5jb$1@babylon.agtel.net>


create index ... indextype is ctxsys.context defaults to null_filter unless you have redefined the default filter. null_filter is only suitable for plain-text documents, and PDFs are not plain text (they use compressed streams to save space), so filtering PDFs with null_filter is likely to produce an unusable index. Create the index with

create index ... indextype is ctxsys.context parameters('filter inso_filter nopopulate');

which will complete almost immediately thanks to the NOPOPULATE parameter, and then populate it with a background ctx_ddl.sync_index call. The index will be queryable while it is being synchronized, unlike the case where you create and populate it in the same create index statement; in that case it is marked LOADING until all documents are indexed.
You may also want to tweak some preferences for the index before you create it. For example, disabling theme indexing, if you only intend to search for keywords, will greatly improve indexing and search performance. And be sure to apply the latest patchset to your Oracle server (as always ;)
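
Put together, the sequence might look roughly like the sketch below. It reuses the table, column and index names from the original question. The preference name attach_lexer is hypothetical, and the filter name is written exactly as in the statement above. The sketch assumes the connected user is allowed to execute CTX_DDL and that, as described, the sync call picks up the documents to be indexed; treat it as an illustration rather than a tested script.

```sql
-- Hypothetical lexer preference (name attach_lexer is made up for this sketch):
-- a BASIC_LEXER with theme indexing switched off, for keyword-only searches.
begin
  ctx_ddl.create_preference('attach_lexer', 'BASIC_LEXER');
  ctx_ddl.set_attribute('attach_lexer', 'index_themes', 'NO');
end;
/

-- Create the index without populating it; this should return quickly.
create index CTX_ATTACH_PDF_NOTICE_LB on ATTACH(PDF_NOTICE_LB)
  indextype is ctxsys.context
  parameters('filter inso_filter lexer attach_lexer nopopulate');

-- Populate the index afterwards (for example from a background job).
begin
  ctx_ddl.sync_index('CTX_ATTACH_PDF_NOTICE_LB');
end;
/

-- Keyword query from the original question.
select attach_id from attach where contains(PDF_NOTICE_LB, 'test') > 0;
```

Keeping the lexer preference separate from the create index statement means the index can later be dropped and recreated with the same settings.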

-- 
Vladimir Zakharychev (bob@dpsp-yes.com)                http://www.dpsp-yes.com
Dynamic PSP(tm) - the first true RAD toolkit for Oracle-based internet applications.
All opinions are mine and do not necessarily go in line with those of my employer.


"Olivier VEIT" <oveit_at_infeurope.lu> wrote in message news:a5ljac$l55$1_at_wanadoo.fr...

> Hello,
>
> I have a column PDF_NOTICE_LB BLOB in my table ATTACH containing PDF files
> which contain text in FR, DE, IT and EN.
> To index this column, I simply execute:
> create index CTX_ATTACH_PDF_NOTICE_LB on ATTACH(PDF_NOTICE_LB) indextype is
> ctxsys.context;
>
> This command runs but is very slow (45 min to index two PDFs of
> 170 KB each). And after indexing, I can't find anything when I use:
> select attach_id from attach where contains(PDF_NOTICE_LB,'test') >0;
>
> If I try to index a lot of PDF files, Oracle uses 99% of the processor but seems to
> do nothing.
> It seems that the INSO filters are OK (I used the ctxhx utility on one of my
> PDF documents to test).
>
> Does anyone have an idea which could help me... or better: a solution ;-)
>
> FYI: I'm using Oracle 8i (8.1.7) on Linux
>
> Thank you very much
>
>
Received on Thu Feb 28 2002 - 11:02:04 CST
