Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Re: Intermedia index

From: Olivier VEIT <oveit_at_infeurope.lu>
Date: Fri, 1 Mar 2002 11:38:29 +0100
Message-ID: <a5nlg0$hgn$1@wanadoo.fr>


Hello,

Thank you very much for your help. Using your information, I wrote this:

begin
  -- Lexer preference: index keywords only, with theme indexing turned off
  ctx_ddl.create_preference('my_lexer', 'BASIC_LEXER');
  ctx_ddl.set_attribute('my_lexer', 'INDEX_TEXT', 'YES');
  ctx_ddl.set_attribute('my_lexer', 'INDEX_THEMES', 'NO');
end;
/

PROMPT index creation...;
create index ctx_attach_notice_pdf_lb on attach (pdf_notice_lb)
  indextype is ctxsys.context
  parameters ('storage users lexer my_lexer filter CTXSYS.INSO_FILTER nopopulate');

Indexing was fast.

My problem: I find nothing in column TOKEN_TEXT of table DR$ctx_attach_notice_pdf_lb$I.
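
Two quick checks may help narrow this down. This is only a sketch: it assumes the index name from the statement above and relies on the CTX_USER_INDEX_ERRORS view, where Oracle Text logs documents it could not index. It may also be worth noting that the Oracle Text documentation describes NOPOPULATE as creating an empty index; if it behaves the same way on this release, that alone could explain both the fast creation and the empty token table.

```sql
-- How many tokens did the index actually store?
select count(*) from dr$ctx_attach_notice_pdf_lb$i;

-- Were any per-document errors logged during indexing?
select err_timestamp, err_textkey, err_text
  from ctx_user_index_errors
 where err_index_name = 'CTX_ATTACH_NOTICE_PDF_LB'
 order by err_timestamp;
```

A count of zero with no logged errors would suggest the index was never populated rather than that filtering failed.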

When I don't use the lexer parameter, indexing takes a lot of time, and I find what looks like binary data in column TOKEN_TEXT of table DR$ctx_attach_notice_pdf_lb$I:

TOKEN_TEXT
ÃzH
ÃzHQ
ÃzHQS
ÃzIÃ'
ÃzJ
ÃzJFS
ÃzJFS
ÃzMÃ?HH
ÃzMÃ?HHS
ÃzNDS

Maybe I have a problem with how my PDF files are saved in the BLOB?
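
One way to test that suspicion, as a rough sketch: look at the first bytes of each stored document. A PDF file normally begins with the signature %PDF, so anything else would point to a problem in how the files were loaded. The table and column names come from the statements in this thread; the column aliases are made up for illustration.

```sql
-- Show the size and the first four bytes (as text) of each stored document.
-- An intact PDF is expected to start with '%PDF'.
select attach_id,
       dbms_lob.getlength(pdf_notice_lb) as size_bytes,
       utl_raw.cast_to_varchar2(dbms_lob.substr(pdf_notice_lb, 4, 1)) as first_bytes
  from attach;
```

If the sizes also differ from the original files on disk, the load step is the more likely culprit than the filter.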

Chris Weiss <chris_at_www.hpdbe.com> wrote in message: a5ljp9$ejn$1_at_msunews.cl.msu.edu...
> Please include the command you used for creating the index. Did you specify
> filtering in the parameter list? Look at the DR$ tables for the tokens from
> the index.
>
> When filtering, if you are only interested in keywords, then you should
> create a context preference and turn *OFF* theme indexing. Your indexing
> time will be substantially reduced.
>
> Also, when you establish synchronization policies, you will get much better
> performance if you can use a DBMS_JOB call no sooner than every 5 minutes to
> resync your indexes. This depends on your update/insert frequency. If the
> database is updated slowly, re-indexing once a day might be sufficient.
>
> The INSO filter is a resource pig. I recently replaced it with a C program
> for HTML files; the results were a 5x+ speedup in filtering, and I overcame
> some bugs I found in the INSO filter. When indexing, the CPU will peg to
> 100%. This is typical behavior.
>
>
> Good Luck!
>
> --
>
> ~~~~~~~~~~~~~~~~
> Chris Weiss
> www.hpdbe.com
> High Performance Database Engineering
> ~~~~~~~~~~~~~~~~
>
>
> "Olivier VEIT" <oveit_at_infeurope.lu> wrote in message
> news:a5ljac$l55$1_at_wanadoo.fr...
> > Hello,
> >
> > I have a column PDF_NOTICE_LB BLOB in my table ATTACH containing PDF files
> > whose text is in FR, DE, IT and EN.
> > To index this column, I simply execute:
> > create index CTX_ATTACH_PDF_NOTICE_LB on ATTACH(PDF_NOTICE_LB) indextype is
> > ctxsys.context;
> >
> > This command runs but is very slow (45 min to index two PDFs of 170 KB
> > each). And after indexing, I can't find anything when I use:
> > select attach_id from attach where contains(PDF_NOTICE_LB,'test') >0;
> >
> > If I try to index a lot of PDF files, Oracle uses 99% of the processor but
> > seems to do nothing.
> > The INSO filter seems to be OK (I used the ctxhx utility on one of my
> > PDF documents to test).
> >
> > Does someone have an idea which could help me... or better: a solution ;-)
> >
> > FYI: I am using Oracle 8i (8.1.7) on Linux.
> >
> > Thank you very much
> >
> >
>
>
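
The periodic resync that Chris suggests in the quoted reply could be sketched roughly as follows. It assumes that CTX_DDL.SYNC_INDEX is available on the release in use and that the job owner has execute rights on CTX_DDL; the index name is the one from this thread, and the variable name l_job is only illustrative.

```sql
declare
  l_job binary_integer;  -- receives the job number assigned by DBMS_JOB
begin
  dbms_job.submit(
    job       => l_job,
    what      => 'ctx_ddl.sync_index(''CTX_ATTACH_NOTICE_PDF_LB'');',
    next_date => sysdate,              -- first run as soon as possible
    interval  => 'sysdate + 5/1440'    -- then every 5 minutes
  );
  commit;  -- the job only becomes visible to the job queue after commit
end;
/
```

For a table that changes slowly, an interval such as 'sysdate + 1' (once a day) would match the advice in the reply.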
Received on Fri Mar 01 2002 - 04:38:29 CST
