Re: Creating Oracle Text Indexes

From: SEJ <sejprv_at_hotmail.com>
Date: Thu, 5 Mar 2009 19:36:02 +0100
Message-ID: <49b01bca$0$90276$14726298_at_news.sunsite.dk>


"Jeremy" <jeremy0505_at_gmail.com> skrev i en meddelelse news:MPG.24191d206a824e229896eb_at_News.Individual.NET...
> Hi this is on 10gR2 - we have a table with around 240,000 documents
> (word, pdf, html).
>
> We have a text index created on it using the following settings
> (generated by executing the procedure):
>
> ---------------------------------------------------------
>
> create or replace procedure wd_rebuild_cv_text_index_blob
> (p_tablespace in varchar2 default 'WD_I')
> is
>
> begin
>
> begin
> execute immediate 'begin ctx_ddl.drop_preference
> (''wd_cand_doc_lob_store''); end; ';
> exception
> when others then
> null;
> end;
>
> begin
> execute immediate 'drop index wd_cand_doc_lob_text';
> exception
> when others then
> null;
> end;
>
> execute immediate 'begin ctx_ddl.create_preference
> (''wd_cand_doc_lob_store'', ''BASIC_STORAGE''); end; ';
>
> execute immediate 'begin ctx_ddl.set_attribute
> (''wd_cand_doc_lob_store'', ''I_TABLE_CLAUSE'',
> ''tablespace '||p_tablespace||' storage (initial
> 500m next 10m maxextents 200)''); end; ';
> execute immediate 'begin ctx_ddl.set_attribute
> (''wd_cand_doc_lob_store'', ''K_TABLE_CLAUSE'',
> ''tablespace '||p_tablespace||' storage (initial
> 1K)''); end; ';
> execute immediate 'begin ctx_ddl.set_attribute
> (''wd_cand_doc_lob_store'', ''R_TABLE_CLAUSE'',
> ''tablespace '||p_tablespace||' storage (initial
> 1K)''); end;';
> execute immediate 'begin ctx_ddl.set_attribute
> (''wd_cand_doc_lob_store'', ''N_TABLE_CLAUSE'',
> ''tablespace '||p_tablespace||' storage (initial
> 1K)''); end;';
> execute immediate 'begin ctx_ddl.set_attribute
> (''wd_cand_doc_lob_store'', ''I_INDEX_CLAUSE'',
> ''tablespace '||p_tablespace||' storage (initial
> 100m next 10m maxextents 500)''); end; ';
> execute immediate 'create index wd_cand_doc_lob_text
> on wd_cand_doc_lob (doc_blob)
> indextype is ctxsys.context
> parameters ( ''TRANSACTIONAL SYNC(EVERY "SYSDATE+
> 15/1440") storage wd_cand_doc_lob_store'' )';
> exception
> when others then
> dbms_output.put_line('error');
> dbms_output.put_line(sqlerrm(sqlcode));
> end;
> /
> sho err
> ---------------------------------------------------------
>
> We have run
>
> SELECT AVG(COUNT(*))
> FROM DR$wd_web_pages_imt$I
> GROUP BY TOKEN_TEXT HAVING COUNT(*) > 1;
>
> The documentation says anything beyond 10 would be an indication that
> you should refresh the index. On the database the score is a staggering
> 31.8.
>
> Using ctx_report.index_stats we see:
>
> indexed documents: 234,324
> allocated docids: 238,845
> $I rows: 16,635,689
>
> total size of $I data: 371,884,061 (354.66 MB)
>
> $I rows: 16,635,689
> estimated $I rows if optimal: 1,976,853
> estimated row fragmentation: 88 %
>
> garbage docids: 4,521
> estimated garbage size: 5,842,571 (5.57 MB)
>
>
> now this does not look good.
>
>
> Index query performance is extremely poor.
>
> Could anyone suggest some "best practise" for how we might be able to
> set the various preferences to create the indexes with minimal
> fragmentation?
>
> The server on which we are running this is running Linux RHEL4 with 4GB
> RAM.
>
>
>
> --
> jeremy

hi Jeremy

You should create a db job and optimize your index. There are several ways to do that, try google text dbms_job optimize and you will find some answers.

Oracle has some undocumentet unsupporteded features !? in older version you could optimize the index online in standard edt.

Never drop/recrate in production your maxmem size will never could do it better than optimize.
In our environment dr$ .. $i sometimes contains 100 mill records (poor ocr) And text/intermedia isent google when you combine with your master table.

In our application we have a max hit value beside value for wildcard maxterm,

exec ctx_ddl.optimize_index('ifs_text', 'FAST');

exec ctx_ddl.optimize_index('ifs_text', 'FULL', maxtime);

regards SEJ Received on Thu Mar 05 2009 - 12:36:02 CST

Original text of this message