Re: Intermedia

From: Anton Buijs <remove_aammbuijs_at_xs4all.nl>
Date: Tue, 15 Jul 2003 22:44:56 +0200
Message-ID: <3f1467b1$0$49117$e4fe514c_at_news.xs4all.nl>


[Quoted] Ludwig Maier <dj_nail_at_gmx.net> wrote in news message 3f13ea6e$0$20558$91cee783_at_newsreader01.highway.telekom.at...
| hi folks!
|
| I've got the problem of maintaining an Oracle Text index which is about
| 20 GB in size. What kind of packages or tools should I use to reorganize
| this index, and how can I see how much waste there is in it?
| The main problem is the performance when searching documents with a
| wildcard on the left side (like "*test*"). Do you have any suggestions?
| Thanks in advance
|

I can't say whether any of this applies to your situation, but this was my experience.
First: we are talking about Intermedia Text (formerly called Context) indexes, aren't we, not conventional B-tree indexes.
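Just to make sure we mean the same kind of index: a context index is the kind you query with contains(). A rough sketch of such a double-wildcard search (table and column names are made up):

   select doc_id
   from   documents
   where  contains(doc_text, '%test%') > 0;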

We found that the way you analyze the table that has the context index makes a big difference.
Searches on a two-character string with a wildcard ("xx%") could easily run for two hours with millions of consistent gets. It was the database behind a public website; when three searches of this kind were running, the machine was overloaded and the website came almost to a halt. The statistics were gathered with "analyze table compute statistics", which automatically includes all indexes. Then I did

   analyze table ... delete statistics;
   analyze table ... compute statistics for table only;
   analyze index ... compute statistics;   -- for each index BUT NOT the context index
Searches then ended in seconds, with far fewer consistent gets of course. Problem solved.
The user_indexes.last_analyzed date was always null for the context index, of course: the information stored in the dictionary is more like a placeholder for the index definition. For a context index there is no real index in the B-tree sense, but a set of DR$ tables that store the tokens etc. It is impossible to gather statistics for a context index, yet in some way it had an impact. We reported this to OTS of course, but since we had a workaround they did not spend much time on it, perhaps also because it was version 8.0.6 on Solaris (still supported at the time). The database now runs 8.1.7.3 and I have never tried the old, badly performing setup again; why should I?
For the sake of completeness: in line with Oracle's advice, the DR$ tables must not be analyzed.
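If you want to check this in the dictionary yourself, a simple sketch against the standard views (a context index shows up with index_type 'DOMAIN' and ityp_owner 'CTXSYS', and its last_analyzed stays null):

   select index_name, ityp_owner, ityp_name, last_analyzed
   from   user_indexes
   where  index_type = 'DOMAIN';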

The context indexes need to be optimized on a regular basis (with ctx_ddl.optimize_index(.....)). We do a fast one once a week. There is a full and a fast optimize; only the full optimize removes tokens for deleted rows of the table from the context index.
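A rough sketch of the two variants (the index name is made up):

   begin
      -- fast optimize: compacts fragmented rows in the token table,
      -- but does not remove garbage from deleted rows
      ctx_ddl.optimize_index('my_ctx_index', 'FAST');
      -- full optimize: also removes garbage from deleted/updated rows (heavy on redo)
      -- ctx_ddl.optimize_index('my_ctx_index', 'FULL');
   end;
   /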
A full optimize can run for a long time, especially the first time after the context index is created, and it generates A LOT of redo, so if the database is in archive log mode make sure there is sufficient disk space for the archives. I have no idea how to find the wasted space (in the DR$ tables) of a context index.
Maybe the same way as you would for a regular table: analyze it just for the sake of knowing avg_row_len, count the rows, add the overhead and compare that to the table size.
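If you try that, a sketch of the comparison (it only says something if the DR$ tables have statistics, which goes against the advice above, so treat it as a rough indication at best):

   select t.table_name,
          t.num_rows * t.avg_row_len as estimated_used_bytes,   -- needs statistics on the DR$ tables
          s.bytes                    as allocated_bytes
   from   user_tables   t,
          user_segments s
   where  s.segment_name = t.table_name
   and    t.table_name like 'DR$%';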
