Re: Intermedia

From: Thomas Hesse <thomas.hesse_at_team.xtra.co.nz>
Date: Wed, 16 Jul 2003 08:21:15 +1200
Message-ID: <3F14623B.60205_at_team.xtra.co.nz>


Hi Ludwig,

  • have a look at technet.oracle.com -> products -> Oracle Text there are some performance faq's which may help you
  • for queries like '%token%' there are only 2 possibilities:
    1. do not allow them
    2. you can add a substring index: Ctx_Ddl.Set_Attribute( 'my_wordlist', 'substring_index', 'true'); which will add another internal table ($P) which contains data like:

select pat_part1, pat_part2 from dr$quick_text$p

   order by length(pat_part2) desc;

  • PAT_PART1 PAT_PART2
  • ---------- ----------
  • PHENOMENON
  • P HENOMENON
  • PH ENOMENON
  • PHE NOMENON
  • PHEN OMENON
  • PHENO MENON
  • PHENOM ENON
  • PHENOME NON
Oracle will then convert a search with '%token%' to 'token' and search withing that $P table.
BUT this $P table will be huge (about #rows($I table) * avg(length(token))

Thomas

Ludwig Maier wrote:
> hi folks!
>
> i've got the problem to maintain a oracle-text index wich is about 20gig
> big. what kind of packages or tool should i use to reorganize this index and
> how can i look how many waste there's in my index?
> the main problem is the performance when searching documents with wildcard
> on the lefht side (like "*test*"), do you have any suggsestions on this?
> thanks in advance
Received on Tue Jul 15 2003 - 22:21:15 CEST

Original text of this message