Re: Intermedia

From: Thomas Hesse <thomas.hesse_at_team.xtra.co.nz>
Date: Wed, 16 Jul 2003 08:21:15 +1200
Message-ID: <3F14623B.60205_at_team.xtra.co.nz>

Hi Ludwig,

have a look at technet.oracle.com -> products -> Oracle Text there are some performance faq's which may help you
for queries like '%token%' there are only 2 possibilities:
1. do not allow them
2. you can add a substring index: Ctx_Ddl.Set_Attribute( 'my_wordlist', 'substring_index', 'true'); which will add another internal table ($P) which contains data like:

select pat_part1, pat_part2 from dr$quick_text$p

order by length(pat_part2) desc;

PAT_PART1 PAT_PART2
---------- ----------
PHENOMENON
P HENOMENON
PH ENOMENON
PHE NOMENON
PHEN OMENON
PHENO MENON
PHENOM ENON
PHENOME NON

Oracle will then convert a search with '%token%' to 'token' and search withing that $P table.
BUT this $P table will be huge (about #rows($I table) * avg(length(token))

Thomas

Ludwig Maier wrote:
> hi folks!
>
> i've got the problem to maintain a oracle-text index wich is about 20gig
> big. what kind of packages or tool should i use to reorganize this index and
> how can i look how many waste there's in my index?
> the main problem is the performance when searching documents with wildcard
> on the lefht side (like "*test*"), do you have any suggsestions on this?
> thanks in advance
Received on Tue Jul 15 2003 - 22:21:15 CEST

This message: [ Message body ]
Next message: Syltrem: "Re: Sure-fire "kill""
Previous message: Daniel Morgan: "Re: Sure-fire "kill""
In reply to Ludwig Maier: "Intermedia"
Next in thread: Anton Buijs: "Re: Intermedia"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message