Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.misc -> Re: Oracle Text Design Question
Hi Kevin,
"Kevin Crosbie" <caoimhinocrosbai_at_yahoo.com> wrote
> Doing this now... also rebuilding the index before hand, just to be sure.
> I've taken an export into another schema so I can play with a subset of my
> data (2 Million rows).
>
Please note that statistics at the table level ARE NOT enough. This is
because an Oracle Text index generates a lot of auxiliary tables which are
queried MANY times under the hood before or even during your query
execution, and the explain plan of the originating query will not show
this. I found this out by getting exclusive access to my DB, monitoring
all queries, and then running the Oracle Text query: this reveals all
the tiny queries which, in my experience, are also improved by computing
statistics on them.
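One way to cover those auxiliary tables is to gather statistics on them directly. A minimal sketch, assuming a CONTEXT index named MY_CTX_IDX (a hypothetical name; Oracle Text stores its internal data in tables named DR$<index_name>$I, $K, $N, $R and so on):

```sql
-- Gather statistics on the Oracle Text auxiliary tables behind a
-- CONTEXT index named MY_CTX_IDX (hypothetical index name).
BEGIN
  FOR t IN (SELECT table_name
              FROM user_tables
             WHERE table_name LIKE 'DR$MY_CTX_IDX$%') LOOP
    DBMS_STATS.GATHER_TABLE_STATS(
      ownname => USER,
      tabname => t.table_name,
      cascade => TRUE);  -- also cover the indexes on the DR$ tables
  END LOOP;
END;
/
```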
> I'll try this suggestion out next and let you know how it goes.
>
> It just seems strange to me, I guess I haven't really grasped the overall
> concept of how text indexes work.
> From what I've read in this doc:
> http://www.oracle.com/technology/products/text/x/FAQs/imt_Perf_Faq.html it
> seems that it is designed to work great for single table queries, especially
> when restricted by rownum.
> What doesn't seem clear to me is how it can work efficiently when joining to
> other tables or making a query more complex by adding more clauses.
>
I have an application which uses the most expensive features of Oracle
Text (weighted fuzzy and wildcard searches with boolean logic) in a single
query with many CONTAINS clauses and 7+ tables joined to the indexed
table, tested over more than 300K rows, and I get response times ALWAYS
lower than a second.
The major difference between our setups is that the column I have the CONTEXT index on is a VARCHAR2 and not a CLOB, but I would bet that is not the issue here ... I also had serious performance trouble with Oracle Text even with a VARCHAR2 as the indexed column.
e.g. 3 CONTAINS clauses here ...
select /*+ NOPARALLEL(ft1) */
       to_char(ft1.entity_id)      as id
     , to_char(round(ft1.score))   as score
     , ft1.bestmatch
     , ft1.name_last               as lastname
     , ft1.name_first              as firstname
     , ft1.category
     , ft1.entry_type              as entrytype
     , to_char(ft1.country_id)     as country_id
     , ft1.ctry_name_english       as country
     , ft1.list_abbreviations      as source
     , to_char(ft1.list_readonly)  as readonly
from ( SELECT /*+ NOPARALLEL(t1) NOPARALLEL(t2) FIRST_ROWS DOMAIN_INDEX_NO_SORT */
              t2.entity_id
            , t2.real_name_first   as name_first
            , t2.real_name_last    as name_last
            , t2.bestmatch
            , t2.enty_category     as category
            , t2.ctry_name_english
            , t2.list_readonly
            , t2.list_abbreviations
            , round(( greatest(score_1*decode(occurs_last_flag_1, 1, 10.0, 0)
                      /greatest(1, CASE WHEN c_name_last_wc >= 3 THEN c_name_last_wc
                                        ELSE 3 - c_name_first_wc END
                                 , c_name_last_wc)
              name_id
            , max(decode(token_id, 1, score, 0))            score_1
            , max(decode(token_id, 1, occurs_last_flag, 0)) occurs_last_flag_1
            , max(decode(token_id, 1, decode(occurs_last_flag, 1, 0,
                         occurs_first_flag), 0))            occurs_first_flag_1
            , max(decode(token_id, 2, score, 0))            score_2
            , max(decode(token_id, 2, occurs_last_flag, 0)) occurs_last_flag_2
            , max(decode(token_id, 2, decode(occurs_last_flag, 2, 0,
                         occurs_first_flag), 0))            occurs_first_flag_2
            , max(decode(token_id, 3, score, 0))            score_3
            , max(decode(token_id, 3, occurs_last_flag, 0)) occurs_last_flag_3
            , max(decode(token_id, 3, decode(occurs_last_flag, 3, 0,
                         occurs_first_flag), 0))            occurs_first_flag_3
         FROM ( SELECT term_id
                     , max(score)    as score
                     , max(token_id) as token_id
                  FROM ( select /*+ NOPARALLEL(T_NLM_TERM) FIRST_ROWS DOMAIN_INDEX_NO_SORT */
                                term_id
                              , (100*scoreLevenshtein(TERM, 'OSAMA')) as score
                              , 1 as token_id
                                term_id
                              , (100*scoreLevenshtein(TERM, 'BIEN'))  as score
                              , 2 as token_id
                                term_id
                              , (100*scoreLevenshtein(TERM, 'LADIN')) as score
                              , 3 as token_id
                 GROUP BY term_id ) t1
              , T_NLM_SEARCHTERM_REL t2
         WHERE t1.term_id = t2.term_id
         GROUP BY name_id ) t1
   , ( SELECT /*+ NOPARALLEL(t2) */ t2.* FROM T_NLM_SEARCH t2 ) t2
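On the rownum point from the FAQ you linked: the pattern it describes boils down to an inner ORDER BY SCORE with an outer ROWNUM predicate. A minimal sketch, assuming a hypothetical DOCS table with a CONTEXT index on TEXT_COL:

```sql
-- Hypothetical table DOCS with a CONTEXT index on TEXT_COL.
-- The inner ORDER BY SCORE(1) plus the outer ROWNUM predicate lets
-- Oracle Text fetch just the top hits instead of the whole hitlist.
SELECT *
  FROM (SELECT /*+ FIRST_ROWS(10) */ id, title, SCORE(1) AS score
          FROM docs
         WHERE CONTAINS(text_col, 'oracle', 1) > 0
         ORDER BY SCORE(1) DESC)
 WHERE ROWNUM <= 10;
```

Once you join such a query to other tables or add more non-Text predicates, the optimizer may no longer be able to push that stop-key down into the domain index, which is exactly where accurate statistics (including on the DR$ tables) start to matter.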