Oracle text question - marking-up HTML documents

From: jeremy <>
Date: Wed, 28 May 2008 06:25:32 -0700 (PDT)
Message-ID: <>


This is a problem on 10gR2 (Windows 2000 server).

We have a table with a CLOB containing HTML documents.

We have created a text index on it:

create index my_text_index
  on my_tab
  indextype is ctxsys.context
  parallel (degree 1)
  parameters ('filter ctxsys.null_filter section group ctxsys.html_section_group');

Text searches work as expected, but when displaying the marked-up result document we get the wrong terms highlighted. For example, using this:

                 textkey   => to_char(id),
                 restab    => mklob,
                 tagset    => 'HTML_DEFAULT',
                 starttag  => '<font size=+1 color="red"> <b>',
                 endtag    => '</b></font>');

Results in:

<TD valign=top class=icams-field-prompt>What was the primary
business<font size=+1 color="red"> <b> of your</b></font> employer?</ TD>

<TD valign=top class=field-text>99</TD>
<TD valign=top class=field-prompt>Duration of employment (months)</TD>
<TD valign=top class=field-text>3</TD>

You will see that CTX_DOC.MARKUP has highlighted the words "of your" as opposed to the word "duration".

If I simply change the call to CTX_DOC.MARKUP to use the additional parameter
plaintext => false
then the markup is correct (though of course without any formatting).
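
For anyone trying to reproduce this, a complete call might look roughly like the sketch below. It uses the in-memory (CLOB) form of CTX_DOC.MARKUP with the index name from above. The key value 1 and the query string 'duration' are placeholders chosen for illustration, and it assumes the text key is the table's primary key.

declare
  mklob clob;
  v_id  number := 1;              -- placeholder key value (assumption)
begin
  ctx_doc.markup(
    index_name => 'my_text_index',
    textkey    => to_char(v_id),
    text_query => 'duration',     -- placeholder query term (assumption)
    restab     => mklob,
    tagset     => 'HTML_DEFAULT',
    starttag   => '<font size=+1 color="red"> <b>',
    endtag     => '</b></font>');

  -- Show the first part of the marked-up document
  dbms_output.put_line(dbms_lob.substr(mklob, 4000, 1));

  -- The result CLOB is a temporary LOB, so release it when done
  dbms_lob.freetemporary(mklob);
end;
/

Running it once as shown and once with the plaintext parameter added makes it easy to compare the two marked-up results side by side.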

Has anyone come across this behaviour before and what are we doing wrong?

Many thanks

Received on Wed May 28 2008 - 08:25:32 CDT

