Re: Help understanding an oracle text search term

From: Bobby Z. <vladimir.zakharychev_at_gmail.com>
Date: Fri, 8 May 2009 02:37:39 -0700 (PDT)
Message-ID: <fea811eb-ac60-44c6-baa9-73b2b376bcba_at_q14g2000vbn.googlegroups.com>



On May 6, 6:47 pm, kcbo..._at_gmail.com wrote:
> On May 6, 3:24 am, "Bobby Z." <vladimir.zakharyc..._at_gmail.com> wrote:
>
>
>
> > On May 6, 12:35 am, kcbo..._at_gmail.com wrote:
>
> > > I have a varchar2 column (col2) and I've created an CTXSYS.CONTEXT
> > > index for it (using the empty stoplist).
> > > I'm running a query that has
> > >    ... AND  CONTAINS(col2, 'G"%') > 0
>
> > > This results in an error "DRG-51030: wildcard query expansion resulted
> > > in too many terms"
> > > But if I change the search term to 'G%' (without the double-quote)
> > > then I don't get the error.
>
> > > I'm trying to understand why the double-quote causes the error and
> > > what it means in a CONTAINS search.
> > > Is a double-quote special in a CONTAINS search?
>
> > > Thanks,
> > > Kevin.
>
> > Oracle version? And yes, double-quotes have special meaning in Text
> > queries, especially for XML-specific HASPATH/INPATH functions. Since
> > double-quote is a punctuation character and as such can not be part of
> > any word, your 'G"%' query is actually meaningless, so no wonder the
> > query parser throws DRG-51030 back at you.
>
> Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit
> Production
>
> Your answer does help but I'm still looking for more detail.
> For example, if I change the search term to '"G%'  (that is a single-
> quote, followed
> by a double-quote, followed by the G character, etc.) then by your
> reasoning the
> query is also meaningless but it doesn't give me an error and returns
> results.  Why
> would that be?
>
> I'm not using the HASPATH or INPATH functions.  I've been reading the
> Oracle Text Reference
> and Application Developers Guide (b14217 and b14218) and I haven't
> found any mention
> of how punctuation is handled - specifically how double-quotes are
> treated.
> I'm about to read more about the lexer as that might give me more
> information.
>
> Kevin.

For the purposes of text indexing punctuation is generally treated as whitespace and is ignored (however, it may be used for theme indexing.) When querying, double-quotes may be used for value testing in XML queries (like 'term INPATH (A[_at_B="value"])', which would search XML documents for term in A top-level elements which has B attribute whose value is 'value',) but otherwise they have no meaning and query parser reaction on them is generally undefined - it may ignore them or throw an error. For example, '"""G%' would work as if the leading double-quotes are not there (they are ignored,) but 'G"%', as we know, throws an exception. If you think that this particular behavior you're seeing is incorrect, I'd suggest that you open a SR with Oracle Support.

Regards,

   Vladimir M. Zakharychev
   N-Networks, makers of Dynamic PSP(tm)    http://www.dynamicpsp.com Received on Fri May 08 2009 - 04:37:39 CDT

Original text of this message