Home » Server Options » Text & interMedia » CONTAINS function not working! (split from hijacked thread by bb)
CONTAINS function not working! (split from hijacked thread by bb) [message #438293] Thu, 07 January 2010 21:18 Go to next message
ora22
Messages: 8
Registered: December 2009
Location: BOSTON
Junior Member
hi

Oracle Version : 9.2.0.6.0
Everything Default : Lexer, Stoplist etc ...

the table tab1 has data as follows

ID col1
-- -----------------
1 Jack/Jill
2 Jack & Jill
3 Jack&Jill
4 Jack and Jill
5 Jack is a friend of Jill



I have a search screen in which say the user enters the search as "Jack and Jill"

i know "AND" is a stopword and also a keyword, so i remove it from the search criterion and issue the search

SELECT col1 FROM tab1 WHERE CONTAINS(col1, 'Jack Jill', 1) > 0;

system returns

col1
-----------------
Jack/Jill
Jack & Jill
Jack&Jill

but not the row with ID=4 and col1 = "Jack and Jill"

since and is a stopword and was not indexed - in row 4 - shouldn't oracle text treat as if Jack and Jill were right next to each other ?

can i make it work like that ?

the user would be happy to see the result as

ID col1
-- -----------------
1 Jack/Jill
2 Jack & Jill
3 Jack&Jill
4 Jack and Jill

but not ID 5 as Jack and Jill were far apart ...

Would greatly appreciate your thoughts ?
Re: CONTAINS function not working! (split from hijacked thread by bb) [message #438297 is a reply to message #438293] Thu, 07 January 2010 22:15 Go to previous messageGo to next message
Barbara Boehmer
Messages: 7969
Registered: November 2002
Location: California, USA
Senior Member
Even though the "and" is not indexed, it still counts it as a word. You can enclose the "and" in curly brackets to treat it as a regular word intead of a keyword, but then you will miss the other rows. You can use the NEAR operator to specify within one word. Please see the examples below.

SCOTT@orcl_11g> 
SCOTT@orcl_11g> CREATE table tab1
  2    (ID    NUMBER,
  3  	col1  VARCHAR2 (30))
  4  /

Table created.

SCOTT@orcl_11g> SET DEFINE OFF
SCOTT@orcl_11g> INSERT ALL
  2  INTO tab1 VALUES (1, 'Jack/Jill')
  3  INTO tab1 VALUES (2, 'Jack & Jill')
  4  INTO tab1 VALUES (3, 'Jack&Jill')
  5  INTO tab1 VALUES (4, 'Jack and Jill')
  6  INTO tab1 VALUES (5, 'Jack is a friend of Jill')
  7  SELECT * FROM DUAL
  8  /

5 rows created.

SCOTT@orcl_11g> CREATE INDEX tab1_col1_idx ON tab1 (col1)
  2  INDEXTYPE IS CTXSYS.CONTEXT
  3  /

Index created.

SCOTT@orcl_11g> SELECT token_text FROM dr$tab1_col1_idx$i
  2  /

TOKEN_TEXT
----------------------------------------------------------------
FRIEND
JACK
JILL

SCOTT@orcl_11g> SELECT col1 FROM tab1 WHERE CONTAINS(col1, 'Jack Jill', 1) > 0
  2  /

COL1
------------------------------
Jack/Jill
Jack & Jill
Jack&Jill

SCOTT@orcl_11g> SELECT col1 FROM tab1 WHERE CONTAINS(col1, 'Jack {AND} Jill', 1) > 0
  2  /

COL1
------------------------------
Jack and Jill

SCOTT@orcl_11g> SELECT col1 FROM tab1 WHERE CONTAINS(col1, 'NEAR ((Jack,Jill),1)', 1) > 0
  2  /

COL1
------------------------------
Jack/Jill
Jack & Jill
Jack&Jill
Jack and Jill

SCOTT@orcl_11g> SELECT col1 FROM tab1 WHERE CONTAINS(col1, 'Jack AND Jill', 1) > 0
  2  /

COL1
------------------------------
Jack/Jill
Jack & Jill
Jack&Jill
Jack and Jill
Jack is a friend of Jill

SCOTT@orcl_11g>

Re: CONTAINS function not working! (split from hijacked thread by bb) [message #438302 is a reply to message #438293] Thu, 07 January 2010 23:41 Go to previous messageGo to next message
ora22
Messages: 8
Registered: December 2009
Location: BOSTON
Junior Member
SORRY - POSTING AGAIN WITHOUT TABS THIS TIME ...

Hi Barbara - Thanks v much for your detailed reply

our's is a datawarehousing application (with distict batch processes updating data and no concerns regarding online updates etc)

so i was wondering if this would be a viable solution:

add a column to the table (say col2) - where we remove all STOPWORDS.
ID col1                      col2
-- -----------------         ------------------
1 Jack/Jill                  Jack/Jill
2 Jack & Jill                Jack & Jill
3 Jack&Jill                  Jack&Jill
4 Jack and Jill              Jack Jill
5 Jack is a friend of Jill   Jack friend Jill

Index and Serach on Column col2.

And when the user provides the search string - remove all the STOPWORDS from the search string too (we could use CTXSYS.CTX_STOPWORDS for this).

so if user provides "jack and jill" our modified search string would be "jack jill" and so the query would be :
SELECT col1 FROM tab1 WHERE CONTAINS(col2, 'Jack Jill', 1) > 0

which should give:
ID col1                      col2
-- -----------------         ------------------
1 Jack/Jill                  Jack/Jill
2 Jack & Jill                Jack & Jill
3 Jack&Jill                  Jack&Jill
4 Jack and Jill              Jack Jill

Do you see any holes in this approach ?

Thanks as always !

[EDITED by LF: applied format preserving tags]

[Updated on: Fri, 08 January 2010 00:44] by Moderator

Report message to a moderator

Re: CONTAINS function not working! (split from hijacked thread by bb) [message #438351 is a reply to message #438302] Fri, 08 January 2010 04:40 Go to previous message
Barbara Boehmer
Messages: 7969
Registered: November 2002
Location: California, USA
Senior Member
What if somebody wants to search for the word "Jack" AND the word "Jill" as in the row with id 5?
Previous Topic: Text searching
Next Topic: Best approach for Multi Column text search
Goto Forum:
  


Current Time: Fri Aug 22 17:28:37 CDT 2014

Total time taken to generate the page: 0.09883 seconds