Home » Server Options » Text & interMedia » How to Retrive Clob fields Faster
How to Retrive Clob fields Faster [message #125355] Fri, 24 June 2005 12:10 Go to next message
genialsenthil
Messages: 26
Registered: June 2005
Location: Chennai
Junior Member
Hi,
Help me to retrive results faster in oracle. I am using CLOB field to store document & richtext files. I have created CTXSYS.CONTEXT in my table on the clob field.

The following is my Db(Oracle) and Query Details...

Table Name : Tbl_Resume (Over 178000 Records operating)
=======================
CANCODE NOT NULL VARCHAR2(9) == Candidate Id
RESDRIVE VARCHAR2(100) == Resume Path Domain Id
RESPATH VARCHAR2(200) == Resume Drive Location
RESCONT VARCHAR2(100) == Resume File Name
FILECONT CLOB == Resume File Stored Field Name
KEYENTRYDET NUMBER(2) == Key Entry Identification Number
KEYENTRYBY VARCHAR2(50) == Entry User Name
KEYMODBY VARCHAR2(50) == Modified User Name

Query to Search Keywords in Resume
==================================
I have tried out the following queries..

1. Select count(CanCode) from Tbl_Resume Where upper(FileCont) Like upper('%search_text%');
- takes long time around 30 mins
2. Select count(CanCode) from Tbl_Resume Where dbms_lob.instr(FileCont,'search_text',1,1)>0;
- takes long time around 30 mins
3. I have tried out 'Contains' operator in this. But i have get the result as no rows selected. The Query as follows....
Select count(cancode) from tbl_resume where Contains(filecont,'a')>0;
--- will always return count(cancode)=0;

If I search with same thing with some other where class condition (filter more resumes i will get result quickly)

Please Let me Know Any other option to Search the keywords in the resume file content. I have to finish these KeyWord Search as quick as possible. So pls do your best ....

With Thanks and Regards,
Senthil Kumar.
Re: How to Retrive Clob fields Faster [message #125372 is a reply to message #125355] Fri, 24 June 2005 14:53 Go to previous message
Barbara Boehmer
Messages: 7974
Registered: November 2002
Location: California, USA
Senior Member
Query #3 is the way to go, as long as you are not searching for common words like 'a', 'the', 'and', 'of', 'is', and so on. If you searched for a tokenized keyword, then you should have gotten results. Please see the demonstration below, which shows which words are tokenized and which are ignored.

-- table and data for demonstration:
scott@ORA92> CREATE TABLE tbl_resume
  2    (CANCODE     VARCHAR2(9),
  3  	FILECONT    CLOB)
  4  /

Table created.

scott@ORA92> INSERT ALL
  2  INTO tbl_resume (cancode, filecont) VALUES (1, 'I like using Oracle databases.')
  3  INTO tbl_resume (cancode, filecont) VALUES (2, 'Oracle is a good company.')
  4  INTO tbl_resume (cancode, filecont) VALUES (3, 'Some of the words are ignored.')
  5  SELECT * FROM DUAL
  6  /

3 rows created.


-- index:
scott@ORA92> CREATE INDEX tbl_resume_keywords_idx
  2  ON tbl_resume (filecont)
  3  INDEXTYPE IS CTXSYS.CONTEXT
  4  /

Index created.


-- words that Oracle tokenized, ignoring common words like 'a':
scott@ORA92> SELECT token_text FROM dr$tbl_resume_keywords_idx$i
  2  /

TOKEN_TEXT
----------------------------------------------------------------
COMPANY
DATABASES
GOOD
I
IGNORED
LIKE
ORACLE
USING
WORDS

9 rows selected.


-- query to find tokenized keyword:
scott@ORA92> COLUMN filecont FORMAT A30 WORD_WRAPPED
scott@ORA92> SELECT cancode, filecont, SCORE (1)
  2  FROM   tbl_resume
  3  WHERE  CONTAINS (filecont, 'Oracle', 1) > 0
  4  /

CANCODE   FILECONT                         SCORE(1)
--------- ------------------------------ ----------
2         Oracle is a good company.               4
1         I like using Oracle databases.          4


-- will not find ignored word like 'a':
scott@ORA92> COLUMN filecont FORMAT A30 WORD_WRAPPED
scott@ORA92> SELECT cancode, filecont, SCORE (1)
  2  FROM   tbl_resume
  3  WHERE  CONTAINS (filecont, 'a', 1) > 0
  4  /

no rows selected

Previous Topic: Search within an arabic/urdu document using Oracle Text.
Next Topic: Oracle Text index on multiple tables
Goto Forum:
  


Current Time: Mon Sep 01 15:12:06 CDT 2014

Total time taken to generate the page: 0.15094 seconds