Re: Oracle Text and searches across columns
Date: Mon, 4 Feb 2008 02:10:56 -0800 (PST)
Message-ID: <36f440bf-6a7e-40f4-8d60-c5aa86423f37@c4g2000hsg.googlegroups.com>
On Feb 3, 11:34 pm, Jake <jgarfi..._at_earthlink.net> wrote:
> Say if I am making a database of college classes across many colleges,
> where people can search and find classes.
>
> So, for example, each class has a:
>
> level - bachelors, graduate (for example)
> college name- State U, or SU as short name, along with other colleges
> semester - Fall 2008...
> Category - Biology, Math, Music etc.
> course number - BIO 101
> description - a full text description of the course.
> etc.
>
> as you can imagine, all of these things are in many tables in many
> columns. The "full description" that the user sees actually contains
> information from a join of many columns.
>
> What I would like is for the user to be able to search given a text
> phrase, and given a list of hits from all classes where the "full
> description" contains every one of the words they search for. The
> words do not have to be in any particular order. Example searches:
> "Biology 101" "BIO 101" "SU Math Fall 2008"
>
> I haven't seen an Oracle Text example that handles this type of case.
>
> What is the best way to do this? Can any one point me in the right
> direction?
If "full description" is a "computed" column (that is, you dynamically join relevant columns from different sources and present that to the user,) the most straightforward solution is to create a USER_DATASTORE with a PL/SQL procedure that will aggregate all this data into a CLOB and present that to Oracle Text for indexing. For example:
CREATE TABLE CLASSMATES
( ID NUMBER(10)
,FIRST_NAME VARCHAR2(100) ,LAST_NAME VARCHAR2(100) ,MIDDLE_NAME VARCHAR2(100) ,.... ,FULL_DESC CHAR(1) -- this is the placeholder column forindexing
,CONSTRAINT PK$CLASSMATES#ID PRIMARY KEY (ID) )
/
-- some related tables are built here... ...
- create the USER_DATASTORE preference. BEGIN
- this is the procedure that will synthesize the document for indexing
CTX_DDL.SET_ATTRIBUTE('MY_FULL_DESC_DATASTORE','PROCEDURE','SYNTHESIZE_FULL_DESC');
- 'CLOB' is default if not specified, included here for completeness
CTX_DDL.SET_ATTRIBUTE('MY_FULL_DESC_DATASTORE','OUTPUT_TYPE','CLOB');
END;
/
- this is our document synthesizer
- it MUST have two arguments: first is IN ROWID and
- second is IN OUT NOCOPY <output_type>
- names of the arguments are not important, only their types and
order
CREATE OR REPLACE PROCEDURE SYNTHESIZE_FULL_DESC(RID IN ROWID, C IN
OUT NOCOPY CLOB)
IS
- DBMS_LOB.WRITEAPPEND is pretty expensive, so we will buffer
- our output in this local variable. sBuf VARCHAR2(32767 BYTE) := NULL;
- a buffering string appender procedure, also automatically
- adds space character to every appended token to delimit them.
PROCEDURE APPND( S IN VARCHAR2, bFlush IN BOOLEAN := FALSE )
IS
BEGIN
IF S IS NOT NULL AND LENGTHB(sBuf) + LENGTHB(S || ' ') > 32000
THEN
DBMS_LOB.WRITEAPPEND(C, LENGTH(sBuf), sBuf);
sBuf := S || ' ';
ELSE
sBuf := sBuf || S || ' ';
END IF;
IF bFlush THEN
DBMS_LOB.WRITEAPPEND(C, LENGTH(sBuf), sBuf);
sBuf := NULL;
END IF;
END APPND;
BEGIN
FOR X IN (SELECT * FROM CLASSMATES WHERE ROWID=RID) LOOP
APPND(X.FIRST_NAME);
APPND(X.LAST_NAME);
APPND(X.MIDDLE_NAME);
- can be repeated several times for different detail tables
- or you can join all details here, or you can join them in the
- outer cursor. FOR Y IN (SELECT * FROM SOME_DETAIL_TABLE WHERE CLASSMATE_ID=X.ID) LOOP APPND(Y.WHATEVER); .... END LOOP; END LOOP;
- flush whatever is in the buffer to the CLOB APPND(NULL, TRUE);
/
- now we create the index on our "full description" placeholder
- and specify our custom data store the data will come from. CREATE INDEX CTX$CLASSMATES#FULL_DESC ON CLASSMATES.FULL_DESC INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS ('DATASTORE MY_FULL_DESC_DATASTORE SYNC(ON COMMIT)');
- now we can query the base table and all details will be searched
- as well thanks to our document synthesizer. SELECT ID, SCORE(0) AS SCORE FROM CLASSMATES WHERE CONTAINS(FULL_DESC, 'my query terms', 0) > 0;
One important note is due here: if any data in any detail table is changed, the base table should be updated as well, otherwise that changed detail data will not be re-indexed because the index only depends on the base table and will not "see" any changes of data it actually indexes. A simple no-work update is sufficient:
UPDATE CLASSMATES SET FULL_DESC=FULL_DESC WHERE ID=:CHANGED_DETAIL_ID; Despite the fact that this update doesn't change anything, Oracle Text will still add the ROWID of "updated" column to the list of rows pending indexing.
For more information: Oracle Text Reference for your Oracle release, Google (I believe I posted similar solution to a similar problem a few years back in this very group.)
Hth,
Vladimir M. Zakharychev
N-Networks, makers of Dynamic PSP(tm)
http://www.dynamicpsp.com
Received on Mon Feb 04 2008 - 04:10:56 CST