Re: How to use oracle text to search multiple columns across multiple tables join

From: Vladimir M. Zakharychev <vladimir.zakharychev_at_gmail.com>
Date: 5 May 2006 23:58:11 -0700
Message-ID: <1146898691.400618.273560_at_j33g2000cwa.googlegroups.com>


>How can I use ORACLE TEXT to perform full text search on multiple
>columns across multiple tables join? For example consdier the following
>3 tables
>
>Table Book {
> bookid number(8),
> title varchar2(2000),
> description varchar2(4000)
>}
>
>Table Author {
> authorid number(8),
> authorDescription varchar2(4000),
> age number(3)
>}
>
>Table AuthorToBook {
> bookid number(8),
> authorid number(8)
>}
>
>I will join the above 3 tables and then perform search with the
>following constraint
>
>1. The word "network" must appear in Author.authorDescription and
>(Book.description or Book.Title)
>2. author's age must be > 20
>
>
>One way to achieve this is create a CTXSYS.CONTEXT type index for each
>of the varchar2 columns in the 3 table and then use the following query
>
>
>select Book.*, Author.* from Book, AuthorToBook, Author
>where Book.bookid = AuthorToBook.bookid and AuthorToBook.authorid =
>Author.id
>and Author.age > 20
>and contains(Author.authorDescription, 'network', 1) > 0
>and (contains(Book.title, 'network', 1) > 0 or
>contains(Book.description, 'network', 1) > 0)
>
>
>This query is very inefficient. Is there anyway to create one index
>across all the varchar2 columns to achive the same purpose?
>

One way to achieve this could be DETAIL_DATASTORE, where indexed text is in detail table while the index is created on master table. Unfortunately, in your case there is more than one detail table, the case not covered by DETAIL_DATASTORE.

However, not all is lost: you can use USER_DATASTORE for the index. You need to create your own text feeding procedure that will feed the Text indexing engine with text to be indexed. Assuming you're on 10g, you could do it like this:

create or replace procedure my_text_feeder ( r in rowid
 ,c in out nocopy clob)
-- names of the arguments can be any,
-- but types MUST be ROWID, CLOB
as
begin

   for x in (select B.title, B.description, A.authordescription

               from Book B, AuthorToBook AB, Author A
              where AB.rowid = r
                and B.bookid = AB.bookid
                and A.authorid = AB.authorid)
   loop
     -- we will generate XML document from our data to
     -- take advantage of automatic sectioning
     dbms_lob.writeappend(c, 19, '<root><book><title>');
     dbms_lob.writeappend(c, length(x.title), x.title);
     dbms_lob.writeappend(c, 21, '</title><description>');
     dbms_lob.writeappend(c, length(x.description), x.description);
     dbms_lob.writeappend(c, 40,
'</description></book><authordescription>');
     dbms_lob.writeappend(c, length(x.authordescription),
x.authordescription);
     dbms_lob.writeappend(c, 27, '</authordescription></root>');
   end loop;
end;
/

Then, you create a preference for indexing:

begin
  ctx_ddl.create_preference('bookinfodatastore', 'user_datastore');   ctx_ddl.set_attribute('bookinfodatastore', 'procedure', 'my_text_feeder');
  ctx_ddl.set_attribute('bookinfodatastore', 'output_type', 'CLOB'); end;
/

Then you add a fake text column to the AuthorToBook table that the index will be built on (Text indexes can only be created on text data types, you can't create Text index on a number column even with user datastore):

alter table AuthorToBook add (text varchar2(1)) /

And then you create the index:

create index ctx_books on AuthorToBook(text) indextype is ctxsys.context
parameters ('datastore bookinfodatastore section group ctxsys.auto_section_group')
/

You can then query the index like this:

select Book.*, Author.* from Book, AuthorToBook, Author where Book.bookid = AuthorToBook.bookid and AuthorToBook.authorid = Author.authorid and Author.age > 20
and contains(authortobook.text,
'network within authordescription and (network within title or network within description)') > 0

Note that there's single CONTAINS operator in the query and the text query takes advantage of automatic sectioning and uses WITHIN keyword to restrict term search to particular section corresponding to source column (that's why we created XML in feeder procedure.)

If you're on 9i or before, things are a bit more complex as the feeder procedure must be owned by CTXSYS, but needs to be able to read data it aggregates and index owner should be able to execute it, so a few extra grants are due (and the procedure must use fully qualified table names.) This restriction had been lifted in 10g.

Hth,

    Vladimir M. Zakharychev
    N-Networks, makers of Dynamic PSP(tm)     http://www.dynamicpsp.com Received on Sat May 06 2006 - 08:58:11 CEST

Original text of this message