Re: Oracle Text for searchengine across multiple columns/tables with different datastore

From: Frank van Bortel <fvanbortel_at_netscape.net>
Date: Fri, 04 Mar 2005 10:49:45 +0100
Message-ID: <d09ara$46i$1_at_news4.zwoll1.ov.home.nl>


Nikola Pecigos wrote:
> Hi,
>
> I have the following problem:
>
> We have an Oracle 9.2 with one table "document" which contains a path
> to the filesystem. If I want to index these files (HTML, PDF, World,
> Excel, etc.), I have to use the datastore type "FILE_DATASTORE".
>
> Another table "lng_text" stores the titles and descriptions for
> multiple languages for each row in table "document".
>
> My goal is to build an index, which I can query with AND or OR
> Operator across all three columns in both tables.
>
> I read a lot about USER_DATASTORE and the posibillity to merge
> different columns with a stored procedure in one index. But how is
> this possible in this case, since one column contains a path to
> filesystem and two other columns are VARCHAR(500)?
>
>
> Thanks in advance!
>
> Greetz,
> Nikola Pecigos
>
> PS: As far as I realize it, building two or three indexes does not
> solve the problem. If I had more than one searchword like "this &
> that" and query multiple indexes,
>
> WHERE CONTAINS(t1.column, 'this & that', 10) > 1 OR
> CONTAINS(t2.column, 'this & that', 20) > 1
>
>
> I would request boths searchwords to appear together in ONE index or
> the other. But it has to be possible, that "this" is found in t1 and
> "that" in t2.

No - you would use 'this AND that' or 'this OR that' ('this | that')

Frankly, I don't see your problem.
If you build a contatenated index, as you propose, you still don't know where the word came from - the title, the description, or the document itself. All you know is the index found a (one or more) hit.

Anyway - you can use a user_datastore to concatenate your columns-to-be-indexed into a CLOB, and query it. This can be done over multiple tables, and probably over external files as well.
Performance wise, I would opt for storage *within* the database, though. The indexing process will need to read them anyway. ANd your backups will be consistent!

I'd recommend reading chapters 2 and 3 of http://otn.oracle.com/pls/db92/db92.to_pdf?pathname=text.920%2Fa96518.pdf&remark=docindex

-- 
Regards,
Frank van Bortel
Received on Fri Mar 04 2005 - 10:49:45 CET

Original text of this message