How to obtain key words from the text? [message #76084] Tue, 24 February 2004 04:04
Valentine
Messages: 1
Registered: February 2004
Junior Member
Can anybody tell me how to extract keywords from a document using Oracle Text? I need to build a search engine that uses existing documents to formulate new queries. MS SQL Server has a FREETEXT predicate, and I am looking for something similar in Oracle.
Re: How to obtain key words from the text? [message #76099 is a reply to message #76084] Thu, 17 June 2004 09:45
Frank Naude
Messages: 4420
Registered: April 1998
Senior Member
Hi,

Not sure how SQL Server handles it, but in Oracle Text you can use CTX_DOC.TOKENS() to extract the tokens of a single indexed document, or CTX_QUERY.BROWSE_WORDS() to browse the words stored in an Oracle Text index.

Here's a quick example:

SQL> CREATE TABLE docs (
  2          doc_id  NUMBER PRIMARY KEY,
  3          text    CLOB);
Table created.

SQL> INSERT INTO docs VALUES (1, 'Strings to be indexed');
1 row created.

SQL> INSERT INTO docs VALUES (2, 'Second document text');
1 row created.

SQL> COMMIT;
Commit complete.

SQL> CREATE INDEX docs_index ON docs(text) INDEXTYPE IS CTXSYS.CONTEXT;
Index created.

SQL> CREATE TABLE the_tokens (
  2          query_id  NUMBER,
  3          token     VARCHAR2(64),
  4          offset    NUMBER,
  5          length    NUMBER);
Table created.

SQL> EXEC CTX_DOC.TOKENS('docs_index', '1', 'the_tokens', 1);
PL/SQL procedure successfully completed.

SQL> EXEC CTX_DOC.TOKENS('docs_index', '2', 'the_tokens', 2);
PL/SQL procedure successfully completed.

SQL> COL token FORMAT A30
SQL> SELECT * FROM the_tokens;

  QUERY_ID TOKEN                              OFFSET     LENGTH
---------- ------------------------------ ---------- ----------
         1 STRINGS                                 1          7
         1 INDEXED                                15          7
         2 SECOND                                  1          6
         2 DOCUMENT                                8          8
         2 TEXT                                   17          4
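
The other route mentioned above, CTX_QUERY.BROWSE_WORDS(), works on the index as a whole rather than on one document. A minimal sketch, assuming the docs_index created above; the result table name the_words and the seed 'A' are only illustrative choices, and the argument order follows the Oracle Text reference as I remember it, so check it against your version:

```sql
-- Result table for CTX_QUERY.BROWSE_WORDS (the name the_words is hypothetical).
CREATE TABLE the_words (
  browse_id  NUMBER,
  word       VARCHAR2(64),
  doc_count  NUMBER);

-- Arguments: index name, seed word, result table, browse id,
-- number of words to return, browse direction.
-- The seed does not have to exist in the index.
BEGIN
  CTX_QUERY.BROWSE_WORDS('docs_index', 'A', 'the_words', 1, 10,
                         CTX_QUERY.BROWSE_AFTER);
END;
/

-- Each row is an indexed word with the number of documents containing it.
SELECT word, doc_count
  FROM the_words
 WHERE browse_id = 1
 ORDER BY word;
```

The doc_count column can help when picking keywords: words that occur in nearly every document are usually poor search terms.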

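To turn the extracted tokens into a new query, one possibility is to join them with the ACCUM operator and pass the result to CONTAINS. This is only a sketch under a few assumptions: LISTAGG requires Oracle 11g Release 2 or later, the variable name l_query is made up for the example, and each token is wrapped in braces so that reserved words such as AND or OR are treated as plain text.

```sql
SET SERVEROUTPUT ON

DECLARE
  l_query VARCHAR2(4000);   -- hypothetical variable holding the generated query
BEGIN
  -- Build e.g. '{INDEXED} ACCUM {STRINGS}' from the tokens of document 1.
  SELECT LISTAGG('{' || token || '}', ' ACCUM ') WITHIN GROUP (ORDER BY token)
    INTO l_query
    FROM (SELECT DISTINCT token FROM the_tokens WHERE query_id = 1);

  -- CONTAINS rejects an empty query, so skip the search if no tokens were found.
  IF l_query IS NOT NULL THEN
    -- Find other documents sharing at least one token, best match first.
    FOR r IN (SELECT doc_id, SCORE(1) AS relevance
                FROM docs
               WHERE CONTAINS(text, l_query, 1) > 0
                 AND doc_id <> 1
               ORDER BY SCORE(1) DESC)
    LOOP
      DBMS_OUTPUT.PUT_LINE('doc_id=' || r.doc_id || ' score=' || r.relevance);
    END LOOP;
  END IF;
END;
/
```

With the two sample rows above this prints nothing, since the documents share no tokens. For long documents the generated string can exceed 4000 characters, so in practice you would probably keep only a limited number of tokens.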

Best regards.

Frank