
RE: Does anyone have experience using inso filters for all kind of documents for Oracle text

From: Feighery Raymond <Raymond.Feighery_at_churchill.com>
Date: Wed, 19 May 2004 16:15:29 +0100
Message-ID: <817D2444710B934B9F7B8A1DAAF432D601F2AEBE@brcexm03>


It looks like what you want is the equivalent of a Unix grep. In any case, INSO filtering does work for PDF files. Here is your code combined with my example (readme.pdf is the Oracle9i Database Documentation Readme):

SQL> set serverout on echo on
SQL> drop table hdocs;

Table dropped.

SQL>
SQL> create table hdocs (
  2    id number primary key,
  3    fmt varchar2(10),
  4    text varchar2(80)
  5  );

Table created.

SQL>
SQL> insert into hdocs values(1, 'binary', '/tmp/woods.doc');

1 row created.

SQL> insert into hdocs values(2, 'binary', '/tmp/readme.pdf');

1 row created.

SQL>
SQL> create index hdocsx on hdocs(text) indextype is ctxsys.context
  2  parameters ('datastore ctxsys.file_datastore
  3  filter ctxsys.inso_filter
  4  format column fmt');

Index created.

SQL>
SQL> select     count(*)
  2  from       hdocs
  3  where      contains(text,'woods')>0
  4 /

  COUNT(*)
----------
         1

SQL>
SQL> declare
  2    resarr ctx_query.browse_tab;
  3  begin
  4    ctx_query.browse_words('HDOCSX','woods',resarr,10,CTX_QUERY.BROWSE_AROUND);
  5    for i in 1..resarr.count loop
  6      dbms_output.put_line(resarr(i).word || ':' || resarr(i).doc_count);
  7    end loop;
  8  end;
  9  /
WHOSE:1
WIND:1
WINDOWS:1
WITHOUT:2
WOODS:1
WORK:1
WRITING:1
WWW:1
YEAR:1
YOU:1

PL/SQL procedure successfully completed.

SQL> declare
  2    resarr ctx_query.browse_tab;
  3  begin
  4    ctx_query.browse_words('HDOCSX','Database',resarr,10,CTX_QUERY.BROWSE_AROUND);
  5    for i in 1..resarr.count loop
  6      dbms_output.put_line(resarr(i).word || ':' || resarr(i).doc_count);
  7    end loop;
  8  end;
  9  /
CORRESPONDING:1
CUSTOMERS:1
DARK:1
DARKEST:1
DATABASE:1
DEADLINES:1
DEEP:1
DESIGNED:1
DIFFERENCES:1
DIRECTLY:1

PL/SQL procedure successfully completed.
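
For the grep-like part of the question (seeing the text around each hit rather than a word count), one possible approach is sketched below. It has not been run against the tables above and rests on several assumptions: that the in-memory form of ctx_doc.filter with the plaintext flag is available in your release, that '2' is the primary-key value of the readme.pdf row inserted earlier, and that 40 characters of context on each side and a cap of 10 hits are sensible. The variable names, the context width and the hit cap are made up for illustration. Note that dbms_lob.instr is case-sensitive, unlike contains.

```sql
set serveroutput on size 100000
declare
  doc      clob;                        -- receives the filtered plain text
  term     varchar2(30) := 'Database';  -- hypothetical search term (case-sensitive here)
  pos      integer;
  ctx_from integer;
  hits     integer := 0;
begin
  -- Run the document behind key '2' through the index's filter and
  -- ask for plain text (TRUE) rather than HTML. Passing a NULL CLOB
  -- makes Oracle Text allocate a temporary one.
  ctx_doc.filter('HDOCSX', '2', doc, TRUE);

  pos := dbms_lob.instr(doc, term, 1, 1);
  while pos > 0 and hits < 10 loop      -- assumed cap of 10 hits
    ctx_from := greatest(pos - 40, 1);  -- assumed 40 characters of leading context
    dbms_output.put_line('...' ||
      replace(dbms_lob.substr(doc, length(term) + 80, ctx_from), chr(10), ' ') ||
      '...');
    hits := hits + 1;
    -- Continue searching just past the current hit
    pos := dbms_lob.instr(doc, term, pos + length(term), 1);
  end loop;

  dbms_lob.freetemporary(doc);          -- release the temporary CLOB
end;
/
```

Because the text is taken from the filter output rather than from the raw file or BLOB, the context lines should be readable for any format the filter handles. The row to inspect would normally come from a contains query on the same index.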

-----Original Message-----

From: Juan Cachito Reyes Pacheco [mailto:jreyes_at_dazasoftware.com]
Sent: Wednesday, May 19, 2004 3:37 PM
To: oracle-l_at_freelists.org
Subject: Re: Does anyone have experience using inso filters for all kind of documents for Oracle text

Thanks

My problem is the following.
I want to get the line in which the search text occurs. For example, searching for Oracle you would get:

Doc1 ...... Oracle Corporation is a business.....
Doc2 ...Oracle database is currently 10g.....
etc.

There is a command, but on a BLOB storing a PDF document I get unreadable text:

SQL> declare
  2    resarr ctx_query.browse_tab;
  3  begin
  4    ctx_query.browse_words('SEARCH_IDX','Database',resarr,10,CTX_QUERY.BROWSE_AROUND);
  5    for i in 1..resarr.count loop
  6      dbms_output.put_line(resarr(i).word || ':' || resarr(i).doc_count);
  7    end loop;
  8  end;
  9  /

SQL> set serveroutput on
SQL> /
9,999.99i:1
999,999,999:1
999,999,999,999:1
99999999999:1
9999999999S:1
DATABASEADMINISTRATOR:1
DATABASEADMINISTRATORiS:1

=E3:1
=E3--:1
=F0:2


If you know how to do this using Oracle Text or Oracle Ultra Search please

Juan Carlos Reyes Pacheco
OCP
Database 9.2 Standard Edition



Please see the official ORACLE-L FAQ: http://www.orafaq.com

To unsubscribe send email to: oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line.
--

Archives are at http://www.freelists.org/archives/oracle-l/ FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html



Received on Wed May 19 2004 - 10:13:09 CDT
