Oracle, Intermedia, URL DATASTORE, INSO FILTER
Hi,
I am using Oracle 8i on Windows NT. I am trying to carry out a text search
on the contents of a URL saved as a column in a table.
I created a table named test with two columns,
id number primary key,
url varchar2(80)
and created an index as follows:
create index test_index on test(url) indextype is ctxsys.context parameters('datastore ctxsys.URL_DATASTORE filter ctxsys.INSO_FILTER');
I added data to this table and rebuilt the index using the following statement:
ALTER INDEX test_index REBUILD ONLINE PARAMETERS('SYNC');
1) If the URL was a local intranet URL, e.g.
http://my_local_intranet_host/index.html,
no errors were recorded in the CTX_USER_INDEX_ERRORS view.
2) However, if I used an external URL such as
http://www.oracle.com/index.html, the CTX_USER_INDEX_ERRORS view contained the
following error:
DRG-11612: URL store: unknown host specified in
http://www.oracle.com/index.html
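The errors recorded for the index can be listed with a query along these lines. This is only a sketch: it assumes the index name TEST_INDEX used in this post and the standard columns of the CTX_USER_INDEX_ERRORS view.

```sql
-- Sketch: list indexing errors for the index, newest first.
-- ERR_TEXTKEY holds the rowid of the row whose URL could not be fetched.
SELECT err_timestamp,
       err_textkey,
       err_text
  FROM ctx_user_index_errors
 WHERE err_index_name = 'TEST_INDEX'   -- index names are stored in upper case
 ORDER BY err_timestamp DESC;
```

Joining ERR_TEXTKEY back to the rowid of the test table shows which URL each error belongs to.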
When I run a query such as
SELECT id, url FROM test WHERE CONTAINS(url, 'Oracle') > 0;
I get rows back if the URL is a local one. If the URL is an external one such
as http://www.oracle.com/index.html, no rows are returned, because the
document was never indexed. I suspect that either I am not using the right
syntax for an external web site, or something is not set correctly in my
INSO_FILTER/URL_DATASTORE settings. Can anyone help me with this?
Also, our network uses a proxy server to connect to the World Wide Web. The
URL_DATASTORE has a number of attributes for specifying the host name of the
HTTP proxy server and so on. Do I have to change any of these attributes, and
if so, how?
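For illustration, the proxy attributes would presumably be set on a custom datastore preference rather than on the default ctxsys.URL_DATASTORE. The following is a minimal sketch, not a verified fix: the preference name test_url_pref, the proxy host and port, and the no-proxy domain are all hypothetical placeholders, and the column name url is assumed from the query above. Running it requires execute permission on CTX_DDL.

```sql
-- Sketch only: every name and value below is a placeholder.
BEGIN
  -- Create a custom preference based on the URL_DATASTORE object.
  ctx_ddl.create_preference('test_url_pref', 'URL_DATASTORE');
  -- Host (optionally host:port) of the HTTP proxy used for external URLs.
  ctx_ddl.set_attribute('test_url_pref', 'HTTP_PROXY', 'proxy.example.com:8080');
  -- Domain(s) that should be fetched directly, bypassing the proxy.
  ctx_ddl.set_attribute('test_url_pref', 'NO_PROXY', 'intranet.example.com');
END;
/

-- Drop the existing index first, then recreate it against the new preference.
CREATE INDEX test_index ON test(url)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('datastore test_url_pref filter ctxsys.inso_filter');
```

If the DRG-11612 error is caused by the database server being unable to resolve external host names directly, routing requests through the proxy in this way is the setting most likely to matter.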
Also, I tried to test the INSO filter as described in a Technet document,
to see whether an HTML file gets created from a .doc file:
$ORACLE_HOME/ctx/bin/ctxhx testfile.doc testfile.txt
However, I did not get any errors, nor was testfile.txt created.
Thanks in advance.
Received on Wed Feb 09 2000 - 11:30:37 CST