Re: Index probleme in oracle text
Date: Wed, 22 Oct 2003 18:04:37 -0400
Message-ID: <vpdvv5q3m0d5e_at_corp.supernews.com>
This is how I did it in an 8i database recently:
Installing Intermedia for Version 8173
Steps
1. Check Your OS Directory Structure
2. Set Your Environment Variables
2a. Check Your Init.ora Parameters
3. Install InterMedia
4. Make Sure extproc is Working
5. Look for CTXSYS Errors
6. Do all Necessary Grants
7. Create the Indexes Alicia Needs in prdp1
Tools
Scripts to check Intermedia installed properly
Dropping InterMedia
Troubleshooting Index Creation
Troubleshooting Extproc or Connection Problems
A) Modify the NET Configuration if Necessary
B) Test the NET Configuration
C) Check on any Problem with Intermedia CLOB Size Limit
Troubleshooting DRG-11207 Problems
Introduction
Common Errors
Environmental Settings
Supported Document Attribute Checklist
Further References
Tom Kyte Notes
Steps
1. Check Your OS Directory Structure
Make sure the OS directory for Bfile storage (PROT_DOC_DIR) has all the
correct OS privileges. That is, be sure users can write to it.
Connect as sys and run a statement like this to create your directory.
create directory prot_doc_dir as '/app/esprit/production' ;
or
create directory test_dir as '/ora_sw/admin/SID' ;
If you want to see whether a directory exists you can run this:
select * from dba_directories;
Make sure the files are placed in PROT_DOC_DIR '/app/esprit/production'.
The INSO filter files are normally in /ora_sw/8.1.7/ctx/lib/ and have a
suffix of .flt. The OS user (Oracle) should be able to execute these files.
INSO also needs to write files in a subdirectory of the directory named by
the HOME environment variable (normally /ora_sw/). This subdirectory should
be named .oit and be writable by user Oracle, so the full path to the
directory looks like
/ora_sw/.oit/
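The permission checks in this step can be scripted. What follows is only a minimal shell sketch and not part of the original procedure: the function names check_writable_dir and list_nonexec_filters are made up for illustration, and the paths in the usage comments are simply the ones quoted above. Run it as the OS user that owns the Oracle software.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not Oracle-supplied).

# Succeeds only if the argument is an existing directory that the
# current OS user can write to.
check_writable_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Prints every *.flt file in the given directory that the current
# OS user can NOT execute. Prints nothing when all filters are fine.
list_nonexec_filters() {
    for f in "$1"/*.flt; do
        [ -e "$f" ] || continue     # the glob matched nothing
        [ -x "$f" ] || echo "$f"
    done
}

# Example usage (paths taken from the text above):
#   check_writable_dir /app/esprit/production || echo "PROT_DOC_DIR not writable"
#   check_writable_dir /ora_sw/.oit           || echo ".oit missing or not writable"
#   list_nonexec_filters /ora_sw/8.1.7/ctx/lib
```

Both helpers only report; they deliberately do not chmod anything, so they are safe to run on a production box.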
2. Set Your Environment Variables
We sent these to Ted Holdgate because when we made the changes ourselves
they did not persist. Even when sending them to Ted, it is important to let
him know that they must persist after entering or changing a SID. Basically,
Ted adds lines to oraenv in the /ora_sw/lbin folder to include the ctx
information.
- PATH (to additionally include $ORACLE_HOME/ctx/bin), i.e.
PATH=/ora_sw/8.1.7/bin:/ora_sw/8.1.7/ctx/lib:/ora_sw/8.1.7/ctx/bin:/usr/local/bin:/opt/bin:/bin:/usr/bin:/usr/sbin:/usr/ccs/bin:.:/e
- LD_LIBRARY_PATH (to additionally include $ORACLE_HOME/ctx/lib and
$ORACLE_HOME/lib), i.e.
LD_LIBRARY_PATH=/ora_sw/8.1.7/lib:/ora_sw/8.1.7/ctx/lib:/ora_sw/8.1.7/ctx/bin:.
Other values that influence the indexing process are:
- TMPDIR (on Unix). The temporary directory holds the temporary files
written while indexing binary documents such as Excel spreadsheets. Point
this variable to a directory with enough space to hold those files, i.e.
TMPDIR=/var/tmp
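Because the whole point of this step is that the ctx entries survive a SID change, it helps to have a quick check you can run right after oraenv. The sketch below is an illustration only; path_contains and report_ctx_env are hypothetical names, not part of oraenv or of any Oracle script.

```shell
#!/bin/sh
# path_contains LIST DIR
# Succeeds if DIR is one of the colon-separated entries of LIST.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# report_ctx_env ORACLE_HOME PATH_VALUE LD_LIBRARY_PATH_VALUE
# Prints one line per missing ctx entry; prints nothing if all are there.
report_ctx_env() {
    path_contains "$2" "$1/ctx/bin" || echo "PATH is missing $1/ctx/bin"
    path_contains "$3" "$1/ctx/lib" || echo "LD_LIBRARY_PATH is missing $1/ctx/lib"
    path_contains "$3" "$1/lib"     || echo "LD_LIBRARY_PATH is missing $1/lib"
    return 0
}

# Example usage, after switching SIDs with oraenv:
#   report_ctx_env "$ORACLE_HOME" "$PATH" "$LD_LIBRARY_PATH"
```

The match is on whole path entries, so /ora_sw/8.1.7/ctx/binx would not be mistaken for /ora_sw/8.1.7/ctx/bin.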
2a. Check Your Init.ora Parameters
Remove the TEXT_ENABLE init.ora parameter or set it to FALSE. This parameter
is no longer used in Oracle 8i, and will actually prevent 8i from operating
correctly:
text_enable = false
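To see what a given parameter file currently says, something along these lines could be used. It is a rough sketch: text_enable_setting is a made-up helper name, and it assumes a plain text init.ora with one "name = value" pair per line and '#' comments. The location of the file (often $ORACLE_HOME/dbs/init<SID>.ora on Unix) is also an assumption about your installation.

```shell
#!/bin/sh
# text_enable_setting FILE
# Prints the value assigned to text_enable (lower-cased) in an
# init.ora-style file. Lines starting with # are ignored. Prints
# nothing if the parameter is not set in the file.
text_enable_setting() {
    awk -F= '
        /^[ \t]*#/ { next }
        tolower($1) ~ /^[ \t]*text_enable[ \t]*$/ {
            value = $2
            sub(/#.*/, "", value)      # drop a trailing comment
            gsub(/[ \t]/, "", value)   # drop blanks around the value
            print tolower(value)
        }' "$1"
}

# Example usage (the file name is an assumption about your layout):
#   text_enable_setting "$ORACLE_HOME/dbs/init$ORACLE_SID.ora"
```

Empty output or "false" is what you want to see; "true" means the parameter still needs to be removed or changed.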
3. Install InterMedia
In SQL*Plus connected as internal issue these commands:
connect internal
set feedback on
set serveroutput on
@/ora_sw/8.1.7/ctx/admin/dr0csys.sql *** tools temp01
where
*** is the ctxsys password (use the same password as for user system)
tools is the default tablespace for ctxsys
temp01 is the temporary tablespace for ctxsys
Note: The above module creates the user CTXSYS and grants full privileges
to CTXSYS in order to create and insert into result tables, execute
callbacks, rewrite queries, and perform system cleanup.
Next issue these grants:
connect internal
grant execute on sys.dbms_lock to ctxsys;
grant execute on sys.dbms_pipe to ctxsys;
Now connect as ctxsys in SQL*Plus while still in /ora_sw/8.1.7/ctx/admin/
connect ctxsys
@dr0inst /ora_sw/8.1.7/ctx/lib/libctxx8.so
Note: The above module installs all Oracle database objects required by the
text system. This includes:
- Data dictionary tables, views, sequence, packages
- Server management tables, views and packages
- Dispatcher packages
- Service queue objects
Also you can check at the end of the install that the library is correctly
installed by connecting as ctxsys and selecting from user_libraries:
connect ctxsys
select library_name, file_spec, dynamic, status from user_libraries;
LIBRARY_NAME FILE_SPEC                                D STATUS
------------ ---------------------------------------- - -------
DR$LIBX      /ora_sw/8.1.7/ctx/lib/libctxx8.so        Y VALID
The next step is to install appropriate language-specific default
preferences. When you use CREATE INDEX to create an index or ALTER INDEX to
manage an index, you can optionally specify indexing preferences in the
parameter string. There are seven preference classes:
- Lexer, defines the language being indexed. (<- LANG)
- Wordlist, defines the expansion of stem and fuzzy queries. (<- LANG)
- Stoplist, defines words and themes that are not to be indexed. (<- LANG)
- Datastore, defines document storage.
- Filter, defines how documents are converted to plain text.
- Storage, defines the storage of the index tables.
- Section group, makes it possible to define document sections.
An installation with the Database Configuration Assistant installs the US
default language preferences. To install the US default preferences
manually, log into SQL*Plus as CTXSYS and run drdefus.sql:
connect ctxsys
@/ora_sw/8.1.7/ctx/admin/defaults/drdefus.sql
4. Make Sure extproc is Working
InterMedia Text is based on routines called by external procedure calls.
For example, if you are indexing a Word document, you will have an external
procedure call to the INSO filter for the actual indexing process. To see
whether extproc has been set up correctly, you can run the following:
set serveroutput on
set feedback on
connect ctxsys
execute ctx_output.start_log('logfile.txt');
commit;
If you get "PL/SQL procedure successfully completed", you're all set. If
not you'll have to work on your extproc connection.
5. Look for CTXSYS Errors
In SQL*Plus
connect ctxsys
select * from CTXSYS.CTX_INDEX_ERRORS
where err_timestamp > to_date('30-MAR-2003','DD-MON-YYYY')
;
--
select * from CTXSYS.DR$INDEX_ERROR
where err_timestamp > to_date('30-MAR-2003','DD-MON-YYYY')
;
The first is a view and the second is a table. Once you have reviewed the
errors, you can clear the table:
delete from CTXSYS.DR$INDEX_ERROR ;
6. Do all Necessary Grants
Now if you are going to be using InterMedia on LOBs you may have to look up
and do additional grants such as:
connect internal
grant read on directory prot_doc_dir to ctxsys;
grant read on directory prot_doc_dir to DOC;
connect ctxsys
grant ctxapp to DOC;
Here prot_doc_dir is the directory for your Bfiles, and the ctxapp role
must be granted to the users who own the LOBs.
In the Tools section of this document are scripts you can use to see whether
InterMedia installed correctly. Note that in versions of Oracle earlier
than 8.1.7 you had to have a shell script to start up the context server
processes. From 8.1.7 onwards this is no longer necessary.
7. Create the Indexes A Needs in prdp1
Create a new index as below by connecting as the doc user in SQL*plus
connect doc
begin
ctx_ddl.create_preference('DOCS_INSO_FILTER', 'INSO_FILTER');
end;
/
commit;
--
create index doc_document_idxtest1 on doc_document(document_object)
indextype is ctxsys.context
parameters ('datastore DirectPref filter DOCS_INSO_FILTER' );
--
alter index doc_document_idxtest1 rebuild parameters('SYNC');
--
create index doc_document_idxtest2 on doc_document(document_text)
indextype is ctxsys.context
parameters ('datastore DirectPref filter DOCS_INSO_FILTER' );
--
alter index doc_document_idxtest2 rebuild parameters('SYNC');
To remove the test indexes later, for example before re-creating them:
drop index doc.doc_document_idxtest1 ;
drop index doc.doc_document_idxtest2 ;
Using step 5 above, check for errors in the error tables. Then, while
connected in SQL*Plus as doc, check the token table to confirm that the
index was populated:
connect doc
select token_text from DOC.DR$DOC_DOCUMENT_IDXTEST1$I ;
Finally run a test query:
select document_id, score(1)
from doc.doc_document
where contains(document_object, 'Inhibition', 1) > 0
order by score(1) desc ;
Tools
Scripts to check Intermedia installed properly
Use SQL*Plus
connect messnerv@xxxxx.gro.pas.com
create table bird (bird_id number, species varchar2(20));
insert into bird values (1, 'robin');
insert into bird values (2, 'black crow');
insert into bird values (3, 'green crow');
insert into bird values (4, 'green robin');
commit;
alter table bird add constraint bird_pk primary key (bird_id);
create index bird_I1 on bird(species) indextype is ctxsys.context;
The fact that the indextype is context will show up in the ityp_name column
of dba_indexes. Unlike regular indexes, these context indexes are not
updated automatically. After changes in the data, you need to run
alter index bird_I1 rebuild parameters('SYNC');
select bird_id from bird where contains (species, 'crow') > 0;
BIRD_ID
----------
2
3
Using ConText: the > 0 does not mean 'greater than 0 occurrences'; it tests
a 'score' that Oracle keeps. For a simple search there is not much of a
score to keep, but when searching for 'crow' fairly near 'black' a score
makes sense. You can see the score as follows:
select bird_id, species, SCORE(10) from bird
where contains(species, 'crow', 10) > 0;
BIRD_ID SPECIES SCORE(10)
---------- -------------------- ----------
2 black crow 4
3 green crow 4
You can do Boolean searches:
select bird_id, species, SCORE(10) from bird
where contains (species, 'crow AND black', 10) > 0;
BIRD_ID SPECIES SCORE(10)
---------- -------------------- ----------
2 black crow 4
If you want the word 'AND' to be part of the search phrase, you must
surround it with braces:
select bird_id, species, SCORE(10) from bird
where contains (species, 'crow {AND} black', 10) > 0;
no rows selected
You can also use the & for the Boolean AND.
For OR, use OR or |:
select bird_id, species, SCORE(10) from bird
where contains (species, 'crow | robin', 10) > 0;
BIRD_ID SPECIES SCORE(10)
---------- -------------------- ----------
1 robin 4
2 black crow 4
3 green crow 4
4 green robin 4
To find species containing crow but not black:
select bird_id, species, SCORE(10) from bird
where contains (species, 'crow MINUS black', 10) > 0;
BIRD_ID SPECIES SCORE(10)
---------- -------------------- ----------
3 green crow 4
You can use - instead of MINUS. If you want to weight the scores, you can
multiply by a factor in the range .1 to 10:
select bird_id, species, SCORE(10) from bird
where contains (species, 'crow*7 - green', 10) > 0;
BIRD_ID SPECIES SCORE(10)
---------- -------------------- ----------
2 black crow 28
3 green crow 24
Suppose you want a term which occurs fairly near another term:
select bird_id, species, SCORE(10) from bird
where contains (species, 'crow NEAR green', 10) > 0;
BIRD_ID SPECIES SCORE(10)
---------- -------------------- ----------
3 green crow 14
You can use ; instead of NEAR. ConText finds whole words, not parts of
words:
select bird_id, species, SCORE(10) from bird
where contains (species, 'obi', 10) > 0;
no rows selected
However, you can use the same wildcards you use with LIKE:
select bird_id, species, SCORE(10) from bird
where contains (species, '%obin', 10) > 0;
BIRD_ID SPECIES SCORE(10)
---------- -------------------- ----------
1 robin 4
4 green robin 4
To find words that share a stem with a given word (for example, tense
variants of a verb), use the dollar sign:
select bird_id, species, SCORE(10) from bird
where contains (species, '$crowed', 10) > 0;
BIRD_ID SPECIES SCORE(10)
---------- -------------------- ----------
2 black crow 4
3 green crow 4
To find close matches, use ?:
select bird_id, species, SCORE(10) from bird
where contains (species, '?craw', 10) > 0;
BIRD_ID SPECIES SCORE(10)
---------- -------------------- ----------
2 black crow 4
3 green crow 4
For the SOUNDEX option, use !:
select bird_id, species, SCORE(10) from bird
where contains (species, '!croh', 10) > 0;
BIRD_ID SPECIES SCORE(10)
---------- -------------------- ----------
2 black crow 4
3 green crow 4
You can combine search methods:
select bird_id, species, SCORE(10) from bird
where contains (species, '$?crowed AND green', 10) > 0;
BIRD_ID SPECIES SCORE(10)
---------- -------------------- ----------
3 green crow 4
You can also use parentheses to group the binary operators.
Dropping InterMedia
In SQL*Plus connected as internal issue this command:
connect internal
@/ora_sw/8.1.7/ctx/admin/dr0dsys.sql
Troubleshooting Index Creation
Here is something to check if you have problems with Intermedia index
creation.
Before trying the create index command again do these things:
connect ctxsys
begin
CTXSYS.CTX_OUTPUT.START_LOG('idxtest.log');
end;
/
This will create a log file so that you can see the progress of text index
creation. The CTX_OUTPUT.START_LOG('<filename>') procedure begins logging
to the file specified in '<filename>' and CTX_OUTPUT.END_LOG stops logging.
By default, CTX_OUTPUT.START_LOG creates the specified log file in the
$ORACLE_HOME/ctx/log directory.
So verify $ORACLE_HOME/ctx/log directory is available to you.
After text index creation is completed, the CTX_OUTPUT.END_LOG command
should be issued by the user who began index logging.
connect ctxsys
begin
CTXSYS.CTX_OUTPUT.END_LOG;
end;
/
Then you can review the log to help pinpoint the problem area.
If you are creating indexes involving bfiles, then it is vital that the
environment variables PATH and LD_LIBRARY_PATH contain the ctx information.
You should be able to change SIDs or re-enter the same SID and see the ctx
information persist. If it does not, you need to work on
/ora_sw/lbin/oraenv until the information does persist.
Troubleshooting Extproc or Connection Problems
A) Modify the NET Configuration if Necessary
1. Listener.ora.
a) Configure an IPC listener address. For instance, change:
LISTENER =
  (ADDRESS_LIST=
    (ADDRESS=
      (PROTOCOL=tcp)
      (HOST=<hostname or IP address>)
      (PORT=1521)))
to:
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS =
          (PROTOCOL = IPC)
          (KEY = EXTPROC0)
        )
        (ADDRESS =
          (PROTOCOL = TCP)
          (HOST = <hostname or IP address>)
          (PORT = 1521)
        )
      )
    )
  )
b) Add a system identifier (SID) name of PLSExtProc and a program name of
EXTPROC in the server's LISTENER.ORA file. For instance, in the
SID_LIST_LISTENER definition, insert:
  (SID_DESC =
    (SID_NAME=PLSExtProc)
    (PROGRAM=extproc)
  )
- PLSExtProc matches the CONNECT_DATA SID for extproc_connection_data in
the tnsnames.ora.
- The PROGRAM section tells the Net8 listener to start the external
procedure process.
- The ENVS section may be set here on UNIX to ensure that the environment
includes ?/ctx/lib in LD_LIBRARY_PATH. This is needed so that indexing can
use the INSO filters. You add this as follows:
(ENVS = 'LD_LIBRARY_PATH=<ORACLE_HOME>/ctx/lib')
- On NT, you may need to have ORACLE_HOME set in this section as well:
(ORACLE_HOME=<ORACLE_HOME>)
2. Tnsnames.ora.
Add a net service name description entry for EXTPROC0 in the server's
tnsnames.ora file, using SID rather than SERVICE_NAME in the CONNECT_DATA
section. For example, add this to the end of tnsnames.ora:
extproc_connection_data =
  (DESCRIPTION=
    (ADDRESS_LIST =
      (ADDRESS=
        (PROTOCOL=IPC)
        (KEY=EXTPROC0)
      )
    )
    (CONNECT_DATA=
      (SID=PLSExtProc)
    )
  )
- PLSExtProc could really be named anything, as long as it matches the
SID_NAME in listener.ora.
- The connect string 'extproc_connection_data' should not be changed (not
even to upper case).
3. Sqlnet.ora.
Add or check the NAMES.DEFAULT_DOMAIN entry in the sqlnet.ora file on your
server. The extproc_connection_data entry in tnsnames.ora must carry that
domain. For example, if 'world' is your default domain,
extproc_connection_data.world would be correct, i.e. with
NAMES.DEFAULT_DOMAIN = gro.pal.com
the entry is named
extproc_connection_data.gro.pal.com
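The domain rule above is easy to get wrong by hand, so here is a small sketch that derives the expected alias name from a sqlnet.ora file. The helper names default_domain and expected_extproc_alias are invented for this example, and the parsing assumes the simple one-line "NAMES.DEFAULT_DOMAIN = value" form shown above.

```shell
#!/bin/sh
# default_domain FILE
# Prints the value of NAMES.DEFAULT_DOMAIN from a sqlnet.ora-style
# file, or nothing if the parameter is not set.
default_domain() {
    awk -F= '
        /^[ \t]*#/ { next }
        toupper($1) ~ /^[ \t]*NAMES\.DEFAULT_DOMAIN[ \t]*$/ {
            value = $2
            gsub(/[ \t]/, "", value)
            print value
            exit
        }' "$1"
}

# expected_extproc_alias FILE
# Prints the alias name that tnsnames.ora should contain for extproc,
# given the default domain found in FILE.
expected_extproc_alias() {
    domain=$(default_domain "$1")
    if [ -n "$domain" ]; then
        echo "extproc_connection_data.$domain"
    else
        echo "extproc_connection_data"
    fi
}

# Example usage (file location is an assumption about your layout):
#   expected_extproc_alias "$ORACLE_HOME/network/admin/sqlnet.ora"
```

You can then look for the printed name in tnsnames.ora, or pass it to tnsping.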
B) Test the NET Configuration
Since the extproc_connection_data ADDRESS section specifies ipc, make sure
that the ADDRESS_LIST of listener.ora accepts ipc connections. One way to do
this is to try to create a text index.
A quicker way to test the Net8 configuration is to do:
connect ctxsys
exec ctx_output.start_log('log');
from SQL*Plus. If you get the error 'DRG-50704: Net8 listener is not
running or cannot start external procedures', then things are not set up
correctly. Recommended actions:
Go through the configuration.
1. Ensure that the listener is started with the new IPC settings:
type: lsnrctl status
The important service summary item to check here is the service handler for
PLSExtProc. This should return:
PLSExtProc has 1 service handler(s)
2. Test the service handler for PLSExtProc with tnsping and look for a
successful response:
tnsping extproc_connection_data
3. Check your domain name configuration
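The listener check in item 1 can be wrapped in a one-line test. This is a sketch under one assumption: that your lsnrctl status output contains a line worded like the one quoted above ("PLSExtProc has 1 service handler(s)"); other releases may word it differently. The function name has_extproc_handler is made up.

```shell
#!/bin/sh
# has_extproc_handler
# Reads "lsnrctl status" output on stdin and succeeds if it reports
# at least one service handler for PLSExtProc.
has_extproc_handler() {
    grep -q 'PLSExtProc has [1-9][0-9]* service handler'
}

# Example usage on the database server:
#   lsnrctl status | has_extproc_handler || echo "no PLSExtProc handler"
#   tnsping extproc_connection_data
```

If the SID_NAME in your listener.ora is something other than PLSExtProc, change the pattern to match.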
C) Check on any Problem with Intermedia CLOB Size Limit
Question: I have this weird problem with Intermedia that is now becoming
critical for my application. My database is in UTF-8, and I'm storing XML
documents in a CLOB column. I have followed the instructions to get a lexer
and a section group that will work with XML, and the index is created with
no error, and no error appears in the ctx log, if enabled. The problem is
that everything works well for small documents (less than 2300 characters
or so). If a document is bigger than that, nothing gets indexed: the
DR$MYINDEX$I table is empty. I tried to play around with storage and lob
options for both the index table and the $I table, to no avail. Is it a
memory issue? What settings do I need to change? I'm trying to index, at
the moment, three documents of around 3K in size, and a quick look at
v$sgastat tells me I have 15 Megs of free memory.
Answer: You are running into bug 1619321 (base bug 1555818), fixed in
Oracle9. There is a patch available for 8.1.7, but you will need to request
the one-off patch through an iTAR or by calling your local support center.
Troubleshooting DRG-11207 Problems
Introduction
Most of the errors encountered during creation of Oracle Text indexes are
likely to be caused when dealing with formatted documents. These errors are
logged in the view CTX_USER_INDEX_ERRORS and can be queried from the schema
where the create index statement was executed, or table CTX_INDEX_ERRORS
queried from the CTXSYS user.
One of the errors commonly reported in these views is the DRG-11207 "exited
with status X" error. Unfortunately, this error is often not useful in
diagnosing indexing issues.
The INSO_FILTER issues are difficult to diagnose because Oracle uses a
third-party application for filtering. As a result, it is often difficult to
identify the real source of the problem. With regard to the INSO_FILTER, the
error code implies that it is unable to index a formatted document. It is
important to mention that these errors can be operating system specific, so
what follows is intended to provide some hints as a starting point for
analysis. It's possible that other cases may be reported; if so, be sure to
log the error with Oracle Support.
(For Oracle 10i, meaningful error messages have been added for DRG-11207.
This has been documented in [BUG:2473885].)
Common Errors
DRG-11207: user filter command exited with status 1
Status 1 means "Could not filter the document". It is a generic error and
indicates that INSO_FILTER failed on a given document. This can happen due
to many documented reasons: an invalid environmental setting; document is
corrupted, encrypted, password protected; document version not supported
(incompatibility) or due to a bug in INSO_FILTER.
DRG-11207: user filter command exited with status 2
Status 2 means "The INSO_FILTER has timed-out". The default value of the
timeout value for the INSO_FILTER is 120 (seconds). The default value for
the timeout_type is heuristic, which implies that if the timeout value is
reached and the INSO_FILTER has not started to write output, the indexing
operation terminates for the document row and Oracle moves to the next
document row to be indexed.
Beginning in 9.2.0.1 it is possible to change the timeout_type attribute to
fixed, which terminates filtering after TIMEOUT seconds regardless of
whether filtering is progressing normally or hanging. The default timeout
value is generally enough for filtering most documents. If the timeout value
is not large enough, filtering may time out before it completes. PDF and
Microsoft Excel files are more prone to this timeout, as the INSO_FILTER
generally takes more time to process these types of files. If this is the
case, you can create a preference with a larger timeout value in versions
8.1.7.1b and above. Note, however, that the timeout attribute cannot be
changed dynamically. For the new timeout value to take effect, it is
necessary to re-create the index with the new timeout attribute setting.
For example, in 9.2.x, to set the timeout to 600 seconds (10 minutes) and
use the fixed timeout_type:
begin
ctx_ddl.create_preference('my_inso', 'INSO_FILTER');
ctx_ddl.set_attribute('my_inso', 'timeout', '600');
ctx_ddl.set_attribute('my_inso', 'timeout_type', 'FIXED');
end;
/
Documented reasons for timing out are:
1. The document is too large to be indexed in the time allotted via the
TIMEOUT attribute.
2. The INSO filter is hanging during filtering.
DRG-11207: user filter command exited with status 127
Status 127 most likely points to an environmental issue with the shared
library environment variable.
DRG-11207: user filter command exited with status 137
Status 137 means that the ctxhx executable was killed because the INSO
filter is not set up properly. Confirm that the correct environment
variables are set (LD_LIBRARY_PATH and PATH) and that the format of the
document is supported by your INSO filter.
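For quick triage, the status values listed above can be turned into a lookup. This is only a sketch: describe_filter_status is a made-up name, the hints for 1, 2, 127 and 137 simply restate the text above, and the fallback assumes the status follows the usual Unix shell convention that values above 128 mean "128 plus the signal number" (so 137 is 128 + 9, SIGKILL). Whether ctxhx always reports its status that way is an assumption.

```shell
#!/bin/sh
# describe_filter_status N
# Prints a one-line hint for the N in
# "DRG-11207: user filter command exited with status N".
describe_filter_status() {
    case "$1" in
        1)   echo "could not filter the document" ;;
        2)   echo "INSO_FILTER timed out" ;;
        127) echo "likely shared library path problem" ;;
        137) echo "ctxhx was killed (137 = 128 + signal 9)" ;;
        *)
            # Assumed convention: status above 128 is 128 + signal number.
            if [ "$1" -gt 128 ] 2>/dev/null; then
                echo "terminated by signal $(( $1 - 128 ))"
            else
                echo "not covered by this list"
            fi ;;
    esac
}

# Example usage:
#   describe_filter_status 137
```

Treat the output as a pointer to the right section above, not as a diagnosis.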
Environmental Settings
Please note that any DRG-11207 error may be caused by the setting of the
environment variables. Be sure that PATH and the shared library path
(LD_LIBRARY_PATH; SHLIB_PATH on HP-UX and LIBPATH on AIX) are correct, since
these settings are what allow the INSO filter to be invoked.
Note: it is also possible to define the environment variable
LD_LIBRARY_PATH in the listener.ora entry for extproc. This would be
included in the ENVS section of the origin database LISTENER.ORA:
ENVS = 'LD_LIBRARY_PATH=<full_pathname_of_oracle_home>/ctx/lib:$ORACLE_HOME/lib'
LD_LIBRARY_PATH should include
<full_pathname_of_oracle_home>/ctx/lib:<full_pathname_of_oracle_home>/lib
For example:
export LD_LIBRARY_PATH=<full_pathname_of_oracle_home>/ctx/lib:$LD_LIBRARY_PATH
PATH should include
<full_pathname_of_oracle_home>/ctx/bin:<full_pathname_of_oracle_home>/bin
[NOTE:133691.1] and [NOTE:135333.1] provide possible steps to get around
this error.
Supported Document Attribute Checklist
1. Determine if the filtered document is supported
A list of supported formats is provided in the InterMedia Text documentation
pages. Please check whether the format falls in the list of supported
formats. It is important to note that each Oracle version may support
different versions of documents.
For PDF Acrobat (full version), click on File->Document_Info->General and
you should see the PDF version.
For Microsoft Word and Excel, click on File->Save As and review the "Save
as type" to determine the version.
2. Determine if the document is corrupted
To verify whether a document is corrupted, open the document and view it
line by line, checking for corrupted output. Note that if you just hold
down the "Page Down" key to flip through the pages, corruption on a
particular page may go unnoticed.
3. Determine if the document is a secure or copy protected document
Password protected documents and documents with password protected contents
are not supported by the INSO filter.
For PDF Acrobat (full version), click under file->document_info->security.
If the open password or security password is set to true then it's password
protected.
For Microsoft Word, click under Tools->Options->Save. If the "Password to
open" or "Password to modify" appears as asterisks (*****), then the
document is secured.
For Microsoft Excel, click under File->Save As->Tools->General options. If
the "Password to open" or "Password to modify" appears as asterisks
(*****), then the document is secured.
4. Determine if the document is encrypted
INSO filter does not currently support encrypted documents. For PDF Acrobat
(full version), click under file->document_info->security. If security
method is none, the document is not encrypted. In other cases, it is
encrypted.
For Microsoft Word, it is not possible to encrypt this type of document
directly.
For Microsoft Excel, if the document is workbook protected then the
document is encrypted. To determine this, click under Tools->Protection; if
Unprotect Workbook or Unprotect Sheet is shown, the workbook/worksheet is
encrypted.
5. Determine if the document has custom embedded fonts
Stellent filters will succeed on the majority of PDF documents containing
custom fonts. Until now, there have only been a few documented cases of
custom embedded fonts causing filtering problems with PDF documents. If
there is a filtering problem with custom fonts, it will only result in
garbage tokens being produced as a result of the custom font, whereas the
remainder of the PDF document using the standard/built-in font will be
filtered properly.
To determine if custom embedded fonts are used for PDF Acrobat (full
version), click under file->document_info->fonts and review the encoding
type. The best way to identify whether a particular custom font will filter
successfully is to highlight the characters, then copy and paste them into
Notepad. If the output contains garbage text then it is not likely to be
filtered properly.
For Microsoft Word and Excel, regardless of what font is being used, the
INSO filter should be able to extract the characters.
Further References
Oracle Text Reference Releases 9.1 and 9.2
[BUG:2473885] BETTER ERROR MESSAGES FOR WHEN CREATING INTERMEDIA INDEX
USING INSO FILTER
[NOTE:133691.1] Create context index fails with DRG-11207 or
fatal:libsc_fa.so
[NOTE:135333.1] CTX_DDL.SYNC_INDEX With DBMS_JOB Fails (DRG-11207) Using
INSO Filter in V817
[BUG:1795642] INTERMEDIA TEXT INDEX OF A CERTAIN EXCEL97 FILE IS NOT
CREATED IN WEBDB SITE
Tom Kyte Notes
Go to Tom's web site and search for information. I won't post it here
verbatim cause it's probably copyrighted.
Received on Thu Oct 23 2003 - 00:04:37 CEST
