interMedia Text (ConText) and base_letter in 8i
Hello,
A couple of months ago I had a problem making base_letter work under 8.0.4 and ConText 2.3.6. The solution then was to start ctxsrv with NLS_LANG set to US7ASCII rather than the database charset EE8ISO8859P2. My personal explanation is that this caused a conversion from iso-8859-2 to us-ascii while the data was fetched from the Oracle server into ctxsrv, so the base_letter setting itself never really came into play.
Now we're considering converting the database to 8i, so I checked out interMedia. After resolving the small problems with migrate.sql, which is supposed to help convert the structures to the 8.1.5 syntax (dstore instead of datastore, not specifying the ctxsys part in the indextype specification), I got interMedia Text working ... except, you guessed it, base_letter.
I have a database created with NLS_CHARACTERSET EE8ISO8859P2. My approach is
---
create table ctx_test1 (id number, doc clob);
alter table ctx_test1 add primary key (id);
insert into ctx_test1 values (1, 'krtek leta');
insert into ctx_test1 values (3, 'krtek létá');
begin ctx_ddl.create_preference('strip_dia1', 'BASIC_LEXER'); end;
/
create index ctx_test1_ctx on ctx_test1 (doc)
indextype is ctxsys.context parameters (' datastore "CTXSYS"."DEFAULT_DATASTORE" filter "CTXSYS"."NULL_FILTER" lexer strip_dia1 wordlist "CTXSYS"."NO_SOUNDEX" storage "CTXSYS"."DEFAULT_STORAGE"');
select * from ctx_test1 where contains(doc, 'leta', 1) > 0;
select * from ctx_test1 where contains(doc, 'létá', 1) > 0;
---
The first select returns only the row with 'leta' and the second only the row with 'létá', whereas the acutes should have been stripped and both queries should have returned both rows (as they do under 8.0.4).
I've also checked that, for example, the mixed_case attribute works just fine, changing the behaviour according to its setting.
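The listing above creates the strip_dia1 preference but does not show base_letter being set on it. For reference, a minimal sketch of how that step usually looks, assuming the value 'YES' is what this release accepts for the attribute and that the index is named ctx_test1_ctx; the shortened parameters string is also an assumption and relies on the defaults for everything except the lexer:

```sql
-- Assumption: the preference strip_dia1 already exists (see the listing above).
begin
  -- Ask the BASIC_LEXER preference to fold accented letters to their base letters.
  -- 'YES' is assumed here to be the accepted value for this attribute.
  ctx_ddl.set_attribute('strip_dia1', 'base_letter', 'YES');
end;
/

-- Recreate the index so that the documents are tokenized again
-- with the changed lexer preference.
drop index ctx_test1_ctx;

create index ctx_test1_ctx on ctx_test1 (doc)
  indextype is ctxsys.context
  parameters ('lexer strip_dia1');
```

Dropping and recreating the index is used here only because it avoids relying on any particular rebuild syntax; on a large table a rebuild would be the more practical choice.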
The sqlplus is running with NLS_LANG=american_america.EE8ISO8859P2 and I've tested this with ctxsrv running with american_america.EE8ISO8859P2, american_america.US7ASCII and NLS_LANG unset -- no difference.
However, since the ctxsrv doesn't need to run at all to do alter index ctx_test1_ctx rebuild, I do not think it matters anymore. So my uncertainty about base_letter is still there. Does it work at all?
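One way to narrow this down is to look at what the index actually stored rather than only at the query results. The following is a sketch, assuming the CTX_USER_PREFERENCE_VALUES view and the DR$<index name>$I token table are available in this release under these names:

```sql
-- Which attributes are recorded for the lexer preference?
-- If base_letter is not listed, the attribute was never set on strip_dia1.
select prv_attribute, prv_value
  from ctx_user_preference_values
 where prv_preference = 'STRIP_DIA1';

-- Which tokens did indexing produce?
-- If base_letter took effect, no token should carry an acute,
-- so the accented and unaccented words should end up as the same token.
select token_text
  from dr$ctx_test1_ctx$i
 order by token_text;
```

If the stored tokens already have the accents stripped but the accented query still misses, the problem is more likely on the query side (how the query term is converted and lexed) than in the indexing.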
I'd appreciate any hint, even a note that it doesn't work and that I shouldn't try at all.
Thanks,