Oracle FAQ Your Portal to the Oracle Knowledge Grid

interMedia Text (ConText) and base_letter in 8i

From: Honza Pazdziora <adelton_at_fi.muni.cz>
Date: Thu, 13 May 1999 14:03:02 GMT
Message-ID: <FBoCD2.4nz@news.muni.cz>


Hello,

A couple of months ago I had a problem making base_letter work under 8.0.4 and ConText 2.3.6. Back then, the solution was to start ctxsrv with NLS_LANG set to US7ASCII rather than the database charset EE8ISO8859P2. My personal explanation is that this caused a conversion from iso-8859-2 to us-ascii while the data was fetched from the Oracle server to ctxsrv, so the accents were already lost at that point and the base_letter setting itself did not really come into play.
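The kind of conversion meant here can be illustrated with the SQL CONVERT function. This is only a sketch of a charset conversion done on the server; it is an assumption that the fetch to ctxsrv behaved the same way, and which replacement characters come out depends on Oracle's conversion tables.

---
-- Convert a string from the database charset to 7-bit ASCII.
-- Arguments: the string, the destination charset, the source charset.
select convert('krtek létá', 'US7ASCII', 'EE8ISO8859P2') as converted
  from dual;
---

If the accented letters come back as their unaccented base letters, that would be consistent with the explanation above.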

Now we're considering converting the database to 8i, so I checked interMedia. After resolving a few small problems with migrate.sql, which is supposed to help you convert the structures to the 8.1.5 syntax (dstore instead of datastore, not specifying the ctxsys part in the indextype specification), I got interMedia Text working ... except, you guessed it, base_letter.

I have a database created with NLS_CHARACTERSET EE8ISO8859P2. My approach is:

---
create table ctx_test1 (id number, doc clob);

alter table ctx_test1 add primary key (id);
insert into ctx_test1 values (1, 'krtek leta');
insert into ctx_test1 values (3, 'krtek létá');
begin ctx_ddl.create_preference('strip_dia1', 'BASIC_LEXER'); end;
/

begin ctx_ddl.set_attribute('strip_dia1', 'base_letter', 'YES'); end;
/

create index ctx_test1_ctx on ctx_test1(doc)
	indextype is ctxsys.context
	parameters (' datastore "CTXSYS"."DEFAULT_DATASTORE"
	filter "CTXSYS"."NULL_FILTER" lexer strip_dia1
	wordlist "CTXSYS"."NO_SOUNDEX" storage "CTXSYS"."DEFAULT_STORAGE"');

select * from ctx_test1 where contains(doc, 'leta', 1) > 0;
select * from ctx_test1 where contains(doc, 'létá', 1) > 0;
---

The first select returns only the row with 'leta' and the second only the row with 'létá'. The acute accents should have been stripped, so both queries should have returned both rows (as they do under 8.0.4).
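One way to narrow this down might be to look at what actually ended up in the index. The sketch below assumes the usual interMedia Text layout, where the tokens of index ctx_test1_ctx are stored in the table DR$CTX_TEST1_CTX$I, and that the data dictionary view ctx_user_index_values is available to the index owner.

---
-- Which lexer attributes were recorded for the index?
select ixv_class, ixv_object, ixv_attribute, ixv_value
  from ctx_user_index_values
 where ixv_index_name = 'CTX_TEST1_CTX';

-- Which tokens did the lexer produce?
select token_text
  from dr$ctx_test1_ctx$i
 order by token_text;
---

If base_letter had been applied at indexing time, I would expect to see only one form of the word among the tokens, not separate accented and unaccented entries.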

I've also checked that, for example, the mixed_case attribute works just fine, changing the behaviour according to its setting.

The sqlplus is running with NLS_LANG=american_america.EE8ISO8859P2 and I've tested this with ctxsrv running with american_america.EE8ISO8859P2, american_america.US7ASCII and NLS_LANG unset -- no difference.
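For completeness, the character set the database itself reports can be double-checked from the same sqlplus session. This is a minimal sketch using the standard nls_database_parameters view:

---
-- Show the database character set (expected here: EE8ISO8859P2).
select parameter, value
  from nls_database_parameters
 where parameter = 'NLS_CHARACTERSET';
---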

However, since ctxsrv doesn't need to be running at all for alter index ctx_test1_ctx rebuild to work, I do not think its settings matter anymore. So my uncertainty about base_letter is still there. Does it work at all?

I'd appreciate any hint, even a note that it doesn't work and that I shouldn't try at all.

Thanks,

--



 Honza Pazdziora | adelton@fi.muni.cz | http://www.fi.muni.cz/~adelton/  make vmlinux.exe -- SGI Visual Workstation Howto
Received on Thu May 13 1999 - 09:03:02 CDT
