Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: Tuning Oracle interMedia query - minimizing I/O?
To follow up on my previous post and answer some questions I've received (By the way, thanks to Martin for the help and suggestions.)
I tried the ALL_ROWS hint and performance went down a bit. ALL_ROWS took 63 seconds versus FIRST_ROWS which took about 44 seconds.
The indexed column is on
average about 4 or 5 K, however some entries are as
large as 10K, making CLOB or LONG my only alternative. The text column is of
type CLOB.
The table structure looks like this:
url_id NUMBER PRIMARY KEY,
url_date DATE,
text CLOB
When I perform the queries I am not requesting that the entire CLOB column be returned -- only the primary key of that column. Basically I am trying to return the id values of all records that contain the specified keyword
My db_block_buffer size is 2443. I recently doubled it to 4886 and saw no change in performance. db_block_size is 8K
Thanks again for any suggestions!
Mike
> 25,000/200,000 = 12.5 % => the optimizer considers a
> fast full scan or a full
> table scan to be more efficient than index access
> (as a rule of thumb, if you
> expect your result set more than 10 % of the
> complete table FFS or FTS is
> faster). But you hint him the wrong direction by
> first_rows.
>
> So please try with /*+ all_rows */ hint.
>
> Next, how big are the text columns on an average? If
> they are big enough I would
> not consider 40 s to be slow, e. g. if they are of 4
> k average (I assume that
> because otherwise you should consider to use
> varchar2 (4000) or raw (2000)) size
> you are going to retrieve 100 MB in 40 s which makes
> 2.5 MB/s - compare that to
> the speed your system can copy files from disk to
> main memory. I should say this
> is fast.
>
> Martin
>
>
>
> Mike Itagaki wrote:
> >
> > Hello Martin
> >
> > Thanks for the quick reply. I've answered your
> > questions below.
> >
> > > you wrote that your query returns a large number
of
> > > rows. How many?
> >
> > About 25,000 rows are returned, as shown in the
count
> > value.
> >
> > > How big is your db_block_buffers-parameter?
> >
> > db_block_buffers is 2443, and db_block size is
8192.
> > All together about 20M. Do you suspect this is the
> > problem?
> >
> > > Leave away the first_rows-hint because you do a
> > > count and therefore you force a
> > > complete scan of the index.
> >
> > OK. Actually, I only used the COUNT function to
> > highlight the problem with the text index. In my
> > actual query I don't use COUNT but return the
actual
> > row ids.
> >
> > > What does a plain
> > >
> > > select --+ all_rows
> > > count (url_id)
> > > from
> > > urls
> > > where text like '%fortran%'
> > > /
> > >
> > > yield?
> >
> > The text column is of type CLOB, so the LIKE
keyword
> > won't work with it.
> >
> > Any ideas you have would be greatly appreciated!
> > Thanks for taking the time to read my post.
> >
> > Mike
> >
> > > >
> > > > Hello,
> > > >
> > > > I've been grappling with a tuning an
intermedia search query and have
> > > > run into a roadblock. No matter what I do, I
can't
seem to get the query
> > > > time below 40-60 seconds for keywords that
return
a large number of rows.
> > > > This will be used on the web, so I need fast
results, although I don't
> > > > expect usage to be that heavy. There are
220,000
records in the database. I
> > > > am running NT 4 with Oracle 8i 8.1.6, single
466
Celeron processor, 384 MB
> > > > RAM, 2 7200 RPM drives.
> > > >
> > > > I've experimented with join order, made sure
statistics are current on all
> > > > tables, made sure all equijoins are done with
indexes and that the indexes
> > > > are freshly rebuilt, but to no avail. The text
index is freshly rebuilt with
> > > > parameters('optimize full'). The table data
and
the text index are on
> > > > different drives. The problem seems to be with
the
CONTAINS clause itself.
> > > > All on its own it takes as much as a minute
with
some keywords. (see below)
> > > > The FIRST_ROWS hint doesn't seem to help very
much. (see below)
> > > >
> > > > My guess is that the problem is being caused
by
the physical I/Os. For one
> > > > keyword there are more than 15,000 physical
I/Os.
This suspicion is
> > > > supported by Performance Monitor, which goes
to
70% disk time on the drive
> > > > that contains the table data and 80% processor
time during a query. I'm
> > > > getting desperate, and am considering
resorting to
upgrading hardware as a
> > > > final resort (having given up on the idea of
optimizing SQL queries, DB
> > > > structure, and the like).
> > > >
> > > > Possible brute force solutions I can think of
are
> > > > RAID
> > > > Caching the text index table in RAM (thats a
lot
of ram!)
> > > > Parallel Server
> > > >
> > > > Can anyone think of a more elegant solution,
or
can identify an error in my
> > > > logic?
> > > >
> > > > Thanks so much,
> > > >
> > > > Mike
> > > >
> > > > //////////////////// Simple call to CONTAINS
> > > > SQL> select count(url_id) from urls WHERE
contains(text, 'fortran', 1) > 0;
> > > >
> > > > COUNT(URL_ID)
> > > > -------------
> > > > 24515
> > > >
> > > > Elapsed: 00:00:46.37
> > > >
> > > > Execution Plan
> > > >
> > >
> >
>