Re: Low cardinality in OLTP with high concurrency

From: Randolf Geist <mahrah_at_web.de>
Date: Wed, 25 Mar 2009 03:52:17 -0700 (PDT)
Message-ID: <efcfa6dd-6a0d-4f90-8529-8175b35721da_at_o11g2000yql.googlegroups.com>



On Mar 25, 7:56 am, vipin.sin..._at_gmail.com wrote:
> I am struct with a problem of slow performance of a query. The problem
> is because of low cardinality. A table contains about 1 million rows
> which increases daily. there is a column which contains 16 distinct
> values say AOFA, AOFB,SIGN etc etc.
>
> Where UPPER(Field) LIKE 'AOF%'
>
> It takes to much time. I checked in the table in 1 million rows,
> about .5 million rows match this criteria. I have created a functional
> index on the table with Upper function. This improved the performance
> from 2 mins to 40 sec bt 40 sec is again not acceptable.

Vipin,

if your predicate "Where UPPER(Field) LIKE 'AOF%'" matched 500k records out of 1,000k, i.e. 50% of the table, and you need to visit the table to get some other information then no index is going to help you speed this up. The most efficient access would a full table scan. There are special cases if you can manage to get your query answered with an index-only access where the index might be beneficial.

Because your data in "field" is obviously skewed it would be important that you tell the optimizer about this and make sure that you generate a histogram on that column (e.g. DBMS_STATS.GATHER*STATS (method_opt=>'FOR COLUMNS field SIZE 254'), or in 10g the default SIZE AUTO should be sufficient.

I've recently written a blog note about cost based optimizer estimates and the LIKE pattern matching:
http://oracle-randolf.blogspot.com/2009/03/getting-accurate-cardinality-estimates.html

This way the optimizer can come to more accurate estimates if it is better to use a suitable index in case the LIKE pattern matching is going to be very selective or using a full table scan as in your particular case when matching 50% of the table.

Further advice would require to share the different execution plans you got and the query text. Use EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY function to get a meaningful output.

> So i though of bitmap index. But since it is OLTP where inserts/
> modifications run at night by utilities and selects is done by users
> in Day time, i have to drop this option to avoid any deadlock sort of
> things.

But this is not OLTP? If you have a clear separation between the data loading in the night and read-only during the day then I don't see how you should have a concurrency issue? This would be caused by concurrent insert/updates by multiple sessions, and in that case you very likely can't use bitmap indexes due to the potential locking issues caused by them.

I've outlined some common concurrency issues including bitmap indexes some ago: http://oracle-randolf.blogspot.com/2008/12/concurrency-issues-when-do.html

By the way, it's one of the myths that bitmap indexes are better suited for low number of distinct values columns. As long as you have to visit the table by random access while reading from the index there is no difference between a b*tree and the bitmap index apart from any order differences (clustering_factor for b*tree indexes). The power of bitmap indexes comes from the ability to combine multiple them to effectively answer questions like "gender = 'male' and country = 'USA' and product = 'x' and color = 'RED'" for which each individual predicate is not selective but the combination of them is highly selective. This can be covered by bitmap combine operations allowing to narrow down the number of rows to visit from the table by applying corresponding BITMAP AND operations on the various bitmap indexes before visiting the table.

Regards,
Randolf

Oracle related stuff blog:
http://oracle-randolf.blogspot.com/

SQLTools++ for Oracle (Open source Oracle GUI for Windows): http://www.sqltools-plusplus.org:7676/
http://sourceforge.net/projects/sqlt-pp/ Received on Wed Mar 25 2009 - 05:52:17 CDT

Original text of this message