Re: Differences in Range Scan Performance between Binary and NLS Indexes

From: joel garry <joel-garry_at_home.com>
Date: Tue, 6 May 2008 14:26:43 -0700 (PDT)
Message-ID: <f00d4c98-6539-4ac9-a1f0-1809824c1152@q27g2000prf.googlegroups.com>

On May 6, 1:16�pm, Pat <pat.ca..._at_service-now.com> wrote:
> I have a table set with about 340k rows in it. User's execute queries
> against this table for arbitrary search strings which are resolved
> against the name table in the column. The table contains mixed case,
> but the users expect to match their search regardless of case. So, for
> example, both of these should match a search term of "badg":
>
> ARabidBadger
> somebadglowercase
>
> A typical search, for all those entries whose name contains "badg"
> would look like this:
>
> SELECT "sys_id" FROM (SELECT /*+ FIRST_ROWS(250) */ a.*, ROWNUM rnum
> FROM (SELECT cmdb_ci0."sys_id" FROM cmdb_ci cmdb_ci0 �WHERE
> lower(cmdb_ci0."name") LIKE '%badge%' ORDER BY lower(cmdb_ci0."name"))
> a where ROWNUM <= 250) WHERE rnum > 0
>
> To make this query efficient, I added a covering index
>
> create index cmdb_ci_lower_name_sys_id on cmdb_ci (lower("name"),
> "sys_id");
>
> And our query plan looks like this and completes in about 120 ms. This
> was worst case since only 1 row was returned. Short search strings
> like "b" will hit their stop key quickly and exit after finding 250
> matches without having to exhaust the index.
>
> Still, worst case, this query does a full index scan in about 120ms;
>
> SQL> SELECT "sys_id" FROM (SELECT /*+ FIRST_ROWS(250) */ a.*, ROWNUM
> rnum FROM (SELECT cmdb_ci0."sys_id" FROM cmdb_ci cmdb_ci0 �WHERE
> lower(cmdb_ci0."name") LIKE '%badge%' ORDER BY lower(cmdb_ci0."name"))
> a where ROWNUM <= 250) WHERE rnum > 0;
>
> Elapsed: 00:00:00.12
>
> Execution Plan
> ----------------------------------------------------------
> Plan hash value: 424525705
>
> ---------------------------------------------------------------------------�---------------------
> | Id �| Operation � � � � �| Name � � � � � � � � � � �| Rows �| Bytes | Cost (%CPU)|
> Time � � |
> ---------------------------------------------------------------------------�---------------------
> | � 0 | SELECT STATEMENT � | � � � � � � � � � � � � � | � 250 | 11750 | � �46 � (0)|
> 00:00:01 |
> |* �1 | �VIEW � � � � � � �| � � � � � � � � � � � � � | � 250 | 11750 | � �46 � (0)| 00:00:01 |
> |* �2 | � COUNT STOPKEY � �| � � � � � � � � � � � � � | � � � | � � � | � � � � � �| � � � � �|
> | � 3 | � �VIEW � � � � � �| � � � � � � � � � � � � � | � 251 | �8534 | � �46 � (0)| 00:00:01
> |
> |* �4 | � � INDEX FULL SCAN| CMDB_CI_LOWER_NAME_SYS_ID | 16997 |
> 813K| � �46 � � �(0)| 00:00:01 |
> ---------------------------------------------------------------------------�---------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>
> � �1 - filter("RNUM">0)
> � �2 - filter(ROWNUM<=250)
> � �4 - filter(LOWER("name") LIKE '%badge%')
>
> Statistics
> ----------------------------------------------------------
> � � � � 141 �recursive calls
> � � � � � 0 �db block gets
> � � � �2964 �consistent gets
> � � � � � 0 �physical reads
> � � � � � 0 �redo size
> � � � � 439 �bytes sent via SQL*Net to client
> � � � � 400 �bytes received via SQL*Net from client
> � � � � � 2 �SQL*Net roundtrips to/from client
> � � � � � 0 �sorts (memory)
> � � � � � 0 �sorts (disk)
> � � � � � 1 �rows processed
>
> Recently though, we started storing NLS characters in this column
> (european customers), so we'll have values like:
>
> badger
> b�dger <-- note the accented a
>
> The users want both of the above to collate next to each other.
>
> So we, not unnaturally, put the database in linguisitc mode:
>
> alter session set nls_sort=german;
> alter session set nls_comp=linguistic;
>
> Then we issue the same query. The query plan here involves a sort, but
> the query returns a grand total of 1 record, so the sort isn't
> material to the response time.

...

> 1) Am I doing something wrong here? Why does putting the system into
> NLS_SORT and NLS_COMP make this big a difference on performance?
> 2) Is there an index I should add here that would allow some form of
> efficient querying across this set? I know I'm going to have to do a
> full scan here (contains queries being what they are), but why is my
> full scan 10X slower in linguistic mode?
> 3) Is there some other recommended approach to getting linguistic
> collation working efficiently? Shadow columns aren't really practical
> here (and even then I'm not sure what I'd put in the shadow unless I
> wrote my own german to english unaccenter).
>
> Any help would be much appreciated.

See metalink Note:30779.1

"Setting NLS_SORT to anything other than BINARY causes a sort to use a full table scan, regardless of the path chosen by the optimizer. BINARY is the exception because indexes are built according to a binary order of keys. Thus the optimizer can use an index to satisfy the ORDER BY clause when NLS_SORT is set to BINARY. If NLS_SORT is set to any linguistic sort, the optimizer must include a full table scan and a full sort in the execution plan. "

Things may be different in different versions, I wouldn't know.

--
@home.com is bogus.
http://forums.oracle.com/forums/thread.jspa?messageID=2507441&#2507441

Received on Tue May 06 2008 - 16:26:43 CDT

This message: [ Message body ]
Next message: Pat: "Re: Differences in Range Scan Performance between Binary and NLS Indexes"
Previous message: Pat: "Differences in Range Scan Performance between Binary and NLS Indexes"
In reply to: Pat: "Differences in Range Scan Performance between Binary and NLS Indexes"
Next in thread: Pat: "Re: Differences in Range Scan Performance between Binary and NLS Indexes"
Reply: Pat: "Re: Differences in Range Scan Performance between Binary and NLS Indexes"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message