Re: Differences in Range Scan Performance between Binary and NLS Indexes
Date: Tue, 6 May 2008 14:26:43 -0700 (PDT)
Message-ID: <f00d4c98-6539-4ac9-a1f0-1809824c1152@q27g2000prf.googlegroups.com>
On May 6, 1:16 pm, Pat <pat.ca..._at_service-now.com> wrote:
> I have a table set with about 340k rows in it. User's execute queries
> against this table for arbitrary search strings which are resolved
> against the name table in the column. The table contains mixed case,
> but the users expect to match their search regardless of case. So, for
> example, both of these should match a search term of "badg":
>
> ARabidBadger
> somebadglowercase
>
> A typical search, for all those entries whose name contains "badg"
> would look like this:
>
> SELECT "sys_id" FROM (SELECT /*+ FIRST_ROWS(250) */ a.*, ROWNUM rnum
> FROM (SELECT cmdb_ci0."sys_id" FROM cmdb_ci cmdb_ci0 WHERE
> lower(cmdb_ci0."name") LIKE '%badge%' ORDER BY lower(cmdb_ci0."name"))
> a where ROWNUM <= 250) WHERE rnum > 0
>
> To make this query efficient, I added a covering index
>
> create index cmdb_ci_lower_name_sys_id on cmdb_ci (lower("name"),
> "sys_id");
>
> And our query plan looks like this and completes in about 120 ms. This
> was worst case since only 1 row was returned. Short search strings
> like "b" will hit their stop key quickly and exit after finding 250
> matches without having to exhaust the index.
>
> Still, worst case, this query does a full index scan in about 120ms;
>
> SQL> SELECT "sys_id" FROM (SELECT /*+ FIRST_ROWS(250) */ a.*, ROWNUM
> rnum FROM (SELECT cmdb_ci0."sys_id" FROM cmdb_ci cmdb_ci0 WHERE
> lower(cmdb_ci0."name") LIKE '%badge%' ORDER BY lower(cmdb_ci0."name"))
> a where ROWNUM <= 250) WHERE rnum > 0;
>
> Elapsed: 00:00:00.12
>
> Execution Plan
> ----------------------------------------------------------
> Plan hash value: 424525705
>
> ------------------------------------------------------------------------------------------------
> | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
> Time |
> ------------------------------------------------------------------------------------------------
> | 0 | SELECT STATEMENT | | 250 | 11750 | 46 (0)|
> 00:00:01 |
> |* 1 | VIEW | | 250 | 11750 | 46 (0)| 00:00:01 |
> |* 2 | COUNT STOPKEY | | | | | |
> | 3 | VIEW | | 251 | 8534 | 46 (0)| 00:00:01
> |
> |* 4 | INDEX FULL SCAN| CMDB_CI_LOWER_NAME_SYS_ID | 16997 |
> 813K| 46 (0)| 00:00:01 |
> ------------------------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>
> 1 - filter("RNUM">0)
> 2 - filter(ROWNUM<=250)
> 4 - filter(LOWER("name") LIKE '%badge%')
>
> Statistics
> ----------------------------------------------------------
> 141 recursive calls
> 0 db block gets
> 2964 consistent gets
> 0 physical reads
> 0 redo size
> 439 bytes sent via SQL*Net to client
> 400 bytes received via SQL*Net from client
> 2 SQL*Net roundtrips to/from client
> 0 sorts (memory)
> 0 sorts (disk)
> 1 rows processed
>
> Recently though, we started storing NLS characters in this column
> (european customers), so we'll have values like:
>
> badger
> bädger <-- note the accented a
>
> The users want both of the above to collate next to each other.
>
> So we, not unnaturally, put the database in linguisitc mode:
>
> alter session set nls_sort=german;
> alter session set nls_comp=linguistic;
>
> Then we issue the same query. The query plan here involves a sort, but
> the query returns a grand total of 1 record, so the sort isn't
> material to the response time.
...
> 1) Am I doing something wrong here? Why does putting the system into
> NLS_SORT and NLS_COMP make this big a difference on performance?
> 2) Is there an index I should add here that would allow some form of
> efficient querying across this set? I know I'm going to have to do a
> full scan here (contains queries being what they are), but why is my
> full scan 10X slower in linguistic mode?
> 3) Is there some other recommended approach to getting linguistic
> collation working efficiently? Shadow columns aren't really practical
> here (and even then I'm not sure what I'd put in the shadow unless I
> wrote my own german to english unaccenter).
>
> Any help would be much appreciated.
See metalink Note:30779.1
"Setting NLS_SORT to anything other than BINARY causes a sort to use a full table scan, regardless of the path chosen by the optimizer. BINARY is the exception because indexes are built according to a binary order of keys. Thus the optimizer can use an index to satisfy the ORDER BY clause when NLS_SORT is set to BINARY. If NLS_SORT is set to any linguistic sort, the optimizer must include a full table scan and a full sort in the execution plan. "
Things may be different in different versions, I wouldn't know.
jg
-- @home.com is bogus. http://forums.oracle.com/forums/thread.jspa?messageID=2507441�Received on Tue May 06 2008 - 16:26:43 CDT