Asif Momen
Views expressed here are solely that of my own. Please make sure that you test the code/queries that appear on my blog before applying them on a production environment.
Updated: 1 hour 10 min ago
Analytic Functions: The most ignored ones
Dear Readers,
Analytic Functions were first introduced in Oracle 8i, way back in 1999. Tom Kyte’s book “Expert One-on-One Oracle” has a dedicated chapter on this topic. Not only Tom’s book, any book on SQL (I have seen so far) has a separate chapter written on Analytic Functions. Yet, the developer community seems to be paying least attention in trying to understand and use them. I think before educating students, tutors should be trained to emphasize the importance of Analytic Functions in day to day life of a developer.
Anyways, here’s a similar case on one of our very busy OLTP database.
This SQL statement topped in the AWR report consuming nearly 84% of database time and was executed more than 1200 times during a 45-minutes AWR report.
A query is required to fetch TECH_PAC_ID for a customer along with the number of records for that customer.
Here’s the original query, its Explain Plan and Statistics:
SQL> set autotrace on
SQL> SELECT Tech_PAC_ID, Users_USER_CD, TOT_PAC
2 FROM (SELECT Tech_PAC_ID, Users_USER_CD
3 FROM Tech, Users
4 WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' )
5 AND Users_USER_CD = Tech_ETI_USER_CD) DET,
6 (SELECT COUNT(*) TOT_PAC
7 FROM Tech, Users
8 WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' )
9 AND Users_USER_CD = Tech_ETI_USER_CD) TOT;
TECH_PAC_N USERS_USER_CD TOT_PAC
---------- -------------------- ----------
236XXX123 ABX65842 5
Elapsed: 00:00:00.06
Execution Plan
----------------------------------------------------------
Plan hash value: 3349818188
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 54 | 1280 (1)| 00:00:16 |
| 1 | MERGE JOIN CARTESIAN | | 1 | 54 | 1280 (1)| 00:00:16 |
|* 2 | TABLE ACCESS BY INDEX ROWID | TECH | 1 | 18 | 4 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 1 | 41 | 640 (1)| 00:00:08 |
|* 4 | TABLE ACCESS FULL | USERS | 1 | 23 | 636 (1)| 00:00:08 |
|* 5 | INDEX RANGE SCAN | TECH_USER_CD | 3 | | 1 (0)| 00:00:01 |
| 6 | BUFFER SORT | | 1 | 13 | 1276 (1)| 00:00:16 |
| 7 | VIEW | | 1 | 13 | 640 (1)| 00:00:08 |
| 8 | SORT AGGREGATE | | 1 | 104 | | |
|* 9 | TABLE ACCESS BY INDEX ROWID| TECH | 1 | 12 | 4 (0)| 00:00:01 |
| 10 | NESTED LOOPS | | 1 | 104 | 640 (1)| 00:00:08 |
|* 11 | TABLE ACCESS FULL | USERS | 1 | 92 | 636 (1)| 00:00:08 |
|* 12 | INDEX RANGE SCAN | TECH_USER_CD | 3 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("TECH_STS"='Y')
4 - filter("USERS_USER_CD"='ABX65842')
5 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
filter("TECH_ETI_USER_CD" IS NOT NULL)
9 - filter("TECH_STS"='Y')
11 - filter("USERS_USER_CD"='ABX65842')
12 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
filter("TECH_ETI_USER_CD" IS NOT NULL)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4671 consistent gets
0 physical reads
0 redo size
558 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
Instead of writing two queries to fetch TECH_PAC_ID and COUNT(USERS_USER_CD) we can achieve the same result by using Analytic Functions. The new query not only returns the required result but is also less resource intensive and more database-friendly.
Here goes the enhanced query using Analytic Function:
SQL> SELECT Tech_PAC_ID, Users_USER_CD,
2 count(1) over (partition by Users_USER_CD) TOT_PAC
3 FROM Tech, Users
4 WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' )
5 AND Users_USER_CD = Tech_ETI_USER_CD
6 GROUP BY Tech_PAC_ID, Users_USER_CD;
TECH_PAC_N USERS_USER_CD TOT_PAC
---------- -------------------- ----------
236XXX123 ABX65842 5
Elapsed: 00:00:00.04
Execution Plan
----------------------------------------------------------
Plan hash value: 1328229640
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 41 | 641 (1)| 00:00:08 |
| 1 | WINDOW BUFFER | | 1 | 41 | 641 (1)| 00:00:08 |
| 2 | SORT GROUP BY | | 1 | 41 | 641 (1)| 00:00:08 |
|* 3 | TABLE ACCESS BY INDEX ROWID| TECH | 1 | 18 | 4 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | 1 | 41 | 640 (1)| 00:00:08 |
|* 5 | TABLE ACCESS FULL | USERS | 1 | 23 | 636 (1)| 00:00:08 |
|* 6 | INDEX RANGE SCAN | TECH_USER_CD | 3 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("TECH_STS"='Y')
5 - filter("USERS_USER_CD"='ABX65842')
6 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
filter("TECH_ETI_USER_CD" IS NOT NULL)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2335 consistent gets
0 physical reads
0 redo size
558 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL>
The new query seems to be doing a good job. The “consistent gets” have dropped from 4671 to 2335. That’s nearly half of the original, the reason being: instead of hitting the table twice, we are getting the work done in one hit. But still, the consistent gets seems to be reasonably high.
Although ROWS=1 is being shown in the Explain Plan for USERS table, but optimizer is spending most of its time doing a Full Table Scan at this step. Adding an index on USERS_USER_CD column of USERS table should do the trick.
After adding the index on USERS (USERS_USER_CD) column, the query seems to be flying.
SQL> SELECT Tech_PAC_ID, Users_USER_CD,
2 count(1) over (partition by Users_USER_CD) TOT_PAC
3 FROM Tech, Users
4 WHERE Tech_STS = 'Y' AND Users_USER_CD = UPPER('ABX65842' )
5 AND Users_USER_CD = Tech_ETI_USER_CD
6 GROUP BY Tech_PAC_ID, Users_USER_CD;
TECH_PAC_N USERS_USER_CD TOT_PAC
---------- -------------------- ----------
236XXX123 ABX65842 5
Execution Plan
----------------------------------------------------------
Plan hash value: 1753916289
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 33 | 7 (15)| 00:00:01 |
| 1 | WINDOW BUFFER | | 1 | 33 | 7 (15)| 00:00:01 |
| 2 | SORT GROUP BY | | 1 | 33 | 7 (15)| 00:00:01 |
|* 3 | TABLE ACCESS BY INDEX ROWID | TECH | 1 | 15 | 4 (0)| 00:00:01 |
| 4 | NESTED LOOPS | | 1 | 33 | 6 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| USERS | 1 | 18 | 2 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | USERS_MUB_USER_CD_IDX | 1 | | 1 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | TECH_USER_CD | 3 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("TECH_STS"='Y')
6 - filter("USERS_USER_CD"='ABX65842')
7 - access("USERS_USER_CD"="Tech_ETI_USER_CD")
filter("TECH_ETI_USER_CD" IS NOT NULL)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
6 consistent gets
1 physical reads
0 redo size
558 bytes sent via SQL*Net to client
399 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
This time merely 6 “consistent gets” were required. It’s a 99%+ reduction in overall consistent gets.
As I mentioned earlier in the post, I see SQL statements similar to this one over and over again. Analytic function is a nice alternative to the traditional way of writing these types of queries that works extremely well and provides the performance needed for high numbers of executions in a high data volume environment.
More information on Analytic Functions can (should) be obtained from here:
Oracle Database SQL Reference 10g Release 2
On Top-n and Pagination Queries by Tom Kyte
AskTom
One Analytic Function Can do More Than a 1000 Lines of Code by Alex Nuitjen
Happy reading.Syntax Highlighter
Dear Readers,
Most of you might have experienced difficulies in placing code snippets in a nice formatted way on Blogger.com. I have been previously using Greg Houston's Code formatter, but when you have pipes ("|") in the code like in Explain Plan, then Blogger fails to display the code correctly.
But, recently I stumbled on one of the Karen Morton's post where she discusses a way to format your code snippets on blogger. Here are more details on highlightling your code in blogger.com using Syntax Highlighter.
The code below has been formatted using SyntaxHighlighter:
SQL> conn scott/tiger Connected. SQL> select * from tab; TNAME TABTYPE CLUSTERID ------------------------------ ------- ---------- DEPT TABLE EMP TABLE BONUS TABLE SALGRADE TABLE TEST_RUN TABLE PLAN_TABLE TABLE EMP_VIEW VIEW EMP2 TABLE 8 rows selected. SQL>More interesting stuff can by found on Fahd Shariff's blog and Morten Lyhr's blog. Happy formating !!!
Dynamic Sampling and Table Partitions
Dynamic Sampling was first introduced in Oracle 9i Release 2 and the purpose is to improve query performance by determining more accurate estimates for predicate selectivity and statistics for tables and indexes.
Dynamic sampling is to be used predominantly with un-analyzed tables. Before hard parsing a query Oracle determines if dynamic sampling would improve query performance. If so, then the optimizer issues recursive SQL statements to estimate the necessary statistics dynamically.
Here’s an example of dynamic sampling:
Operating System: Windows XP
Database: Oracle 10g Release 2 (10.2.0.4)
SQL> create table t (a number); Table created. SQL> set autotrace traceonly exp SQL> select * from t; Execution Plan ---------------------------------------------------------- Plan hash value: 1601196873 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS FULL| T | 1 | 13 | 2 (0)| 00:00:01 | -------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement SQL> set autotrace offIts clear from the above Execution Plan that dynamic sampling was performed. But it seems dynamic sampling is not partition friendly or rather dynamic sampling is partition unaware. What I mean is if you have a partitioned table and at least one partition is analyzed then dynamic sampling is NOT performed on any of the un-analyzed partitions. This is because Optimizer looks at columns like NUM_ROWS of DBA_TABLES view and considers that the table is analyzed. Let us create a partitioned table with a local index.
SQL> create table part(
2 sno number,
3 dt date,
4 remark varchar2(30))
5 partition by range(dt) (
6 partition p1 values less than (to_date('01-02-2008', 'dd-mm-yyyy')),
7 partition p2 values less than (to_date('01-03-2008', 'dd-mm-yyyy')),
8 partition p3 values less than (to_date('01-04-2008', 'dd-mm-yyyy')),
9 partition p4 values less than (to_date('01-05-2008', 'dd-mm-yyyy'))
10 );
Table created.
SQL>
SQL> create index part_idx on part(dt, sno) local;
Index created.
SQL>
Now, populate all but the last partition with test data.
SQL> insert into part
2 select level,
3 to_date('01-01-2008', 'dd-mm-yyyy') + level - 1 ,
4 'test'
5 from dual
6 connect by level <=90;
90 rows created.
SQL>
SQL> commit;
Commit complete.
SQL>
Let us now collect statistics for partitions P1, P2, and P3.
SQL> exec dbms_stats.gather_table_stats(user, tabname => 'part', - > partname => 'p1', estimate_percent => null, - > method_opt => 'for all columns size 1', cascade => true); PL/SQL procedure successfully completed. SQL> SQL> SQL> exec dbms_stats.gather_table_stats(user, tabname => 'part', - > partname => 'p2', estimate_percent => null, - > method_opt => 'for all columns size 1', cascade => true); PL/SQL procedure successfully completed. SQL> SQL> SQL> exec dbms_stats.gather_table_stats(user, tabname => 'part', - > partname => 'p3', estimate_percent => null, - > method_opt => 'for all columns size 1', cascade => true); PL/SQL procedure successfully completed. SQL> SQL> select table_name, num_rows 2 from user_tables 3 where table_name ='PART'; TABLE_NAME NUM_ROWS ------------------------------ ---------- PART 90 SQL> SQL> SQL> select partition_name, num_rows 2 from user_tab_partitions 3 where table_name = 'PART'; PARTITION_NAME NUM_ROWS ------------------------------ ---------- P1 31 P2 29 P3 30 P4 SQL> SQL> SQL> select partition_name, num_rows 2 from user_ind_partitions 3 where index_name = 'PART_IDX'; PARTITION_NAME NUM_ROWS ------------------------------ ---------- P1 31 P2 29 P3 30 P4 SQL>It’s time to test whether optimizer considers dynamic sampling or not.
SQL> set autotrace traceonly exp
SQL>
SQL> select *
2 from part
3 where dt = to_date('01-04-2008', 'dd-mm-yyyy')
4 and sno = 1;
Execution Plan
----------------------------------------------------------
Plan hash value: 2633488982
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 16 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 1 | 16 | 2 (0)| 00:00:01 | 4 | 4 |
|* 2 | TABLE ACCESS FULL | PART | 1 | 16 | 2 (0)| 00:00:01 | 4 | 4 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("DT"=TO_DATE(' 2008-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
"SNO"=1)
SQL>
SQL> set autotrace off
SQL>
No! Optimizer thinks that it has enough information available at hand to with it rather than sampling dynamically.
Now, let us populate the unanalyzed partition with data and re-run the same query.
SQL> insert into part select 1, to_date('01-04-2008', 'dd-mm-yyyy'), 'test'
2 from dual connect by level <=1000;
1000 rows created.
SQL>
SQL> commit;
Commit complete.
SQL>
SQL> set autotrace traceonly exp
SQL>
SQL>
SQL> select * from part
2 where dt = to_date('01-04-2008', 'dd-mm-yyyy')
3 and sno = 1;
Execution Plan
----------------------------------------------------------
Plan hash value: 2633488982
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 16 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 1 | 16 | 2 (0)| 00:00:01 | 4 | 4 |
|* 2 | TABLE ACCESS FULL | PART | 1 | 16 | 2 (0)| 00:00:01 | 4 | 4 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("DT"=TO_DATE(' 2008-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
"SNO"=1)
SQL>
SQL> set autotrace off
Well, the optimizer took the same plan because statistics were not up to date. Consider gathering statistics on the partition in concern and execute the same query again.
SQL> exec dbms_stats.gather_table_stats(user, tabname => 'part', -
> partname => 'p4', estimate_percent => null, -
> method_opt => 'for all columns size 1', cascade => true);
PL/SQL procedure successfully completed.
SQL>
SQL> set autotrace traceonly exp
SQL>
SQL>
SQL> select * from part
2 where dt = to_date('01-04-2008', 'dd-mm-yyyy')
3 and sno = 1;
Execution Plan
----------------------------------------------------------
Plan hash value: 2633488982
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 16000 | 3 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 1000 | 16000 | 3 (0)| 00:00:01 | 4 | 4 |
|* 2 | TABLE ACCESS FULL | PART | 1000 | 16000 | 3 (0)| 00:00:01 | 4 | 4 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("DT"=TO_DATE(' 2008-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
"SNO"=1)
SQL>
SQL> set autotrace off
SQL>
Well this time optimizer had updated statistics which gives a better and clear picture to pick the most appropriate access path. So, optimizer thinks “TABLE ACCESS FULL” on partition P4 is appropriate and which makes sense too.
I have purposely populated partition P4 with duplicate data so that we see a difference in optimizer access path.
Finally, the conclusions are:
• Dynamic sampling ignores partition level statistical information.
• Always update statistics after data load process, or assign default statistics to unanalyzed partition.
Happy reading.Bugs in REMAINDER Function (Updated)
Dear all,
Regarding Bugs in REMAINDER Function, I have opened a call with Oracle Support and they say "REMAINDER function works as intended and this is a documentation issue which requires minor clarification".
Oracle documentation has modified the text as follows:
"If n1 != 0, then the remainder is n2 - (n1*N) where N is the integer nearest n2/n1. If n2/n1 equals x.5, then N is the nearest even integer."
A bug (7236077) has also been logged.
Regards
Funny “ORA-00942: table or view does not exist” Error Message
Dear all,
We all know when “ORA-00942” error message is returned. For instance, if I try to query a non-existent table then Oracle joyfully returns this errors message.
SQL> select * from funny;
select * from funny
*
ERROR at line 1:
ORA-00942: table or view does not exist
“ORA-00942” error message is thrown on our face whenever we try to perform any action like CREATE, ALTER, DROP, INSERT, UPDATE, DELETE ... against a non-existent table. But this error is sometimes misleading, like the one below:
SQL> create table t(a number);
Table created.
SQL> create view v as select * from t;
View created.
SQL> drop table v;
drop table v
*
ERROR at line 1:
ORA-00942: table or view does not exist
Error message reports “table or view does not exist”, but definitely a view named “V” exists in the same schema.
Wouldn’t it be more appropriate if Oracle reports “table does not exist”?MIN and MAX Functions in a Single Query are Disastrous
Dear Readers,
I would like to discuss a very interesting point about indexes in this post. When we are interested in finding out the minimum value of an indexed column, instead of reading entire table or the index, Oracle intelligently uses the index to navigate to the first index leaf block (leftmost index block) and quickly finds the minimum value of an indexed column.
A simple demo proves this:
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for 32-bit Windows: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL>
SQL> drop table t purge;
Table dropped.
SQL>
SQL>
SQL> create table t as select level sno, 'name ' || level name
2 from dual connect by level <= 10000000;
Table created.
SQL>
SQL> create unique index t_idx on t(sno);
Index created.
SQL> exec dbms_stats.gather_table_stats(user, 't');
PL/SQL procedure successfully completed.
SQL> set autotrace on
SQL>
SQL> select min(sno) from t;
MIN(SNO)
----------
1
Execution Plan
----------------------------------------------------------
Plan hash value: 2683064407
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 9034 (2)| 00:01:49 |
| 1 | SORT AGGREGATE | | 1 | 6 | | |
| 2 | INDEX FULL SCAN (MIN/MAX)| T_IDX | 10M| 57M| | |
------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3 consistent gets
1 physical reads
0 redo size
411 bytes sent via SQL*Net to client
399 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
SQL> set autotrace off
SQL>
Although this table contains 10 Million rows, Oracle required only 3 consistent gets to fetch the minimum value. Superb !!!
Similarly, when finding out the maximum value, Oracle reads the last block on the right-hand side of the index structure.
SQL> set autotrace on
SQL> select max(sno) from t;
MAX(SNO)
----------
10000000
Execution Plan
----------------------------------------------------------
Plan hash value: 2683064407
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 9034 (2)| 00:01:49 |
| 1 | SORT AGGREGATE | | 1 | 6 | | |
| 2 | INDEX FULL SCAN (MIN/MAX)| T_IDX | 10M| 57M| | |
------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3 consistent gets
2 physical reads
0 redo size
411 bytes sent via SQL*Net to client
399 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> set autotrace off
SQL>
Once again only 3 consistent gets were required to fetch the maximum value.
But things get messy when you use both MIN and MAX functions in the same query. Instead of using same "INDEX FULL SCAN (MIN/MAX)" path to read the left-most block and right-most block to arrive at the minimum and maximum values, Oracle goes with FULL TABLE SCAN. A Full Table Scan on 10 Million rows !!!
SQL> set autotrace on
SQL> select min(sno), max(sno) from t;
MIN(SNO) MAX(SNO)
---------- ----------
1 10000000
Execution Plan
----------------------------------------------------------
Plan hash value: 2966233522
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 9034 (2)| 00:01:49 |
| 1 | SORT AGGREGATE | | 1 | 6 | | |
| 2 | TABLE ACCESS FULL| T | 10M| 57M| 9034 (2)| 00:01:49 |
---------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
32928 consistent gets
11391 physical reads
0 redo size
472 bytes sent via SQL*Net to client
399 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> set autotrace off
SQL>
From this terrible behavior of Oracle Optimizer, What I infer and suggest is:
"Write separate queries to fetch MIN and MAX values instead of combining them into one query".
Bugs in REMAINDER Function
While browsing through SQL Reference documentation (Oracle 10g Release 2), I stopped at REMAINDER function. The REMAINDER returns the remainder of n2 divided by n1 and internally uses ROUND function to arrive at the remainder value.
Oracle documentation also says, "If n1 != 0, then the remainder is n2 - (n1*N) where N is the integer nearest n2/n1."
SQL> Select REMAINDER(501, 5) from dual;
REMAINDER(501,5)
----------------
1
SQL>
But, because of an internal bug, REMAINDER function returns incorrect results when "n2/n1" results in 0.5, 1.5, 2.5, .....
SQL> Select REMAINDER(5, 2) from dual;
REMAINDER(5,2)
--------------
1
SQL>
According to the formula,
n2 - (n1 * N) = 5 - (2 * ROUND(5/2)) = 5 - (2 * 3) = 5 - 6 = -1the value returned should be "-1" where as the REMAINDER function returns "+1". My observation is that, for example "1.5" is getting rounded to "1" instead of "2" and is spoiling the outcome of REMAINDER. I wrote a small PL/SQL block to screen my observation:
SQL> set linesize 1000
SQL> set pages 100
SQL> set serveroutput on
SQL>
SQL> declare
2 n1 number;
3 n2 number;
4 N number;
5
6 sys_rem number;
7 my_rem number;
8 begin
9 for n1 in 1..5 loop
10 --for n2 in 1..50 loop
11 n2 := 1;
12 while n2 <= 20 loop
13 Sys_rem := REMAINDER(n2, n1);
14 N := Round(n2/n1);
15 My_rem := n2 - (n1 * N);
16 If sys_rem <> my_rem then
17 Dbms_output.put_line('n2 : ' || rpad(to_char(n2), 4, ' ') ||
18 ' n1 : ' || to_char(n1) ||
19 ' ' ||
20 ' n2/n1 = ' || rpad(to_char(n2/n1), 4, ' ') ||
21 ' ' ||
22 ' ORCL rem = ' || rpad(to_char(sys_rem), 4, ' ') ||
23 ' ' ||
24 ' Calc Rem [' || rpad(to_char(n2), 4, ' ') ||
25 ' - (' || to_char(n1) || '*' ||
26 rpad(to_char(N), 4, ' ') || ')] = '||
27 to_char(my_rem));
28 End If;
29 n2 := n2 + 0.1;
30 end loop;
31 end loop;
32 end;
33 /
n2 : 2.5 n1 : 1 n2/n1 = 2.5 ORCL rem = .5 Calc Rem [2.5 - (1*3 )] = -.5
n2 : 4.5 n1 : 1 n2/n1 = 4.5 ORCL rem = .5 Calc Rem [4.5 - (1*5 )] = -.5
n2 : 6.5 n1 : 1 n2/n1 = 6.5 ORCL rem = .5 Calc Rem [6.5 - (1*7 )] = -.5
n2 : 8.5 n1 : 1 n2/n1 = 8.5 ORCL rem = .5 Calc Rem [8.5 - (1*9 )] = -.5
n2 : 10.5 n1 : 1 n2/n1 = 10.5 ORCL rem = .5 Calc Rem [10.5 - (1*11 )] = -.5
n2 : 12.5 n1 : 1 n2/n1 = 12.5 ORCL rem = .5 Calc Rem [12.5 - (1*13 )] = -.5
n2 : 14.5 n1 : 1 n2/n1 = 14.5 ORCL rem = .5 Calc Rem [14.5 - (1*15 )] = -.5
n2 : 16.5 n1 : 1 n2/n1 = 16.5 ORCL rem = .5 Calc Rem [16.5 - (1*17 )] = -.5
n2 : 18.5 n1 : 1 n2/n1 = 18.5 ORCL rem = .5 Calc Rem [18.5 - (1*19 )] = -.5
n2 : 1 n1 : 2 n2/n1 = .5 ORCL rem = 1 Calc Rem [1 - (2*1 )] = -1
n2 : 5 n1 : 2 n2/n1 = 2.5 ORCL rem = 1 Calc Rem [5 - (2*3 )] = -1
n2 : 9 n1 : 2 n2/n1 = 4.5 ORCL rem = 1 Calc Rem [9 - (2*5 )] = -1
n2 : 13 n1 : 2 n2/n1 = 6.5 ORCL rem = 1 Calc Rem [13 - (2*7 )] = -1
n2 : 17 n1 : 2 n2/n1 = 8.5 ORCL rem = 1 Calc Rem [17 - (2*9 )] = -1
n2 : 1.5 n1 : 3 n2/n1 = .5 ORCL rem = 1.5 Calc Rem [1.5 - (3*1 )] = -1.5
n2 : 7.5 n1 : 3 n2/n1 = 2.5 ORCL rem = 1.5 Calc Rem [7.5 - (3*3 )] = -1.5
n2 : 13.5 n1 : 3 n2/n1 = 4.5 ORCL rem = 1.5 Calc Rem [13.5 - (3*5 )] = -1.5
n2 : 19.5 n1 : 3 n2/n1 = 6.5 ORCL rem = 1.5 Calc Rem [19.5 - (3*7 )] = -1.5
n2 : 2 n1 : 4 n2/n1 = .5 ORCL rem = 2 Calc Rem [2 - (4*1 )] = -2
n2 : 10 n1 : 4 n2/n1 = 2.5 ORCL rem = 2 Calc Rem [10 - (4*3 )] = -2
n2 : 18 n1 : 4 n2/n1 = 4.5 ORCL rem = 2 Calc Rem [18 - (4*5 )] = -2
n2 : 2.5 n1 : 5 n2/n1 = .5 ORCL rem = 2.5 Calc Rem [2.5 - (5*1 )] = -2.5
n2 : 12.5 n1 : 5 n2/n1 = 2.5 ORCL rem = 2.5 Calc Rem [12.5 - (5*3 )] = -2.5
PL/SQL procedure successfully completed.
SQL>
I have tested this code on Oracle 10g Release 2. Unfortunately, this bug also exists in Oracle 11g.
I have raised a Service Request and also filed a bug with Oracle Support.
Regards
OTN - Discussion Forums in a great new look
Dear Readers,
Morning (GMT+3) when I tried to login to OTN-Discussion forum, it was down. But later when I checked it again, it's with a great new look.
With the new look comes new features. Following are the new features I have noticed:
* Great new look
* Add smiley's to messages
* Spell Check
* Formatting
* Reward points
* Save messages
* Tag messages and
* Abuse alerts
It should be easier for the OTN-newbie's to post their questions in a well formatted and readable format.
Enjoy discussions !!!
Why my index is not used?
Well, this question keeps popping up now and then. Yesterday, one of my colleagues also came up with this question: "Why is it that Oracle is not using index even though I am selecting less than 10% of data?".
We ran the query with autotrace enabled and the execution plan showed a Full Table Scan. This table contains over 29 Million records and by adding the predicate, result set is reduced to 2.3 Million records, which is 8% of total records.
SQL> set autotrace traceonly exp SQL> SELECT * 2 FROM quint_sec_tbl 3 WHERE quint_type = 'XX06FR'; Execution Plan ---------------------------------------------------------- Plan hash value: 3917650069 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3440 | 618K| 454K (1)| 01:30:55 | |* 1 | TABLE ACCESS FULL| QUINT_SEC_TBL | 3440 | 618K| 454K (1)| 01:30:55 | ----------------------------------------------------------------------------------------Cary Millsap's very old (but still valid) article "When to Use an Index" greatly helped me in this regard. This article unveils that Index consideration is based on block selectivity and not on row selectivity. He also defines what Row Selectivity and Block Selectivity are in the article. Row Selectivity: You can define the row selectivity of a given where-clause predicate as the number of rows returned by the predicate (r) divided by the total number of rows in the table (R):
P(r) = r / R
Block Selectivity:
You can define the block selectivity of a given where-clause predicate analogously, as the number of data blocks containing at least one row matching the predicate condition (b) divided by the total number of data blocks below the high-water mark (B):
P(b) = b / B
We can calculate block selectivity and row selectivity using SQL provided in this article. I used this SQL against my query and following are the results:
SQL> @hds
TableOwner : MYPRODUSER
TableName : QUINT_SEC_TBL
ColumnList : QUINT_TYPE
WhereClause:
PageSize : 100
Table blocks below hwm Table rows
(B) (R)
---------------------- ----------------
1,672,704 29,270,757
More:
Block selectivity Block count Row selectivity Row count
QUINT_ (pb = b/B) (b) (pr = r/R) (r)
------ ----------------- -------------- ----------------- ----------------
TT34DV 45.03% 753,277 37.99% 11,120,869
FG76SC 44.47% 743,788 13.67% 4,000,205
LH23Q2 42.78% 715,558 9.44% 2,762,284
XX06FR 42.32% 707,894 8.02% 2,346,846
:
:
:
Output of this SQL is sorted in descending order of block selectivity.
Looking at this output, row selectivity is only 8% but to fetch these 8% of rows Oracle has to visit 42% of blocks (block selectivity). That means, nearly half of the table's blocks contain at least one row for which QUINT_TYPE='XX06FR'. Instead of going through the hard path of Index Access, it’s more efficient for the optimizer to do a Full Table Scan.
So, now we know why the index was ignored and a Full Table Scan was preferred.
P.S.: Due to security reasons Username, Table name, column name and column values are modified.



