RE: Cardinality Issues

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Thu, 29 Aug 2013 09:00:39 -0400
Message-ID: <00c801cea4b7$c399f3a0$4acddae0$_at_rsiz.com>



Argh. Since your join predicate is location_id, that is the column that should be indexed (on table BILLING).

An index on BILLING.location_name would only have an effect if you included a.location_name in your predicates or possibly had extended statistics on the columns location_id and location_name.

But your current estimate of 100K is correct, since as written it has to do a full table scan to project the result set.

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Rajiv Iyer
Sent: Thursday, August 29, 2013 7:44 AM
To: oracle-l_at_freelists.org
Subject: Cardinality Issues

Hi All.
I have a problem with a SQL which is not giving a good cardinality estimate. The original SQL is pretty complex with many joins but I have managed to replicate the issue with a simple test case. It's as follows:
---Create tables BILLING & LOCATION

create table BILLING as
with VW as
(
select rownum id from dual connect by level <= 1000 )
select
rownum BILL_ID,
'Y' ACTIVE,
case
when rownum <= 50000 then
996
else
case when MOD(rownum,50000)=0 then
990
when MOD(rownum,5000)=0 then
991
when MOD(rownum,500)=0 then
992
when MOD(rownum,50)=0 then
993
when MOD(rownum,5)=0 then
994
else
995
end
end location_id
from VW, VW
where rownum <= 100000;

create table LOCATION as
select location_id, 'X-'||rownum LOCATION_NAME from ( select distinct LOCATION_ID from BILLING
) order by location_id;

---gather stats

EXEC DBMS_STATS.GATHER_TABLE_STATS(USER,'BILLING'); EXEC DBMS_STATS.GATHER_TABLE_STATS(USER,'LOCATION');
--Visualize the data

SELECT a.LOCATION_ID,b.location_name,COUNT(*) FROM BILLING A, LOCATION B WHERE A.LOCATION_ID = B.LOCATION_ID GROUP BY A.LOCATION_ID,B.LOCATION_NAME ORDER BY 3;
LOCATION_ID LOCATION_NAME COUNT(*)
----------- ------------- ----------

        990 X-4                    1
        991 X-3                    9
        992 X-2                   90
        993 X-1                  900
        994 X-6                 9000
        995 X-5                40000
        996 X-7                50000

So we know that BILLING table has 9 records for location 'X-3'. But the estimated rows in the explain plan is not in tune with this. It shows a value of 100K

EXPLAIN PLAN FOR
select * from BILLING a, LOCATION B
where a.location_id = b.location_id and b.location_name = 'X-3';

Plan hash value: 725447293



---
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)|
Time     |

----------------------------------------------------------------------------
---
| 0 | SELECT STATEMENT | | 14286 | 265K| 70 (5)| 00:00:01 | |* 1 | HASH JOIN | | 14286 | 265K| 70 (5)| 00:00:01 | |* 2 | TABLE ACCESS FULL| LOCATION | 1 | 8 | 3 (0)|
00:00:01 |
| 3 | TABLE ACCESS FULL| BILLING | 100K| 1074K| 65 (2)| 00:00:01 |

---

Predicate Information (identified by operation id):


   1 - access("A"."LOCATION_ID"="B"."LOCATION_ID")    2 - filter("B"."LOCATION_NAME"='X-3')

This is a problem when there are many joins in the query and we don't get a good plan.
I have considered adding hints to get us the desired plan. But I would like to know if there are any alternatives.

Thanks,
Rajiv

--

http://www.freelists.org/webpage/oracle-l

--

http://www.freelists.org/webpage/oracle-l Received on Thu Aug 29 2013 - 15:00:39 CEST

Original text of this message