Comparing Cardinality Estimates with Results

From: Larry Elkins <elkinsl_at_verizon.net>
Date: Fri, 15 Nov 2013 16:34:43 -0600
Message-id: <002b01cee252$e168f1f0$a43ad5d0$_at_net>

Stephane Faroult, who is having trouble posting to the list right now, asked me to pose this question on his behalf:

I am currently working on an application of hell (Entity/Attribute/Value all over the place, unpartitioned tables in the 200 million to 3 billion row range, blanket indexing, cluster factor usually very low, hard-coded queries with long lists of identifiers, cursor_sharing set to force ... and the bulk of the work of course done in Java). In a few cases the optimizer goes south, which is understandable and seems to be more or less corrected by more aggressive dynamic sampling (the application isn't too demanding on Oracle CPU). However, I was curious to compare, globally, cardinality estimates with actual results. I wrote the following query, which is a kind of first draft (warning: requires the performance pack, I query the dba_hist tables), in which I try to collect the "last level" of cardinality estimate as reported in the plan (I know, rather meaningless with aggregates, but that's the best I have come up with so far; improvements are welcome) for queries during a precise interval. I have set three categories, "Spot on" when the order of magnitude is the same for the estimate and the actual average, "Off" when there is one order of magnitude of difference (either way) and "Gone South" if it's more than that. I was rather surprised to discover that queries were more or less equally distributed between the three categories. I would be curious to know how the optimizer performs, cardinality-wise, in other environments and what could be considered "reasonably stable".

with q as

    (select d.dbid,
            i.instance_number,
            s.snap_id,
            round(s.begin_interval_time, 'HH24') snap_hour
     from dba_hist_snapshot s
          cross join


           (select dbid

           from v$database) d
          cross join


           (select instance_number

            from v$instance) i
     where s.dbid = d.dbid
       and s.instance_number = i.instance_number
       and s.begin_interval_time >= to_timestamp('&start_YYYYMMDDHHMI', /*start YYYYMMMDDHH */
                                                 'YYYYMMDDHH24MI')
       and s.end_interval_time <= to_timestamp('&end_YYYYMMDDHHMI', /*end YYYYMMDDHH */
                                                 'YYYYMMDDHH24MI')) select * from (select case actual_cardinality
               when estimated_cardinality then 'Spot on'  
               else case 
                      when abs(round(log(10, greatest(actual_cardinality, 1)
                                             /greatest(estimated_cardinality, 1)))) < 1 then 'Off'
                      else 'Gone south'
                    end
             end as cardinality_estimate,
             count(*) number_of_plans
      from (select s.dbid,
                   s.sql_id,
                   s.plan_hash_value,
                   round(log(10, greatest(avg(rows_processed_delta), 1))) actual_cardinality,
                   round(avg(rows_processed_delta)) as actual
            from q
                 inner join dba_hist_sqlstat s
                    on s.dbid = q.dbid
                   and s.instance_number = q.instance_number
                   and s.snap_id = q.snap_id
            group by s.dbid,
                     s.sql_id,
                     s.plan_hash_value) a
           inner join (select x.sql_id,
                              x.plan_hash_value,
                              round(log(10, greatest(sum(x.cardinality), 1))) as estimated_cardinality,
                              sum(x.cardinality) as estimate
                       from (select dbid,
                                    sql_id,
                                    plan_hash_value,
                                    cardinality,
                                    rank() over (partition by dbid,
                                                              plan_hash_value
                                                 order by id) rnk
                             from dba_hist_sql_plan
                             where cardinality is not null) x
                       where x.rnk = 1
                       group by x.sql_id,
                                x.plan_hash_value) b
              on b.sql_id = a.sql_id
             and b.plan_hash_value = a.plan_hash_value
     group by case actual_cardinality
                when estimated_cardinality then 'Spot on'
                else case 
                       when abs(round(log(10, greatest(actual_cardinality, 1)
                                               /greatest(estimated_cardinality, 1)))) < 1 then 'Off'
                       else 'Gone south'
                     end
               end)
order by case cardinality_estimate
           when 'Spot on' then 1
           when 'Off' then 2
           else 3
         end

Larry G. Elkins
elkinsl_at_verizon.net
Cell: 214.695.8605

--
http://www.freelists.org/webpage/oracle-l

Received on Fri Nov 15 2013 - 23:34:43 CET

This message: [ Message body ]
Next message: Hans Forbrich: "Re: Practice of using chopt to disable database options"
Previous message: David Fitzjarrell: "Re: Can an MV be converted into a "normal" table?"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message