Re: [SPAM] RE: stats not playing nice

From: Jack van Zanen <jack_at_vanzanen.com>
Date: Tue, 28 Feb 2017 08:27:03 +1100
Message-ID: <CAFeFPA-OFJw+6L8w6odoAJTcrpWWXWML00_Pq_vC1J5=0S+eNw_at_mail.gmail.com>



Hi Mark,

Although creative, I am not sure I like the idea of nulling a field when status is actually complete.
I have to give that one some bouncing of my colleagues

Thanks

Jack van Zanen



This e-mail and any attachments may contain confidential material for the sole use of the intended recipient. If you are not the intended recipient, please be aware that any disclosure, copying, distribution or use of this e-mail or any attachment is prohibited. If you have received this e-mail in error, please contact the sender and delete all copies. Thank you for your cooperation

On Tue, Feb 28, 2017 at 3:06 AM, Mark W. Farnham <mwf_at_rsiz.com> wrote:

> And: If you can arrange for "complete" to be stored in the database as
> NULL, then you win. Just because Oracle cannot assign a value to NULL in
> general does not mean you cannot. Then of course you can easily find each
> distinct value other than complete very quickly via the index and the index
> is tiny (possibly after one rebuild, since it currently has a lot of space
> taken up by "complete.")
>
> mwf
>
> -----Original Message-----
> From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org]
> On Behalf Of Jonathan Lewis
> Sent: Monday, February 27, 2017 9:44 AM
> To: oracle-l_at_freelists.org
> Subject: Re: stats not playing nice
>
>
>
> A couple of points:
>
> I tend to use simple numbers that follow the pattern of the actual
> distribution rather than an exact copy of the counts at one moment in time,
> so for your data I might use (100, 200, 1, 1200,1800,222000) - which
> totals 225301.
>
> I don't have to worry too much about the actual values compared to the
> number of rows in the table because the optimizer will scale up the
> (apparent) sample size in the histogram to the number of rows it thinks
> are in the table.
> If you go for very large numbers in the histogram and the num_rows in the
> table is smaller than the histogram suggests you can hit a couple of odd
> side effects. The simpler numbers also makes it easier (I think) to
> visualise the data.
>
> The density used for frequency histograms is generally 1/(2 * row_total) -
> so with my figures I would put 1/(2 * 225301) in your code.
>
> Your avgclen should be 3, by the way, not 5.
>
> An associated thought if you can change the code - I assume you want to
> use an indexed access path for at least some of the status values. Your
> data is a prime candidate for a function-based index of the form: ( case
> when status = 5 then null else status) Alternatively an indexed virtual
> column (particularly nice for 12c where it can be invisible) that exposes
> the same formula, e.g.
> alter table xxxx add (active_status generated always as (case when status =
> 5 then null else status) virtual);
>
>
>
>
>
> Regards
>
> Jonathan Lewis
> http://jonathanlewis.wordpress.com/all-postings
>
> ----- Original Message -----
> From: "Jack van Zanen" <jack_at_vanzanen.com>
> To: <oracle-l_at_freelists.org>
> Sent: Monday, February 27, 2017 2:08 AM
> Subject: stats not playing nice
>
>
> | Hi All,
> |
> |
> | Oracle 11G R2 AIX
> |
> | We have a fairly large table (24G) that has a status field.
> | there are 6 possible statuses and the Majority are status 5(completed)
> for
> | which there is a need to retain for 13 months.
> |
> | Now the stats on that table take fairly long to compute so estimate is
> | used, but as the data is very skewed sometimes we miss out on some values
> | altogether.
> |
> | Now, I think the best solution would have been at creation time to split
> | this up and move the completed records to a separate table partitioned by
> | month and keep the batch processing to use the lean and mean table.
> |
> | Failing that:
> |
> | I can:
> |
> | a) Run a full compute statistic at an opportune time including histograms
> | (takes a long time and I am not sure we get a window for this. and than
> | lock the stats untill we do it again
> |
> | b) keep using estimate statistics and manually insert the values for this
> | one column.
> |
> | I have found a blog post on Jonathan Lewis website where he has an
> example
> | that is fairly easy to follow:
> |
> | This is the Code from Jonathan Lewis blog post from almost a decade ago.
> |
> | I have one question regarding the density. What should I be putting here
> |
> | declare
> |
> | m_distcnt number;
> | m_density number;
> | m_nullcnt number;
> | srec dbms_stats.statrec;
> | m_avgclen number;
> | n_array dbms_stats.numarray;
> |
> | begin
> |
> |
> | m_distcnt := 6;
> | m_density := ??????;
> | m_nullcnt := 0;
> | m_avgclen := 5;
> |
> | n_array := dbms_stats.numarray(0,1,2,3,4,5);
> | srec.bkvals := dbms_stats.numarray( 74086, 152500, 704, 858271,
> | 1257221,156704455); ---Actual distribuution
> | srec.epc := 6;
> |
> | dbms_stats.prepare_column_values(srec, n_array);
> |
> | dbms_stats.set_column_stats(
> | ownname => 'XXXXXX',
> | tabname => XXXXEVENTXXXXX',
> | colname => 'STATUS',
> | distcnt => m_distcnt,
> | density => m_density,
> | nullcnt => m_nullcnt,
> | srec => srec,
> | avgclen => m_avgclen
> | );
> |
> | end;
> | /
> |
> |
> |
> |
> |
> | Jack van Zanen
> |
> |
> | -------------------------
> | This e-mail and any attachments may contain confidential material for the
> | sole use of the intended recipient. If you are not the intended
> recipient,
> | please be aware that any disclosure, copying, distribution or use of this
> | e-mail or any attachment is prohibited. If you have received this e-mail
> in
> | error, please contact the sender and delete all copies.
> | Thank you for your cooperation
> |
>
> --
> http://www.freelists.org/webpage/oracle-l
>
>
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Feb 27 2017 - 22:27:03 CET

Original text of this message