RE: [SPAM] RE: stats not playing nice

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Mon, 27 Feb 2017 20:11:48 -0500
Message-ID: <00a701d2915f$a3b6c2f0$eb2448d0$_at_rsiz.com>



You’re free to not like it.  

The difference between logical meaning and physical storage is often a stretch.  

I probably should be clear that creation of such “sparse” indexes is not a new idea. In the Oracle context it appeared at least in the V6 tuning guide and probably was in V5 as well.  

I believe my papers on the details for FIFO, priority and a host of other issues are floating around the ‘net. If they are hard to find I can publish them on Oaktable.  

JL’s suggestion of a function-based index is a fine work-around for those that are queasy about marking a job control column “NULL” defined by themselves as “Nothing to see here.”  

And of course nothing stops you from recording the “doneness” (most often useful as a time complete value) at the point the item is “handled” in a different column.

The point is finding rows in a transaction set that need something done to them quickly and efficiently.  

The RDBMS, of course, cannot project a value onto NULL. You, however, can.  

mwf  

From: jack.van.zanen_at_gmail.com [mailto:jack.van.zanen_at_gmail.com] On Behalf Of Jack van Zanen Sent: Monday, February 27, 2017 4:27 PM
To: Mark W. Farnham
Cc: Jonathan Lewis; oracle-l_at_freelists.org Subject: Re: [SPAM] RE: stats not playing nice  

Hi Mark,    

Although creative, I am not sure I like the idea of nulling a field when status is actually complete.

I have to give that one some bouncing of my colleagues  

Thanks

Jack van Zanen  

  <https://docs.google.com/uc?id=0BwovDucFT1fXaEREVHNWRWZyNjg&export=download>



This e-mail and any attachments may contain confidential material for the sole use of the intended recipient. If you are not the intended recipient, please be aware that any disclosure, copying, distribution or use of this e-mail or any attachment is prohibited. If you have received this e-mail in error, please contact the sender and delete all copies. Thank you for your cooperation  

On Tue, Feb 28, 2017 at 3:06 AM, Mark W. Farnham <mwf_at_rsiz.com> wrote:

And: If you can arrange for "complete" to be stored in the database as NULL, then you win. Just because Oracle cannot assign a value to NULL in general does not mean you cannot. Then of course you can easily find each distinct value other than complete very quickly via the index and the index is tiny (possibly after one rebuild, since it currently has a lot of space taken up by "complete.")

mwf

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Jonathan Lewis Sent: Monday, February 27, 2017 9:44 AM
To: oracle-l_at_freelists.org
Subject: Re: stats not playing nice

A couple of points:

I tend to use simple numbers that follow the pattern of the actual distribution rather than an exact copy of the counts at one moment in time, so for your data I might use (100, 200, 1, 1200,1800,222000) - which totals 225301.

I don't have to worry too much about the actual values compared to the number of rows in the table because the optimizer will scale up the (apparent) sample size in the histogram to the number of rows it thinks are in the table. If you go for very large numbers in the histogram and the num_rows in the table is smaller than the histogram suggests you can hit a couple of odd side effects. The simpler numbers also makes it easier (I think) to visualise the data.

The density used for frequency histograms is generally 1/(2 * row_total) - so with my figures I would put 1/(2 * 225301) in your code.

Your avgclen should be 3, by the way, not 5.

An associated thought if you can change the code - I assume you want to use an indexed access path for at least some of the status values. Your data is a prime candidate for a function-based index of the form: ( case when status = 5 then null else status) Alternatively an indexed virtual column (particularly nice for 12c where it can be invisible) that exposes the same formula, e.g. alter table xxxx add (active_status generated always as (case when status = 5 then null else status) virtual);

Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com/all-postings

  • Original Message ----- From: "Jack van Zanen" <jack_at_vanzanen.com> To: <oracle-l_at_freelists.org> Sent: Monday, February 27, 2017 2:08 AM Subject: stats not playing nice

| Hi All,
|
|
| Oracle 11G R2 AIX
|
| We have a fairly large table (24G) that has a status field.
| there are 6 possible statuses and the Majority are status 5(completed)
for
| which there is a need to retain for 13 months.
|
| Now the stats on that table take fairly long to compute so estimate is
| used, but as the data is very skewed sometimes we miss out on some values
| altogether.
|
| Now, I think the best solution would have been at creation time to split
| this up and move the completed records to a separate table partitioned by
| month and keep the batch processing to use the lean and mean table.
|
| Failing that:
|
| I can:
|
| a) Run a full compute statistic at an opportune time including histograms
| (takes a long time and I am not sure we get a window for this. and than
| lock the stats untill we do it again
|
| b) keep using estimate statistics and manually insert the values for this
| one column.
|
| I have found a blog post on Jonathan Lewis website where he has an
example
| that is fairly easy to follow:
|
| This is the Code from Jonathan Lewis blog post from almost a decade ago.
|
| I have one question regarding the density. What should I be putting here
|
| declare
|
| m_distcnt number;
| m_density number;
| m_nullcnt number;
| srec dbms_stats.statrec;
| m_avgclen number;
| n_array dbms_stats.numarray;
|
| begin
|
|
| m_distcnt := 6;
| m_density := ??????;
| m_nullcnt := 0;
| m_avgclen := 5;
|
| n_array := dbms_stats.numarray(0,1,2,3,4,5);
| srec.bkvals := dbms_stats.numarray( 74086, 152500, 704, 858271,
| 1257221,156704455); ---Actual distribuution
| srec.epc := 6;
|
| dbms_stats.prepare_column_values(srec, n_array);
|
| dbms_stats.set_column_stats(
| ownname => 'XXXXXX',
| tabname => XXXXEVENTXXXXX',
| colname => 'STATUS',
| distcnt => m_distcnt,
| density => m_density,
| nullcnt => m_nullcnt,
| srec => srec,
| avgclen => m_avgclen
| );
|
| end;
| /
|
|
|
|
|
| Jack van Zanen
|
|
| -------------------------
| This e-mail and any attachments may contain confidential material for the
| sole use of the intended recipient. If you are not the intended
recipient,
| please be aware that any disclosure, copying, distribution or use of this
| e-mail or any attachment is prohibited. If you have received this e-mail
in
| error, please contact the sender and delete all copies.
| Thank you for your cooperation
|

--
http://www.freelists.org/webpage/oracle-l



--
http://www.freelists.org/webpage/oracle-l



 



--
http://www.freelists.org/webpage/oracle-l
Received on Tue Feb 28 2017 - 02:11:48 CET

Original text of this message