RE: Histogram Helper -- Follow up

From: Deshpande, Kirti <kirti.deshpande_at_verizon.com>
Date: Mon, 09 Apr 2001 08:42:38 -0700
Message-ID: <F001.002E53CB.20010409082525@fatcity.com>

Hi Steve,
Thanks for posting this. It is a great help. Now, you are the 'Histogram' expert on the list and I have your e-mail address in my personal address book ;)
I have not yet run Steve A's script on my databases. But it is definitely in the plans.
Thanks again for all the testing and for sharing your results.

Kirti Deshpande Verizon Information Services http://www.superpages.com

> -----Original Message-----
> From: Steve Orr [SMTP:sorr_at_arzoo.com]
> Sent: Monday, April 09, 2001 10:46 AM
> To: Multiple recipients of list ORACLE-L
> Subject: RE: Histogram Helper -- Follow up
>
> THANKS for all your feedback. To sum up the thread and my conclusions...
>
> >I had fair amount of success using Histograms for all indexed columns. I
> >have been using it since 7.3.4.x version. However, histograms do not
> offer
> >any help if the SQL uses bind variables. Most of our application code
> does
> >use bind variables and hence does not benefit from this really neat
> feature.
> >- Kirti Deshpande
>
> Thanks Kirti and good point however this is what I found in the Oracle
> Docs:
> "Histograms are NOT useful for columns with the following characteristics:
> ALL, yes ALL [emphasis mine] predicates on the column use bind
> variables..."
> The query in question only needed one bind variable but the same column
> appeared elsewhere in the WHERE clause so the histograms helped.
>
> >The documentation has under-sold this feature. Even for columns that do
> not
> >look very skewed at all, histograms helps to turn ugly explain-plans to
> very
> >fast ones. Somewhere inside the optimizer, histograms help it decide on
> >sensible join-orders amongst other things. You didn't say, but I suspect
> >your dramatic improvement is in the different join order.
>
> >If you can, compute for table for ALL columns for all indexes (3 alls and
> >leave out the indexed keyword). This will do all columns, but may not be
> >feasible because it could blow out sys.histgrm$ depending on how many
> >tables/columns/partitions you have. Do not specify the buckets -- let the
> >analyze choose. It will vary the number depending on what it finds.
> >Author: Binley Lim INET: Binley.Lim_at_ird.govt.nz
>
> Here's what I did:
> execute DBMS_STATS.CREATE_STAT_TABLE('RZUPROD','RZUSTATS','P2PTBL128K');
> execute DBMS_STATS.GATHER_SCHEMA_STATS( -
> ownname => 'RZUPROD', -
> method_opt => 'FOR ALL INDEXED COLUMNS SIZE 100', -
> cascade => TRUE, -
> options => 'GATHER', -
> stattab => 'RZUSTATS', -
> statid => '1' );
>
> I figured I could judge the impact on the data dictionary by storing the
> statistical results in my own table first. When I reviewed the table I
> could
> see the stats but I had a heck of a time trying to figure out where the
> histogram data was. (It's a funky table.) Eventually it dawned on me...
> this
> nice procedure from Oracle does store the stats in your own table... WITH
> THE EXCEPTION OF THE HISTOGRAM DATA! Thanks a lot Oracle. The purpose of
> this procedure is to get the results from an analyze without affecting the
> dictionary. Nothing in the docs about histograms being the exception.
> Luckily the schema didn't have a ton of tables and indexes so the impact
> was
> acceptable. After tracing through catalog.sql and sql.bsq I concluded the
> data was stored in the SYS.HIST_HEAD$ and HISTGRM$ tables with HISTGRM$ in
> the C_OBJ#_INTCOL# cluster. Total storage came to just over 2MB so no
> prob.
> A word to the wise, if you're going to do histograms on all columns or all
> indexed columns, first figure out how many columns that is then guess how
> many rows will be generated and calculate the storage requirements based
> on
> the column definitions you'll find in sql.bsq. The best you can do is
> guestimate because you don't know how many buckets will be created for a
> given column. You might want to limit the buckets to 10 on the first pass.
> Or better yet, use Steve Adams' fantastic script! Here's Steve's comments:
>
> >Good idea, but unfortunately the table statistics and the basic column
> >statistics (called single-bucket histograms) that are available if you
> only
> >analyze the table are not sufficient to determine whether the
> distribution
> >of values for any column is uniform or skewed. That means that we cannot
> do
> >this with a data dictionary query; we will have to scan the table.
> >Based on your suggestion I have written "consider_histogram.sql" at
> >http://www.ixora.com.au/scripts/query_opt.htm#consider_histogram...
>
> THANKS! Great script. That's exactly what I had in mind and I was prepared
> to take on the task but I went home for the evening and in your email this
> morning I found that you had already done it all! And better than I could
> have done as I was entertaining PL/SQL. Great technique with inline views
> and CASE expressions! In your script comments you wrote:
> -- Author: Steve Adams -- based a suggestion from Steve Orr
> Well at least I got to do something :-) Thanks again Bro, er, Mate. :-)
>
> One final (edited) gem from Cherie...
> >I too experienced dramatic improvement when using a couple of histograms
> on
> >very heavily skewed columns. The main thing with histograms is that you
> need
> >to keep them updated. I experimented with creating histograms on all
> indexed
> >columns. Didn't hurt anything but didn't help anything unless the data
> was
> >skewed. I've posted several times in the past about histograms and
> recieved
> >very few replies. That leads me to believe that not many people are using
> or
> >are interested in histograms. I don't see them heavily emphasized at
> >conferences or in books.
>
> Well said Cherie and I couldn't agree more. So when you present this topic
> at the next Oracle conference I'll be there. :-)
>
> The SQL in question (not mine!) is executed several times a minute via
> DBMB_JOBS and several JAVA threads. Here are my before and after stats:
> Rows
> scanned per second went from 300,000 to <500; consistent reads per second
> went from 75,000 to <500; CPU usage went from 85-100% to 5%; buffer cache
> hit ratio went from 95% to 99%, and the physical I/O rate dropped
> significantly.
>
> Histogram Happy,
> Steve Orr
>
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.com
> --
> Author: Steve Orr
> INET: sorr_at_arzoo.com
>
> Fat City Network Services -- (858) 538-5051 FAX: (858) 538-5051
> San Diego, California -- Public Internet access / Mailing Lists
> --------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from). You may
> also send the HELP command for other information (like subscribing).

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Deshpande, Kirti
  INET: kirti.deshpande_at_verizon.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

Received on Mon Apr 09 2001 - 10:42:38 CDT