Home » RDBMS Server » Performance Tuning » Query performance gain using statistics
Query performance gain using statistics [message #470225] Mon, 09 August 2010 06:16 Go to next message
jackman1
Messages: 8
Registered: August 2010
Junior Member
Hi,
Somewhere I read that we should not use hints in Oracle production environments, but we can use hints in the development environment and on achieving the desired execution plan we can adjust the 'statistics' to follow that plan without hints.

Q1. If it is true then can anybody please point out what statistics do we adjust for influencing the execution plan and how?

For example, I have the following simple query:

select e.empid, e.ename, d.dname
from emp e, dept d
where e.deptno=d.deptno;

emp.empid, emp.deptno and dep.deptno columns have indexes and the tables have the standard structure as found in the basic oracle examples.

If I look at the execution plan of the above query then I see that the driving table is emp
and the driven table is dept.
Also the type of join that is taking place is 'Nested Loop'.

Questions:
With respect to the above query,
Q 2. If I want to make dept the driving table and emp the driven table then how can I adjust the statistics to achieve that?
Q 3. If I want to use hash join instead of a nested loop join then then how can I adjust the statistics to achieve that?

I can put the ordered and the use_hash hint to effect this but again I have heard that altering statistics is a more robust way to control an execution plan as compared to hints.

Regards.
Re: Query performance gain using statistics [message #470233 is a reply to message #470225] Mon, 09 August 2010 06:32 Go to previous messageGo to next message
Roachcoach
Messages: 1576
Registered: May 2010
Location: UK
Senior Member
jackman1 wrote on Mon, 09 August 2010 12:16
Hi,
Somewhere I read that we should not use hints in Oracle production environments


As far as I know, I'd view that as a guideline more than a hard and fast rule. You shouldn't be able to do a better plan than the optimiser, but sometimes you can: Often by knowing something about the data/query which Oracle could not.

[Updated on: Mon, 09 August 2010 06:33]

Report message to a moderator

Re: Query performance gain using statistics [message #470234 is a reply to message #470225] Mon, 09 August 2010 06:34 Go to previous messageGo to next message
Mahesh Rajendran
Messages: 10707
Registered: March 2002
Location: oracleDocoVille
Senior Member
Account Moderator
>>Q1. If it is true then can anybody please point out what statistics do we adjust for influencing the execution plan and how?

Not entirely true.
There are many cases with which HINTS could be the only option.
Generic rule is to
* Analyze your data
* Collect statistics on tables/indexes.
* Collect histograms.
* Let CBO decide what is the best.
* If CBO chooses a sub-optimal plan, investigate it by enabling 10053 tracing

>>Q 2. If I want to make dept the driving table and emp the driven table then how can I adjust the statistics to achieve that?
>>Q 3. If I want to use hash join instead of a nested loop join then then how can I adjust the statistics to achieve that?
Again, you may not want to "adjust" the statistics. Just collect the "correct" stats.

You can "adjust" the statistics to get a desired plan.
I would not do it in production and will let CBO to do its job.
Use
set_table_stats or set_column_stats accordingly.
Here is an old thread that talks about this.
http://www.orafaq.com/forum/mv/msg/62740/170424/42800/#msg_170424

Be warned CBO behavior changed a lot with a later versions but the basic concepts remain the same.

[Updated on: Mon, 09 August 2010 12:16]

Report message to a moderator

Re: Query performance gain using statistics [message #470363 is a reply to message #470234] Mon, 09 August 2010 11:57 Go to previous messageGo to next message
jackman1
Messages: 8
Registered: August 2010
Junior Member
Nicely explained Mahesh. A little more clarification on the last pint "If CBO chooses a sub-optimal plan...." would be great.
Is there any definitive (or approximate) way of knowing that we have a sub-optimal plan?
Or should we always measure against a defined benchmark?
Regards.
Re: Query performance gain using statistics [message #470371 is a reply to message #470363] Mon, 09 August 2010 12:15 Go to previous messageGo to next message
Mahesh Rajendran
Messages: 10707
Registered: March 2002
Location: oracleDocoVille
Senior Member
Account Moderator
>>Is there any definitive (or approximate) way of knowing that we have a sub-optimal plan?
For example,
you know the data and its distribution, there are proper indexes and still CBO does not use an index where it
is supposed to use it.
CBO does its job based on available statistics (and histograms).
If the available information is wrong/outdated (say, the underlying data has changed),
the plan generated by CBO may not optimal. In some cases,
improper histograms could make CBO hallucinate.

10053 is a trace you can enable to see what
influenced CBO to choose a particular plan.
Have a look at this previous thread. There is a sample 10053 output posted and explained.

http://www.orafaq.com/forum/mv/msg/159883/466473/42800/#msg_466473


Re: Query performance gain using statistics [message #475091 is a reply to message #470225] Sun, 12 September 2010 19:51 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
If you want a query to use a specific plan, you are better off getting the plan generated somewhere and then saving the plan as a plan profile, then use the profile. This is an accepted way of doing it. It is even an Oracle recommended method, (see Real Application Testing).

Still, hints are easy and if they solve your problem, it is also acceptable. As much as I'd like not to see them, the fact is I have two or three queries in my systems (of tens of thousands of queries) that have hints. Let us say one in a thousand queries have an embedded hint. I accept it as a quick and easy way to get what I want. The thing to avoid is systems that do it as a regular practice.

But as others have already said, the CBO is pretty good these days.

Getting the right plan from the CBO is all about giving it information about your data and your system. If the CBO has sufficient information and that infomration is correct, then 99 times out of 100 (maybe more), the CBO will produce an excellent plan, if not the best possible plan. So the trick is to figure out how to give the CBO the information it needs, and to know if it is working for you in any specific situation. Mahesh gave a pretty good summary. I will adlib off his list and add an item or two:

1) make sure your data model is defined. That means use constraints. Create Primary Key, Unique Key, and Foreign Key constraints on all your tables according to your data model. Constraints are information. The CBO uses this information. If you do not provide constraints in your system, you are not providing the information the CBO needs.

2) use the right data type definition for columns. This means two things (number=number, date=date, string=varchar2), and NOT NULL=NOT NULL. If you are working with a date, use the date data type. If you can do math on the column use a number (or integer) data type. If a column is always supposed to have a value make sure you define it as NOT NULL. Once again these suggestions are all about giving the CBO more information about your data. If you don't follow through with some rigour, you will be depriving the CBO of information is can work with.

3) Collect statistics using DBMS_STATS. Know what this does and what options are available to you.

4) Histograms? Well, I consider this to be a pretty advanced stats method. Most people can achieve what they need without them. That means Histograms are not well understood and even less do they know how things might change using them. Read about them before you decide to use them, and try getting what you need without them first.

The above is pretty much what Mahesh said, I simply provide what I consider to be clarification.

As to how to know if you have a good plan, here is the rule I use though I am sure others have other methods.

It is pretty much all about the right rowcounts at the right steps. If the CBO has generated a good plan, then for each of the steps in the plan the estimated rows returned by the step will be "reasonable" based on what you know about the data and your query. This means you can use the estimated rows returned as a basic guide to know if you have a good plan.

If however you have a query that appears to be taking much too long, check the plan to see if there are one or more steps where the estimated rows returned looks stupid. If so, then you most likely have a "bad" plan. Of course at this point the question becomes what do you do about it. Well, your first step is to follow through with the steps Mahesh has suggested and which I have clarified and enhanced. If that don't fix it then you either look at query rewrites, or you do research on specific situations where bad estimated rows returned is a known problem and see what others do about them. One example is GLOBAL TEMPORARY TABLES.

Good luck, Kevin
Re: Query performance gain using statistics [message #475353 is a reply to message #475091] Tue, 14 September 2010 10:08 Go to previous message
jackman1
Messages: 8
Registered: August 2010
Junior Member
Mahesh and Kevin - thanks a lot to both of you for your excellent replies. Life is a little easier now.
Regards.
Previous Topic: Need to tune query
Next Topic: To find the indexes (used and unused on the table
Goto Forum:
  


Current Time: Tue Apr 30 06:06:18 CDT 2024