RE: improve cardinality estimation

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Tue, 27 Oct 2020 14:14:34 -0400
Message-ID: <227e01d6ac8d$0686ebc0$1394c340$_at_rsiz.com>



Well, first try everything that JL wrote. I WILL NOT create an ordered list of who knows best how to beat the CBO into sanity, but he would be on anyone’s top 10, INCLUDING the folks writing it.

Also, it did seem strange that someone was asking so useful a question about such questionable code. That you are living with already-written code is indeed a sad tale, and true. (Lemon Tree)

Oh, and sorry I missed the column name change to trunc(nvl(col_to_date,sysdate+1)), which you clearly figured out.

Since this table is SO tiny, *maybe* we can ignore maintenance costs:

I wonder what would happen if you made a function-based index on:

trunc(col_from_date), trunc(nvl(col_to_date,sysdate+1))  

and dropped/recreated it just before the query, at least once each day. (You probably have to instantiate sysdate+1 as a constant to make that an index, unless I’m misremembering that dynamic values can’t be part of index expressions.) I think the problem right now is that the optimizer has no idea, from static stats, that you are going to include all those nulls on col_to_date.

Sigh. If you have to change the column expression to create the index, it probably won’t match up when the query is checked against the function-based index expression. I have literally forgotten whether there is a reasonable way to finesse this without changing the code. Good luck to JL et al.; I’ll be reading.
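For illustration only, a sketch of what such a daily rebuild might look like; my_table and ix_fbi_dates are invented names, and tomorrow’s date has to be baked in as a literal because sysdate is not deterministic and cannot appear in an index expression:

-- Hypothetical sketch: my_table and ix_fbi_dates are placeholders.
-- Tomorrow's date is instantiated as a DATE literal, since SYSDATE
-- cannot be part of an index expression.
DECLARE
  l_tomorrow VARCHAR2(10) := TO_CHAR(TRUNC(SYSDATE) + 1, 'YYYY-MM-DD');
BEGIN
  BEGIN
    EXECUTE IMMEDIATE 'DROP INDEX ix_fbi_dates';
  EXCEPTION
    WHEN OTHERS THEN NULL;  -- ignore "index does not exist" on the first run
  END;
  EXECUTE IMMEDIATE
    'CREATE INDEX ix_fbi_dates ON my_table ('
    || 'TRUNC(col_from_date), '
    || 'TRUNC(NVL(col_to_date, DATE ''' || l_tomorrow || ''')))';
END;
/

The catch, as noted above, is that the query’s sysdate+1 expression would not textually match the literal in the index expression, so the index might never be picked up without changing the code.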

I don’t suppose you could actually update col_to_date to one day in the future (from null) every day and re-gather the stats? You would have to keep track of the ones that genuinely were one day in the future when you did it yesterday, so you would not update them today. Presumably this is all some current-versus-future or expired list. A daily update should give you good query stats, but you would have to check whether it breaks anything in your suite. I’d go 50-50 on that one, but you don’t want my percentage guesses.
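As a sketch only (my_table is a placeholder and was_null_flag is an invented tracking column), the daily job might look like:

-- Sketch, not tested: the invented was_null_flag marks rows whose
-- col_to_date was artificially populated, so they can be rolled
-- forward each day and told apart from genuine end dates.
UPDATE my_table
   SET col_to_date   = TRUNC(SYSDATE) + 1,
       was_null_flag = 'Y'
 WHERE col_to_date IS NULL
    OR was_null_flag = 'Y';

COMMIT;

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'MY_TABLE');
END;
/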

Then, of course, it would depend on how often you had to do that. Possibly it works out to doing it once a day just after midnight.

Good luck. I spent about 80 percent of my career getting crap code in packaged products to run decently, 80 percent of my career re-writing crap code that customers did control to run better, and the rest advising folks not to write crappy code in the first place and running my business. I’m not sure what the total percentage comes out to, but it was enough to occasionally annoy my wife.

mwf    

From: Laurentiu Oprea [mailto:laurentiu.oprea06_at_gmail.com]
Sent: Tuesday, October 27, 2020 1:28 PM
To: Mark W. Farnham
Cc: ORACLE-L
Subject: Re: improve cardinality estimation  

Thanks for the answer.  

Long story short, at this point I need to live with this crappy code.

I found that this low cardinality estimate leads the optimizer to choose, for one particular query, a nested loops join (duration: 1 hour) instead of a hash join (duration: 1 minute).

Rewriting the where clause like:  

where (sysdate between trunc(col_from_date) and trunc(col_to_date)) or col_to_date is null

gives a perfect cardinality estimate.

The question would be: is there a smarter solution to give Oracle enough statistical information that it can figure out a good cardinality estimate with the current where clause?

On Tue, 27 Oct 2020 at 18:51, Mark W. Farnham <mwf_at_rsiz.com> wrote:

do you mean like:  

where

col_from_date is null

or ( trunc(col_from_date) between sysdate and sysdate+1)  

it *should* be able to get the number of nulls, and a decent number for the range estimate, from the stats.
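To check what the optimizer actually has to work with, something along these lines works ('MY_TABLE' is a placeholder for the real table name):

-- Sketch: NUM_NULLS is exactly the figure needed for the null branch.
SELECT column_name, num_nulls, num_distinct, histogram
  FROM user_tab_col_statistics
 WHERE table_name = 'MY_TABLE'
   AND column_name IN ('COL_FROM_DATE', 'COL_TO_DATE');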

I would also look at exactly what you want the query to return, to see if you can lose the trunc on col_from_date with an appropriate formulation of the sysdate boundaries…

For a given call, sysdate is a constant, and the optimizer tends to do better with functions and calculations on the constant rather than on the value in the index or table.
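As a sketch of that idea (my_table is a placeholder, and this covers only the col_from_date half of the predicate): the two queries below are logically equivalent, but the second leaves the column bare, so its statistics and any index on it apply directly.

-- function applied to the column (harder for the optimizer):
SELECT COUNT(*) FROM my_table WHERE TRUNC(col_from_date) <= SYSDATE;

-- function moved onto the constant instead:
SELECT COUNT(*) FROM my_table WHERE col_from_date < TRUNC(SYSDATE) + 1;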

In general, I find it less confusing for human readers to write:

<column_value> is between x and y  

rather than x is between <column_value> and some_function_on(<column_value>) that might return y.  

even  

trunc(nvl(col_to_date,sysdate+1)) between sysdate and sysdate+1

is more understandable, although it somewhat obscures that you want to include all the null values. Essentially that is the difference between writing “I want all the nulls and all the actual dates from now through the next day” in place of “I’m going to pretend all the nulls are now plus a day, and I want all the values from now through now plus a day.”

good luck.  

I hope I read all that right and wrote it all right. I need more coffee, and I’m old, rusty, and cranky.

mwf

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Laurentiu Oprea
Sent: Tuesday, October 27, 2020 11:31 AM
To: ORACLE-L (oracle-l_at_freelists.org)
Subject: improve cardinality estimation

Hello,  

I found a query with a very bad execution plan due to a low cardinality estimate for the where clause below:

where

sysdate between trunc(col_from_date) and trunc(nvl(col_to_date,sysdate+1))  

Is there any way I can improve the estimate (extended stats, etc.)?
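Something along these lines is the kind of thing I had in mind (sketch only; 'MY_TABLE' is a placeholder, and since an expression containing sysdate is not deterministic it cannot be used as an extension, this only covers the col_from_date side):

-- Sketch of the extended-stats route on the deterministic part
-- of the predicate; whether it helps with the nvl(col_to_date,...)
-- part is the open question.
SELECT DBMS_STATS.CREATE_EXTENDED_STATS(
         ownname   => USER,
         tabname   => 'MY_TABLE',
         extension => '(TRUNC(COL_FROM_DATE))') AS ext_name
  FROM dual;

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'MY_TABLE');
END;
/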

Thanks for your help.

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Oct 27 2020 - 19:14:34 CET
