Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Re: artificial values vs null

Re: artificial values vs null

From: Richard Foote <richard.foote_at_bigpond.com>
Date: Fri, 24 Sep 2004 05:41:02 -0500 (EST)
Message-ID: <0a4201c4a2ec$ffb24030$0100000a@FOOTE>


Hi Niall,

I've seen on quite a number of occasions the damage that "artificial" values can do in stuffing up Oracle's assumption of range value distributions and hence Oracle's ability to select an appropriate execution plan.

Setting end dates to some distant future is full of dangers and is a classic example.

Beware indeed !!

Cheers

Richard
----- Original Message -----
From: "Niall Litchfield" <niall.litchfield_at_gmail.com> To: <nsouto_at_bizmail.com.au>
Cc: <oracle-l_at_freelists.org>
Sent: Friday, September 24, 2004 7:37 PM Subject: artificial values vs null

On Thu, 23 Sep 2004 21:21:40 +1000, Nuno Souto <nsouto_at_bizmail.com.au> wrote:
> Whatever the optimiser may do with 31 dec 2099,
> it will be a darn long shot better than a NULL value
> that can't be indexed...

you may regret making that statement.... see below.

>
> Still, the correct design is to add a "status" column
> where the "current" can be indicated. The high date
> is a poor substitute. But still better than having to do
> range or full table scans to find the current row of a
> particular category.

For this purpose I entirely agree. Though in the case of current 'logins' you can bet your bottom dollar that a crashed session will still be shown as current when it isn't. At least then you have demonstrably incorrect data rather than assigning meaning to meaningless data :)

I have created a script to demonstrate what the 'optimizer' does (10g) with the use of various strategies for assigning an artificial value for what should be a NULL. It isn't pretty. You can see the full results at my site

The script is available at
http://www.niall.litchfield.dial.pipex.com/scripts/tests/date_distribution.s ql
 and you are welcome to play around with it on test systems yourself.

What I find is that *in the absence of histograms* then using an artificial value - especially if you do the other thing app designers do and use the wrong datatype - gives you woefully wrong estimates of cardinalities. This may or may not matter too much in the case of a simple single table query, but if the resulting rowset will be used in a join and the stats for cardinality can be wrong (as they are in at least one of my cases) by an order of magnitude then what the CBO will do will most likely not be sensible at all. histograms do fix this, but at a non-zero cost.

I do of course have to admit that my range scan was chosen not just because it runs from christmas to one of my best friends birthdays - but also because it is a good way to illustrate the perils of bad datatypes.

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Sep 24 2004 - 05:35:01 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US