Re: SQL (optimize) question

From: Richard Foote <richard.foote_at_bigpond.com>
Date: Wed, 3 Jul 2002 16:11:26 +1000
Message-ID: <MtwU8.26399$Hj3.81935@newsfeeds.bigpond.com>

Hi Stephan,

Whether or not an index on the history_y_n column or incorporating this column in another index is of benefit really does depend. Daniel is correct in saying ensure the statistics are accurate, try it with, try it without an index and see the results in the explain. The answer will be there to see. Sivaraj is also correct in saying that histograms may also be of benefit to the optimizer by providing it with some essential additional information.

Note however that a FTS could still be the preferred access path, even with an index, even with only 2% of data. That's why it depends. It's the percentage of accessed *blocks* (and the distribution of data) not rows that is the determining factor here. The FTS with it's usage of multiblock reads and it's ability to be performed in parallel makes it such an attractive access path. However, as it's the "newer" rows you're after and as they are likely to be clustered together, the index clustering_factor is likely to be appealing for an index access. It all depends.

One last point. If a FTS is the preferred access path, that's fine but it might not be fine in 3 months time. As the number of rows increases but hence the percentage of history_y_n = N columns decreases (assuming it's always around the 10,000 mark) then the likelihood of the index being appealing to Oracle increases. That's why regularish (is that a word ?) updating of the statistics is important, in particular in scenarios such as this when the proportional distribution of values is going to change over time.

In summary, test it and see.

Good Luck

Richard

"Sivaraj Shanmugham" <Sivaraj.Shanmugham_at_oracle.com> wrote in message news:3D20FE67.34AC3D7F_at_oracle.com...
> Did you explore the option of using histograms?
> The data distribution you are talking about can use histograms to
determine if
> an index is needed.
> Histograms help you identify the density of a particular value within a
column..
>
> In your case, only 2% of records have HISTORY_Y_N = 'N' and the rest have
'Y'
> If you use histograms, optimizer will determine if an index is required
for the
> query. If the
> query has where clause: " where history_y_n = 'N' ", index will be used.
If
> "history_y_n = 'Y',
> index may not be used .. You will be able to see that from the explain
plan you
> generate.
> You can create an index on the column and use explain plan to see if
further
> tuning is required.
>
> Thanks,
> Sivaraj.
>
>
>
>
> Stephan wrote:
>
> > Hi there,
> >
> > In a table store data coming in from measurements outside.
> > In another table i store commands send to another application.
> > Many of the data is 'old' after a month, but for reporting activities
may
> > not be deleted.
> >
> > Therefore I introduced a field HISTORY_Y_N in both tables.
> > For new records the field HISTORY_Y_N is set to 'N' and after some time
the
> > HISTORY_Y_N field is set to 'Y'.
> >
> > Now I wonder for performance reasons:
> > Imagine you have to read some records from the table.
> > You only want to use records where where HISTORY_Y_N = N.
> > Do I have to use hints?
> > Do I have to create an index on the HISTORY_Y_N field?
> >
> > Example (say the data_table has 500.000 records, 10.000 have
HISTORY_Y_N=N
> > and all other reocords haev HISTORY_Y_N=Y)
> > SELECT incoming_data FROM data_table WHERE status=' I' and
HISTORY_Y_N=N
> > Does Oracle optimize for me (that is only look in 10.000 records for
records
> > having status=' I')?
> >
> > Thanks
> > Stephan
>
Received on Wed Jul 03 2002 - 01:11:26 CDT