Re: Help with sql query speed. Explain plan included

From: <fitzjarrell_at_cox.net>
Date: 18 Mar 2007 20:11:57 -0700
Message-ID: <1174273917.561699.204480@y80g2000hsf.googlegroups.com>

On Mar 18, 8:36 pm, "Charles Hooper" <hooperc2..._at_yahoo.com> wrote:
> On Mar 18, 8:23 pm, "fitzjarr..._at_cox.net" <fitzjarr..._at_cox.net> wrote:
>
>
>
>
>
> > On Mar 18, 1:56 pm, "Charles Hooper" <hooperc2..._at_yahoo.com> wrote:
> > > You might be thinking of Gaja Krishna Vaidyanatha (author of "Oracle
> > > Performance Tuning 101"), rather than Cary Millsap. Compulsive Tuning
> > > Disorder is a serious problem. However, you have not yet reached that
> > > stage with this problem.
>
> > > I suspect that if you can eliminate the full tablescan that is
> > > requiring the processing of 278MB (system has become IO bound), you
> > > will likely find that the query executes in less than one second. I
> > > suspect that with a SORT_AREA_SIZE of 31MB and a HASH_AREA_SIZE of
> > > 293MB, it will not take but a couple sessions connecting to the
> > > database and executing SQL statements before the server experiences
> > > severe memory problems. My second post (and possibly the first as
> > > well) suggested two indexes: one index on the RESIDENTID column and a
> > > second, composite index on the QUESTIONNUMBER and ANSWER columns.
> > > Since this is a test environment, try creating indexes on those
> > > columns and then use DBMS_STATS.ANALYZE_TABLE_STATS with the CASCADE
> > > option set to TRUE to gather statistics.
>
> > > TKPROF likely does not know what to do with a 10104 trace.
>
> > > Charles Hooper
> > > PC Support Specialist
> > > K&M Machine-Fabricating, Inc.
>
> > I can understand creating a composite index on the QUESTIONNUMBER and
> > ANSWER columns but I fail to see the benefit of having an index on the
> > RESIDENTID column as the pertinent portion of the WHERE clause reduces
> > to, essentially:
>
> > where t1.residentid!=1486674
>
> > as proven by the posted query plan. This returns every row in T1
> > except 1 and Oracle should be 'smart' enough to not use the index for
> > that type of condition. Perhaps rewriting the WHERE clause might also
> > help:
>
> > where (t0.questionnumber=tx.questionnumber
> > and t0.answer=tx.answer)
> > and t0.residentid=1486674
> > and t0.residentid!=tx.residentid
>
> > although with 10g the CBO may already have done that.
>
> > The composite index may help the situation by only returning those
> > RESIDENTID values where the answers match those of the selected
> > record, thus reducing the size of the filtering operation. And an
> > index won't speed filtering.
>
> > David Fitzjarrell
>
> The OP is accessing the same table twice in the query with two
> aliases: "test1 t0,test1 tx". The t0 alias is being accessed by the
> RESIDENTID column: "t0.residentid=1486674". To access those rows, and
> index on just the RESIDENTID should be helpful. I think that we both
> agree that such an index will not help this portion of the WHERE
> clause "t1.residentid!=1486674" The one index on the table (IDXE) is
> a composite index covering all three columns. What I was suggesting
> is to test performance with this one three column index split into two
> indexes. The two column composite index will hopefully be able to
> satisfy the "t0.questionnumber=tx.questionnumber and
> t0.answer=tx.answer" portion of the query without requiring a full
> tablescan of table TEST1.
>
> Thinking about it a bit more, I think that I understand what you are
> saying - the number of blocks in the two column index may be the same
> or greater than that of the three column table. His query would have
> to check ((3,000,000 - 80) / (3,000,000) * 100)% of the rows in the
> table, and that looks to be quite a bit greater than 15% to 25%. It
> could be interesting to see if Oracle could optimize this
> situation... if each question has an even distribution between the
> 100 possible answers.
>
> Charles Hooper
> PC Support Specialist
> K&M Machine-Fabricating, Inc.- Hide quoted text -
>
> - Show quoted text -

The distribution is not even, nor is it guaranteed to be complete within the range from 1 to 100 for any given question, and this was stated at the outset by the OP:

4. There are 80 questions on each census form. 5. Each question on the census form has an answer that is between 1 and 100.
6. The country residents do not have to answer every question. 7. If they do not answer a question, a row in the table is not created.

I do think the composite index would provide some needed help to this query, and the index on QUESTIONNUMBER and ANSWER should be implemented. I also agree that the lone index on this table should be broken into two separate pieces. My thoughts are that the singlecolumn index on RESIDENTID, although present, would likely not be used to eliminate the table scan, however I do believe the two-column index would reduce the number of returned rows and thus improve the overall speed of the query.

David Fitzjarrell Received on Sun Mar 18 2007 - 22:11:57 CDT