Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Re: sql query optimization

Re: sql query optimization

From: Guang Mei <gmei_at_incyte.com>
Date: Tue, 10 Jun 2003 19:03:08 -0700
Message-ID: <F001.005AEC26.20030610183423@fatcity.com>


Hi:

Thanks for your help and suggestions, Mark.

I have done some of the things you suggested already. Please see my text below.

On Tue, 10 Jun 2003, Mark Richard wrote:

>
> This is an interesting (and relatively complex) query with what I think are
> several opportunities to tune it. I'd probably spend some time looking at
> the following to see if they might help you out:
>
> 1) Look at the sub-select with the connect by clause... Try executing
> that query on it's own and get an idea of it's execution time and the
> number of rows returned for different bind variables. Depending on the
> number of distinct values of arctype there may be some scope to optimise
> this component. Possibly create a table containing on the arctype 2999999
> and 3000000 records and then remove this clause from the query - this could
> avoid accessing the table at all. I have no idea if creating such a table
> is practical for your scenario though.

I just looked:

[EMAIL PROTECTED]> select count(*) from arc where arctype in (2999999,3000000);

  COUNT(*)


     56932

This is about 27% of the total rows, so I will test to move them into a new table tomorrow and this should help. I did test each part separatley and timed them and I found that the sub-query is probably the bottle-neck because "start ... connect by ..." requires walk the whole index to get all possible nodes (expensive). I can create this new table.

> 2) Consider a concatenated index (perhaps termid, parenttermid or
> parenttermid,termid - too early for my brain to remember without trying)
>

I don't know why concatenated index would help here, for which part in where clause it would?

> 3) Are the distinct and order by clausing really needed. Often a distinct
> is included to hide a fault in the query (like a missing join or criteria)
> - distinct can be very expensive at times but since your query runs fairly
> fast you probably aren't removing many rows. How many rows does the query
> return with versus without the distinct clause?
>

distinct is needed because otherwise it would return duplicates. It's more like 5 rows with distinct and 12 without (typically).

> 4) In the order by clause is "mt.blast.pvaltonumber(pval)" This looks
> like a function call - if you have a way to avoid this function call you
> may see a performance increase. You could test this by creating a table
> which stores the calculated result already and modify the query (remember
> to index and analyze the same as the original table). Does this help? Is
> it practical to store the result? Again though, the benefit will be
> determined by the number of rows being ordered and the amount of query time
> spent doing this - for large data sets a function call is murder though.
>

Yes, it is a function call (from a package), it basically take care of the situation that "pval" (for example) can be "2e-56" or "e-37", I need a way of order them. I tried with "decode" but found it did not improve the performance. I even tried to remove "order by" and the performance is basically the same. This is probably because the number of rows returned is small.

> 5) Finally, I just realised at the last minute that DUALBLASTRESULTS
> appears to be a view. Try bypassing the view and going straight to the
> base tables with the most restrictive criteria you have. Sometimes Oracle
> doesn't handle views really well within queries. I've seen improvements
> where the entire logic of the view was moved within the query - it
> shouldn't have changed anything from a theoretical point of view but it
> did.
>

I did try to replace the view with the base tables and the performance are the same. I have a bounch of other queries using this view and I always get the same performance when
I replace it with base tables. So I think there probably isn't much I can do here.

Again, thanks for your help.

Guang

> Hopefully this gives you some options to look at.
>
> Regards,
> Mark.
>
>
>
>
> "gmei"
> <[EMAIL PROTECTED]> To: Multiple recipients of
> list ORACLE-L <[EMAIL PROTECTED]>
> Sent by: cc:
> [EMAIL PROTECTED] Subject: sql query optimization
> .com
>
>
> 11/06/2003 07:59
> Please respond to
> ORACLE-L
>
>
>
>
>
>
> Hi:
>
> I have been trying for two days to see if I could optimize this query
> without much success. One of the programs here calls this query many many
> times and I hope I could make it run faster. It typically take about 1 sec
> to get the result. I have tried using "exists" to replace "in" and the
> result is not good. All the columns involved in the "where" clause have
> been
> indexed. b1 and b2 are bind variables that are passed in.
>
> ----
>
> select distinct observationlist.geneid, pval, score,
> Decode(evidenceCode, 3000900, 'E', 3000902, 'P', 3000906),
> proteomeRefID, Decode(ReferenceType, 'I', 'Y', 'N'), reftarget
> from mt.dualblastresults, mt.seqtable querySeq,
> isi.observationlist,
> isi.termobs
> where subjID = :b1
> and queryID = QuerySeq.AASeqID
> and querySeq.use='Y'
> and querySeq.geneID=observationlist.geneid
> and curationStatus='E'
> and evidenceCode in (3000900,3000902,3000906)
> and observationlist.id=obsID
> and target='GeneID'
> and termobs.termid in (select termid from isi.arc
> where arctype in (2999999,3000000)
> start with termid = :b2
> connect by prior termid=parenttermid)
> order by mt.blast.pvaltonumber(pval) asc, score desc, geneid,
> decode(proteomerefid, null, 0, 1) desc;
>
> --
>
> This query typically returns 10 or less rows. mt.dualblastresults is a
> view,
> all others are tables. BTW, I need "distinct" and "order by" in the query.
>
> Here is the explain plan and row counts in tables and their definition.
> Anyone has any suggestions to make it run faster?
>
> TIA.
>
> Guang
>
>
> Execution Plan
> ----------------------------------------------------------
> 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=715 Card=1 Bytes=124
> )
>
> 1 0 SORT (ORDER BY) (Cost=715 Card=1 Bytes=124)
> 2 1 SORT (UNIQUE) (Cost=662 Card=1 Bytes=124)
> 3 2 NESTED LOOPS (Cost=609 Card=1 Bytes=124)
> 4 3 NESTED LOOPS (Cost=553 Card=1 Bytes=118)
> 5 4 NESTED LOOPS (Cost=550 Card=1 Bytes=106)
> 6 5 NESTED LOOPS (Cost=280 Card=30 Bytes=1830)
> 7 6 VIEW OF 'DUALBLASTRESULTS' (Cost=112 Card=168
> Bytes=8232)
>
> 8 7 UNION-ALL
> 9 8 TABLE ACCESS (BY INDEX ROWID) OF 'BLASTRES
> ULTS' (Cost=102 Card=118 Bytes=2360)
>
> 10 9 INDEX (RANGE SCAN) OF 'BLASTRESULTS_SUBJ
> ID_INDEX' (NON-UNIQUE) (Cost=3 Card=118)
>
> 11 8 TABLE ACCESS (BY INDEX ROWID) OF 'BLASTRES
> ULTS' (Cost=10 Card=50 Bytes=1000)
>
> 12 11 INDEX (RANGE SCAN) OF 'BLASTRESULTS_QUER
> YID_INDEX' (NON-UNIQUE) (Cost=3 Card=50)
>
> 13 6 TABLE ACCESS (BY INDEX ROWID) OF 'SEQTABLE' (C
> ost=1 Card=57344 Bytes=688128)
>
> 14 13 INDEX (UNIQUE SCAN) OF 'ST_ASI_UN' (UNIQUE)
> 15 5 TABLE ACCESS (BY INDEX ROWID) OF 'OBSERVATIONLIS
> T' (Cost=9 Card=499 Bytes=22455)
>
> 16 15 INDEX (RANGE SCAN) OF 'OBSERVATIONLISTGENEID'
> (NON-UNIQUE) (Cost=2 Card=499)
>
> 17 4 TABLE ACCESS (BY INDEX ROWID) OF 'TERMOBS' (Cost=3
> Card=2388115 Bytes=28657380)
>
> 18 17 INDEX (RANGE SCAN) OF 'TERMOBSIDINDEX' (NON-UNIQ
> UE) (Cost=2 Card=2388115)
>
> 19 3 VIEW OF 'VW_NSO_1' (Cost=56 Card=7 Bytes=42)
> 20 19 SORT (UNIQUE) (Cost=56 Card=7 Bytes=126)
> 21 20 FILTER
> 22 21 CONNECT BY
> 23 22 INDEX (RANGE SCAN) OF 'ARC_TERMID' (NON-UNIQ
> UE) (Cost=1 Card=2 Bytes=12)
>
> 24 22 TABLE ACCESS (BY USER ROWID) OF 'ARC'
> 25 22 INDEX (RANGE SCAN) OF 'ARC_TYPETERMPARENT' (
> UNIQUE) (Cost=3 Card=8 Bytes=144)
>
>
>
> SQL> select count(*) from mt.dualblastresults;
>
> COUNT(*)
> ----------
> 22332188
>
> SQL> select count(*) from mt.seqtable ;
>
> COUNT(*)
> ----------
> 373505
>
> SQL> select count(*) from isi.observationlist;
>
> COUNT(*)
> ----------
> 2290858
>
> SQL> select count(*) from isi.termobs;
>
> COUNT(*)
> ----------
> 2388115
>
> SQL> select count(*) from isi.arc;
>
> COUNT(*)
> ----------
> 207375
>
> SQL> desc mt.dualblastresults
> Name Null? Type
> ----------------------------------------- -------- -------------------
> ID NUMBER
> QUERYID NUMBER
> SUBJID NUMBER
> MATCHLEN NUMBER
> IDENTITY NUMBER
> POSITIVE NUMBER
> GAP NUMBER
> PVAL VARCHAR2(16)
> SCORE NUMBER
> QUERYSTART NUMBER
> QUERYEND NUMBER
> SUBJSTART NUMBER
> SUBJEND NUMBER
> CCOMMENT VARCHAR2(300)
> BLASTDATE DATE
> QFRAME NUMBER
> SFRAME NUMBER
> QUERYSPID NUMBER
> SUBJSPID NUMBER
>
> SQL> desc mt.seqtable ;
> Name Null? Type
> ----------------------------------------- -------- --------------------
> ID NOT NULL NUMBER
> AASEQID NUMBER
> DNASEQID NUMBER
> GENEID NOT NULL NUMBER
> USE CHAR(1)
> ALTSPLICE VARCHAR2(128)
> MUTANT VARCHAR2(128)
> STRAIN VARCHAR2(128)
> CDSSTRING VARCHAR2(2000)
> VALID CHAR(1)
> GENOMEPROJ CHAR(1)
> CDNA CHAR(1)
> PARTIALNTERM CHAR(1)
> PARTIALCTERM CHAR(1)
> TRANSID NUMBER
> CCOMMENT VARCHAR2(300)
> EST CHAR(1)
> CLASS VARCHAR2(128)
> SEQDATE DATE
> CURID NUMBER
>
> SQL> desc isi.observationlist;
> Name Null? Type
> ----------------------------------------- -------- --------------------
> ID NOT NULL NUMBER
> GENEID NUMBER
> CURATIONTYPE NUMBER
> PROTEOMEREFID NUMBER
> SOURCEID NUMBER
> SOURCETABLE VARCHAR2(25)
> DESTID NUMBER
> DESTTABLE VARCHAR2(25)
> DESTDATE DATE
> REFERENCETYPE VARCHAR2(1)
> EVIDENCECODE NUMBER
> CURATORID NUMBER
> EDITORID NUMBER
> UPDATESTAMP DATE
> CURATIONSTATUS VARCHAR2(1)
> ORIGINALSTAMP DATE
> NEXTOBS NUMBER
> TARGET VARCHAR2(15)
> REFTARGET VARCHAR2(15)
> TOOL VARCHAR2(25)
> OLDGENEID NUMBER
>
> SQL> desc isi.termobs;
> Name Null? Type
> ----------------------------------------- -------- -----------
> OBSID NUMBER
> TERMID NUMBER
> CONTEXT NUMBER
>
> SQL> desc isi.arc
> Name Null? Type
> ----------------------------------------- -------- ---------------
> TERMID NOT NULL NUMBER
> PARENTTERMID NOT NULL NUMBER
> OBSID NUMBER
> ARCTYPE NOT NULL NUMBER
> ARCID NUMBER
>
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.net
> --
> Author: gmei
> INET: [EMAIL PROTECTED]
>
> Fat City Network Services -- 858-538-5051 http://www.fatcity.com
> San Diego, California -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from). You may
> also send the HELP command for other information (like subscribing).
>
>
>
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<---->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> Privileged/Confidential information may be contained in this message.
> If you are not the addressee indicated in this message
> (or responsible for delivery of the message to such person),
> you may not copy or deliver this message to anyone.
> In such case, you should destroy this message and kindly notify the sender
> by reply e-mail or by telephone on (61 3) 9612-6999.
> Please advise immediately if you or your employer does not consent to
> Internet e-mail for messages of this kind.
> Opinions, conclusions and other information in this message
> that do not relate to the official business of
> Transurban City Link Ltd
> shall be understood as neither given nor endorsed by it.
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<---->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>
>
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.net
> --
> Author: Mark Richard
> INET: [EMAIL PROTECTED]
>
> Fat City Network Services -- 858-538-5051 http://www.fatcity.com
> San Diego, California -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from). You may
> also send the HELP command for other information (like subscribing).
>

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Guang Mei
  INET: [EMAIL PROTECTED]

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
Received on Tue Jun 10 2003 - 21:03:08 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US