Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: why does this statement take so long to return?
gdas wrote in message <013b80b8.1e5f63d6_at_usw-ex0102-013.remarq.com>...
>Here's a better plan output
>
>Query Plan
>------------------------------------------------------------
>SELECT STATEMENT Cost = 3237
> SORT GROUP BY
> FILTER
> NESTED LOOPS
> NESTED LOOPS
> NESTED LOOPS
> TABLE ACCESS FULL ORG
> TABLE ACCESS BY INDEX ROWID SUBJECT
> INDEX RANGE SCAN SUBJECT_ORG_ID_IDX
> TABLE ACCESS FULL HIT
> TABLE ACCESS BY INDEX ROWID DATASTAGE
> INDEX RANGE SCAN DATASTAGE_UQ_ID_IDX
> COUNT STOPKEY
> VIEW
> SORT ORDER BY STOPKEY
> SORT GROUP BY
> HASH JOIN
> TABLE ACCESS FULL ORG
> HASH JOIN
> TABLE ACCESS FULL SUBJECT
> HASH JOIN
> TABLE ACCESS FULL DATASTAGE
> TABLE ACCESS FULL HIT
This query plan is best understood by the following pseudo-code:
for each row in ORG /* Full table scan {
for each row from SUBJECT
where SUBJECT.ORG_ID = current ORG.ORG_ID {
for each row in HIT /* Full table scan { reject if HIT.SUBJECT_ID != current SUBJECT.SUBJECT_ID for each row in DATASTAGE where DATASTAGE.UNIQUE_ID = current HIT.UNIQUE_ID { /* Since the key is unique, this loop only returns /* one row per invocation. check if the ORG.ORG_ID matches the one returned from the subquery. (The subquery is only executed once, not each time the check is made) if match then { build output row from all currently available columns } } }
The killer is the Full table scan on HIT in the outer query, which will be executed once for each row in SUBJECT. Why is it doing a FTS and not doing an indexed retrieval using the condition hit.subject_id = subject.subject_id?
From your earlier postings I can see that there is an index on HIT.SUBJECT_ID. Are the SUBJECT_ID columns on these two tables of the same datatype? Or is one a number and the other a varchar2, causing an implicit data conversion which prevents the use of the index? Is the status of the index 'VALID'?
Solving this will solve the biggest performance issue, but it's still not optimal. Effectively *every* row from the outer query will be constructed and tested against the subquery. Ideally you want this the other way round: ie. the subquery is executed first, and the single ORG_ID returned by it is used to drive the outer query.
Contrary to my previous suggestion (which was made without seeing the query plan), I'd now try the /*+ PUSH_SUBQ */ hint to try to force the subquery to execute first. Try also concatenating nulls or adding zeros to the where conditions (intelligently) to help force the query to optimise in the way you want.
Try this:
select /*+ PUSH_SUBQ */
al4.org_name, al1.type, count (al2.hit_id) from admin.datastage al1, admin.hit al2, admin.subject al3, admin.org al4 where al1.unique_id = al2.unique_id || null and al2.subject_id = al3.subject_id || nulland al3.org_id = al4.org_id || null
(select org_id from ( select aa.org_id, count(bb.hit_id) b from hit bb, org aa, datastage cc, subject dd where (bb.unique_id = cc.unique_id and bb.subject_id = dd.subject_id and dd.org_id = aa.org_id) and (aa.type = 'Regional' and cc.type in ('A','B','C','D') and cc.account_id = 1) group by aa.org_id order by 2 desc ) where rownum=1 )
I've assumed that al4.org_id, al3.subject_id and al2.unique_id are all varchar2s. If they're numbers, replace the '|| null' by '+ 0'.
The idea is that the subquery should execute first (because of the PUSH_SUBQ hint), then the order of the joins to the other tables is forced by the '|| null's: the query cannot join to al3 before al4.org_id || null has been evaluated, similarly the join to al2 cannot be made until al3.subject_id || null has been evaluated etc.
So the join order should be:
subquery -> ORG (al4) -> SUBJECT (al3) -> HIT (al2) -> DATASTAGE (al1).
It may also be possible to optimise the subquery as well, but we'll tackle this if it's needed after the first tuning problem has been solved.
Dave.
-- If you reply to this posting by email, remove the "nospam" from my email address first.Received on Tue Jun 27 2000 - 00:00:00 CDT