Re: Huge tempspace requirement - doesn't match explain plan

From: Andrew Kerber <andrew.kerber_at_gmail.com>
Date: Thu, 21 Aug 2008 16:59:26 -0500
Message-ID: <ad3aa4c90808211459p7c4e73tfbdad915ab3391bd@mail.gmail.com>


Is it possible that each parallel slave is grabbing the full amount of the expected temp space requirement? I have seen that happen in the past. I.e., if the query expects to need 8 GB, each parallel slave grabs 8 GB of temp space.
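
If you want to confirm that, a query along these lines should show how much
temp each slave of one coordinator is holding. This is a rough sketch,
assuming the 10.2 dictionary views and an 8k temp block size; :qc_sid is a
placeholder for the query coordinator's SID:

  -- Rough sketch: temp held per parallel slave for one query coordinator.
  -- Assumes an 8k temp tablespace block size; adjust if yours differs.
  select px.sid,
         px.server_group,
         px.server_set,
         round(sum(su.blocks) * 8192 / power(1024, 3), 2) as temp_gb
  from   v$sort_usage su,
         v$px_session px
  where  px.saddr = su.session_addr
  and    px.qcsid = :qc_sid              -- placeholder: coordinator SID
  group by px.sid, px.server_group, px.server_set
  order by temp_gb desc;

If every slave comes back holding roughly the single-process estimate, that
would match the behavior I've seen before.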

On Thu, Aug 21, 2008 at 4:51 PM, cam <kadmon_at_gmail.com> wrote:

> Hello all,
>
> I'm having terrible trouble with a complex query that is consuming huge
> amounts of temp space. Sometimes.
>
> This query is large, with several nested queries, and uses parallelism.
> One upshot of this is that the explain plans, already complex, are very
> challenging to read with the addition of the PX SEND and PX RECEIVE
> operations - they're over a hundred lines long with deep nesting. In text
> form from awrsqrpt they are truncated halfway through the TempSpc column,
> which doesn't help since temp space is a specific concern.
>
> The instance has 320 GB (!) of temp space assigned and this query is using
> it all, but the TempSpc column in the HTML form of the explain plan
> generated by awrsqrpt seems to show a requirement of less than 10 GB. I'm
> monitoring the temp space usage with a little script which polls
> v$tempspace & v$sort_usage, and I can see each of 64 parallel servers
> using 8 GB. The query runs for about 2 hours before failing with temp
> space exhaustion. Reducing parallelism doesn't seem to have any effect on
> the eventual temp requirement.
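>
> (The gist of that polling is something along these lines - a simplified
> sketch rather than the actual script, and the 8k temp block size is an
> assumption:)
>
>   -- Simplified sketch: temp usage per session, run in a loop from SQL*Plus.
>   select s.sid, s.username,
>          round(sum(su.blocks) * 8192 / power(1024, 3), 2) as temp_gb
>   from   v$sort_usage su, v$session s
>   where  s.saddr = su.session_addr
>   group by s.sid, s.username
>   order by temp_gb desc;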
>
> On a test instance on another, fairly similar box, the temp space usage
> peaks at around 18 GB - this is with very similar driving table volumes,
> identical parallelism settings, etc. The run completes in under 2 hours.
> The explain plans are different, but it is difficult to track down how
> and why, given their sheer size and complexity. Also, different
> 'successful' runs with this smaller temp requirement on the test box are
> themselves using quite different explain plans.
>
> This is 10.2.0.4. Full table stats (i.e. estimate_percent=>null,
> cascade=>true) have been gathered for all tables involved in the joins,
> and bind variable peeking has been disabled with
> _OPTIM_PEEK_USER_BINDS=false to avoid histograms confusing the picture
> for the moment.
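>
> (Concretely, the gathering is along the lines of the call below - a
> sketch with placeholder owner/table names rather than the real ones, and
> the session-level setting is shown just for illustration:)
>
>   -- Sketch: full compute stats with cascade; names are placeholders.
>   begin
>     dbms_stats.gather_table_stats(
>       ownname          => 'APP_OWNER',   -- placeholder schema
>       tabname          => 'BIG_TABLE',   -- placeholder table
>       estimate_percent => null,          -- compute, don't sample
>       cascade          => true);         -- indexes too
>   end;
>   /
>
>   -- Bind variable peeking off (session level here for illustration):
>   alter session set "_optim_peek_user_binds" = false;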
>
> So, I suppose my questions are as follows for anyone willing to read this
> far... I know I haven't posted much to go on - I'm pretty sure I'm unable to
> post actual details for reasons of IPR etc.
>
> - Is there a workaround for awrsqrpt truncating horizontal output? I'm
> guessing it would just be to edit the script but I haven't dared to yet...
> The HTML version is an acceptable workaround, but I thought it might be
> nice to be able to use diff on text versions of plans. I'd also really
> like to be able to see the parent_id of a row source - at depths like
> this, the indentation is close to useless for figuring out siblings and
> parents. (See the first sketch after these questions.)
>
> - Why would the explain plan's estimate for the temp space requirement be
> so wildly at odds with what is actually being consumed?
>
> - *Very broadly speaking*, I'm assuming that this massive use of temp
> space is caused by too many hash joins of full table scans - is trying to
> force more nested loops a reasonable strategy to reduce the temp
> requirement? I'd be inclined to use the optimizer_index_cost_adj and
> optimizer_index_caching parameters for this, since I want to avoid hints
> if possible. Is this a good way to encourage nested loops? (The second
> sketch below shows roughly what I have in mind.)
>
> - Are there any tools/techniques for organising/viewing/comparing large
> explain plans? (The third sketch below is the sort of diff-based approach
> I was thinking of.)
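>
> First sketch - pulling the plan lines straight out of dba_hist_sql_plan,
> which carries ID, PARENT_ID and the TEMP_SPACE estimate, so the output
> width stays under my control (the sql_id and plan_hash_value substitution
> variables below are placeholders):
>
>   -- Sketch: AWR plan lines with parent_id and the temp space estimate.
>   select id, parent_id, depth, operation, options, object_name,
>          round(temp_space / power(1024, 3), 2) as temp_gb,
>          cardinality, cost
>   from   dba_hist_sql_plan
>   where  sql_id = '&sql_id'               -- placeholder
>   and    plan_hash_value = &plan_hash     -- placeholder
>   order by id;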
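>
> Second sketch - the kind of session-level settings I have in mind to make
> index access and nested loops look cheaper (the values are guesses and
> would need testing):
>
>   alter session set optimizer_index_cost_adj = 20;  -- value is a guess
>   alter session set optimizer_index_caching  = 80;  -- value is a guess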
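>
> Third sketch - spooling a text plan per plan_hash_value with
> dbms_xplan.display_awr and diffing the spool files (again, the
> substitution variables are placeholders):
>
>   set pagesize 0
>   set linesize 300
>   set trimspool on
>   spool plan_&plan_hash..txt
>   select * from table(
>     dbms_xplan.display_awr('&sql_id', &plan_hash, null, 'ALL'));
>   spool off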
>
> Cheers for any insights,
> cam
>

-- 
Andrew W. Kerber

'If at first you don't succeed, don't take up skydiving.'

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Aug 21 2008 - 16:59:26 CDT
