Outer join query blowing away my temp tablespace [message #186608] Tue, 08 August 2006 13:59
alliejane
Messages: 59
Registered: July 2005
Location: Glasgow
Member
Hi all,

Before I ask for ways of tuning this query, I have to point out that YES, it does have to be outer joins; unfortunately there is no other way of doing it.

The query is as follows (sorry it's a long one) and at the end I've added the execution plan.

My database is using
WORKAREA_SIZE_POLICY = AUTO
PGA_AGGREGATE_TARGET = 2G

I've got a 12GB temp tablespace.
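
For reference, a quick sketch of how I confirm those settings and the TEMP size (assumes access to V$PARAMETER and DBA_TEMP_FILES):

-- work-area / PGA settings
SELECT name, value
FROM   v$parameter
WHERE  name IN ('workarea_size_policy', 'pga_aggregate_target');

-- total size of each temporary tablespace, in GB
SELECT tablespace_name,
       ROUND(SUM(bytes) / 1024 / 1024 / 1024, 1) AS size_gb
FROM   dba_temp_files
GROUP  BY tablespace_name;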

I'm getting the following result from v$sql_workarea_histogram. As you can see, I'm using all of my 2GB of memory for the sort, spilling to disk with a one-pass execution, and blowing away my 12GB temp tablespace. I'm stuck, so if any experts can help I would be forever grateful.

LOW_OPTIMAL_SIZE ========> 1073741824
HIGH_OPTIMAL_SIZE =======> 2147483647
OPTIMAL_EXECUTIONS ======> 0
ONEPASS_EXECUTIONS ======> 1
MULTIPASSES_EXECUTIONS ==> 0
TOTAL_EXECUTIONS ========> 1
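
For reference, the figures above come from a query along these lines (a sketch over the standard columns of v$sql_workarea_histogram):

SELECT low_optimal_size,
       high_optimal_size,
       optimal_executions,
       onepass_executions,
       multipasses_executions,
       total_executions
FROM   v$sql_workarea_histogram
WHERE  total_executions > 0
ORDER  BY low_optimal_size;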


SELECT (127 various columns)
FROM
"SECUR_RE_PR_MATCHED_STAGE" "SECUR_RE_PR_MATCHED_STAGE",
"SECUR_RE_AS_MATCHED_STAGE" "SECUR_RE_AS_MATCHED_STAGE",
"RE_EXTRACT_DATA_STDKEY_LAST" "RE_EXTRACT_DATA_STDKEY_LAST",
"RE_EXTRACT_DATA_STDKEY" "RE_EXTRACT_DATA_STDKEY"
WHERE (substr( "RE_EXTRACT_DATA_STDKEY"."STD_SORT_CODE" , 1, 2) =
substr( "RE_EXTRACT_DATA_STDKEY_LAST"."STD_SORT_CODE" (+) , 1, 2))
AND ( "RE_EXTRACT_DATA_STDKEY"."STD_ACC_NO" = "RE_EXTRACT_DATA_STDKEY_LAST"."STD_ACC_NO" (+) )
AND ( "RE_EXTRACT_DATA_STDKEY"."STD_CUST_NUM" = "RE_EXTRACT_DATA_STDKEY_LAST"."STD_CUST_NUM" (+) )
AND (substr( "RE_EXTRACT_DATA_STDKEY"."STD_SORT_CODE" , 1, 2) =
substr( "SECUR_RE_AS_MATCHED_STAGE"."STD_SORT_CODE" (+) , 1, 2))
AND ( "RE_EXTRACT_DATA_STDKEY"."STD_ACC_NO" = "SECUR_RE_AS_MATCHED_STAGE"."STD_ACC_NO" (+) )
AND ( "RE_EXTRACT_DATA_STDKEY"."STD_CUST_NUM" = "SECUR_RE_AS_MATCHED_STAGE"."STD_CUST_NUM" (+) )
AND (substr( "RE_EXTRACT_DATA_STDKEY"."STD_SORT_CODE" , 1, 2) =
substr( "SECUR_RE_PR_MATCHED_STAGE"."STD_SORT_CODE" (+) , 1, 2))
AND ( "RE_EXTRACT_DATA_STDKEY"."STD_ACC_NO" = "SECUR_RE_PR_MATCHED_STAGE"."STD_ACC_NO" (+) )
AND ( "RE_EXTRACT_DATA_STDKEY"."STD_CUST_NUM" = "SECUR_RE_PR_MATCHED_STAGE"."STD_CUST_NUM" (+) )


Execution Plan
Plan Table
-----------------------------------------------------------------
| Operation | Name | Rows | Bytes| Cost |
-----------------------------------------------------------------
| SELECT STATEMENT | | 2M| 1G| 76034 |
| HASH JOIN OUTER | | 2M| 1G| 76034 |
| HASH JOIN OUTER | | 2M| 1G| 55662 |
| HASH JOIN OUTER | | 2M| 795M| 42293 |
| TABLE ACCESS FULL |RE_EXTRAC | 2M| 737M| 18852 |
| TABLE ACCESS FULL |RE_EXTRAC | 4M| 113M| 3838 |
| TABLE ACCESS FULL |SECUR_RE_ | 30K| 5M| 2 |
| TABLE ACCESS FULL |SECUR_RE_ | 38K| 12M| 2 |
-----------------------------------------------------------------
Re: Outer join query blowing away my temp tablespace [message #186654 is a reply to message #186608] Tue, 08 August 2006 22:15
rleishman
Messages: 3724
Registered: October 2005
Location: Melbourne, Australia
Senior Member
The outer-joins are not responsible - inner joins would result in much the same plan with much the same performance (and much the same failure, probably).

Hash-joins are just the bees' knees when joining a big row set to a small one (<50K rows), but they really suck when both sets of data are very large. You have what looks like four large data sets. I won't go into the reason why; you'll just have to trust me.

Now, that is not to say that hash-joins are the worst option. On the contrary, they can be the best performing join method for large data sets, but they are temp-space hungry. There are two other join methods given your current architecture: Nested Loops and Sort-Merge.

- Nested Loops uses no temp space, but it is abysmally slow for joining two large row sets unless you have an index that contains all of the columns referenced in the SQL (but don't go off creating indexes with 16 columns just yet...). When I say abysmally slow, I mean one to two orders of magnitude: 10-100 times slower than a hash join.

- Sort-Merge uses lots of temp space, but somewhat less than hash joins. So, whilst they are not truly scalable, they can get you going if you are *just* exceeding your temp space. Use hints to get the SQL to use a Sort-Merge join:
SELECT /*+ 
  USE_MERGE(SECUR_RE_PR_MATCHED_STAGE 
            SECUR_RE_AS_MATCHED_STAGE
            RE_EXTRACT_DATA_STDKEY_LAST
            RE_EXTRACT_DATA_STDKEY) */ ...



If the Sort-Merge still runs out of space, you are going to have to get creative. You can increase TEMP space, but how long will that last? You have three options (in increasing order of efficiency AND complexity - but all of them should work):

- Create dirty big indexes on each table containing all of the columns referenced by the query, with the join columns first. You are joining on a SUBSTR() function, so the index must be a FUNCTION-BASED INDEX.
This will allow Oracle to perform Nested Loops joins (no TEMP space required), and it will probably be no more than twice as slow as a Hash-join. There is a sketch of such an index after this list.

- Partition ALL of the tables on one or more of the join column(s). This will allow PARTITION-WISE joins. Provided you keep your partitions to no more than a few million rows each, this should be scalable and will provide noticeably better performance than the hash-join over an un-partitioned table. A sketch of this also follows after the list.

- Use Hash or Index Clusters. This would provide the best performance of all, and is completely scalable. NEVER, NEVER, NEVER implement a cluster unless you (or the DBA) understand them fully and know how to maintain them (yes, they need periodic maintenance). For this reason, I am not going to describe them here.
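
A minimal sketch of the first option, so you can see the shape of it (the index name is made up, and the trailing column list would need to include every column the query references from that table, so Oracle never has to visit the table itself):

CREATE INDEX secur_re_as_stage_fbi          -- hypothetical name
ON secur_re_as_matched_stage
   ( SUBSTR(std_sort_code, 1, 2),           -- join columns first, including the SUBSTR() expression
     std_acc_no,
     std_cust_num
     -- , ...then every other column of this table used in the SELECT list
   );

The leading expression and join columns are what let the Nested Loops probe use the index; the trailing columns are only there to avoid the table access.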

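A sketch of the second option (the hash key, partition count, and table name are illustrative; ALL four tables must be partitioned the same way on the same key to get a full partition-wise join):

CREATE TABLE re_extract_data_stdkey_p       -- hypothetical partitioned copy
PARTITION BY HASH (std_acc_no) PARTITIONS 16
AS SELECT * FROM re_extract_data_stdkey;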

Lastly, but most importantly, make sure that you are getting reasonably unique joins between the tables; otherwise it is possible that you are getting mini-Cartesian joins. e.g. if two such tables contain 1M rows, but there are only 100K distinct join keys in each table (i.e. 10 rows per join key), then every row in one table will join to 10 rows in the second table, giving 10M rows, not 1M. Needless to say, this gets MUCH worse when 3 or more tables are involved.
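
A quick way to check for that (a sketch using the join columns from your query; run it against each of the four tables in turn):

SELECT SUBSTR(std_sort_code, 1, 2) AS sort_prefix,
       std_acc_no,
       std_cust_num,
       COUNT(*)                    AS rows_per_key
FROM   re_extract_data_stdkey_last
GROUP  BY SUBSTR(std_sort_code, 1, 2), std_acc_no, std_cust_num
HAVING COUNT(*) > 1
ORDER  BY COUNT(*) DESC;

Anything much above a count of 1 here multiplies the joined row counts.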

Ross Leishman