WorkArea allocation and/or Tuning for a GROUP BY

From: Hemant K Chitale <hkchital_at_singnet.com.sg>
Date: Thu, 27 May 2010 23:36:05 +0800
Message-Id: <201005271536.o4RFa0Lo031250_at_smtp42.singnet.com.sg>

Oracle 10.2.0.4 32bit Linux
P_A_T set to 2G

We have a particular SQL operation where we need to add values from two different tables and insert them into a target table.
(Thus, for every "key combination", 40 different columns hold
measures which need to be added together --- assume that for a particular "key combination" one table has values for April-2010 and the other has values for March-2010 and we need to a) Add values in 25 of the columns
b) Compute the average in the other 15 columns

The "key combination" itself is another 20columns.

This is done with a UNION operation (we don't specify UNION ALL) as :

INSERT /*+ PARALLEL (t 4) APPEND */ INTO Target_Table SELECT /*+ PARALLEL (S 4) */ c1, c2, c3, c4, SUM(c5), SUM(c7), sum(c7), sum(c8)/2, sum(c9)/2 ...
FROM
(select /*+ PARALLEL (a 2) */ c1, c2, c3, c4, c5, ... from temp_source_1 a

UNION ALL
select /*+ PARALLEL (b 2) */ c1, c2, c3, c4, c5, ... from temp_source_2 b) s GROUP BY c1, c2, c3, c4

(thus, c1 to c4 are the 20 "key combination" columns and c5 to c9
are the 40 columns where I have to add the values from the two source tables or compute averages)

The two source tables (temp_source_1 and temp_source_2) have about 10million rows each and are about 7GB in size.
(yes, the rowsize is very large at about 650bytes per row). These
are NOT GTTs.

I explicitly set table_stats for numrows, numblks, avgrlen (to avoid having to run a gather_stats on these two tables).

Yet, I see these statistics :

SQL> select * from v$sql_workarea_active where sql_id = 'abcdefghijklmn';

SQL_HASH_VALUE SQL_ID        WORKAREA_ADDRESS 
OPERATION_TYPE                                               OPERATION_ID
-------------- ------------- ---------------- 
------------------------------------------------------------ ------------
POLICY                    SID  QCINST_ID      QCSID ACTIVE_TIME

WORK_AREA_SIZE EXPECTED_SIZE ACTUAL_MEM_USED MAX_MEM_USED

------------------ ---------- ---------- ---------- ----------- 
-------------- ------------- --------------- ------------
NUMBER_PASSES TEMPSEG_SIZE TABLESPACE                       SEGRFNO#    SEGBLK#
------------- ------------ ------------------------------ ---------- ----------
     2538664469 abcdefghijklmn 0000000430FCDB80 GROUP BY 


(HASH)                                                         6

AUTO                       89          1         22   150064850 
  3086336       3134464         3318784      3318784
             0   1283457024 TEMP_TBS                        4          9
     2538664469 abcdefghijklmn 0000000430FCDB80 GROUP BY 


(HASH)                                                         6

AUTO                       41          1         22   150065009 
  3086336       3134464         3318784      3318784
             0   1275068416 TEMP_TBS                        6          9
     2538664469 abcdefghijklmn 0000000430FCDB80 GROUP BY 


(HASH)                                                         6

AUTO                       60          1         22   150065714 
  3086336       3134464         3318784      3318784
             0   1283457024 TEMP_TBS                        2          9
     2538664469 abcdefghijklmn 0000000430FCDB80 GROUP BY 


(HASH)                                                         6

AUTO                      119          1         22   150064863 
  3086336       3134464         3318784      3318784
             0   1283457024 TEMP_TBS

TEMPSEG_SIZE rapidly grows to the whole 27GB temp tablespace. (You can see it being 1.28GB for each of the 4 PQ operators doing the GROUP BY (HASH) in the query above while monitoring that job). However, WORK_AREA_SIZE, EXPECTED_SIZE, ACTUAL_MEM_USED, MAX_MEM_USED are all about 3MB. Is there any way I can get the GROUP BY (HASH) to use more Memory -- to have it start with a larger Expected_Size and then a high ACTUAL_MEM_USED and MAX_MEM_USED ? I have seen that a HASH JOIN operation between the same two tables for the same set of rows *does* work with EXPECTED_SIZE of 560MB and ACTUAL_MEM_USED of about 620MB per PQ operator.

Hemant K Chitale

http://hemantoracledba.blogspot.com

--
http://www.freelists.org/webpage/oracle-l

Received on Thu May 27 2010 - 10:36:05 CDT

This message: [ Message body ]
Next message: Herald ten Dam: "RE: AIX Concurrent I/O for Oracle archive logs"
Previous message: Harel Safra: "RE: RMAN again... feedback needed"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message