RE: Large Dataset - Estimated 87TB needed for TEMP - suggestions?

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Tue, 27 Jan 2015 12:52:39 -0500
Message-ID: <46d101d03a5a$0bac8e30$2305aa90$_at_rsiz.com>



Assuming we don’t get to change the question (which *might indeed be the best solution, but which has already been suggested*) there are a few ways to approach this:  

n just mentioned by Mladen - bitmaps and a star transformation, possibly including bitmap joins

n spiny starfish strategy (see my paper)

n divide and conquer and concatenate the result sections via union all

o suppose you have a useful distribution by STS_DPS_TYPE

o suppose you have few enough STS_DPS_TYPE distinct values or a reasonable way to carve the distinct values up by range yielding a similar aggregate row count from each range

o each of the similar queries linked by union alls would then be smaller, possibly exponentially so

o this can be combined with the spiny starfish strategy  

Further observations:

You are fetching exactly one AND a.STS_PERF_PERIOD = '20141',

so do not select that data forcing you to group by it.  

union all pieces < 36 and > 36 might give you range scans instead of a ffs on c if 36 is at a reasonable break point.  

Don’t haul any text descriptors along you don’t need until the group by totals are complete and you have the minimum rowset result upon which to tack the labels.

That might mean you need some look up tables with ids shorter than the text descriptors at the beginning.  

Good luck.    

mwf      

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Mladen Gogala (Redacted sender "mgogala_at_yahoo.com" for DMARC) Sent: Tuesday, January 27, 2015 12:06 PM To: oracle-l_at_freelists.org
Subject: Re: Large Dataset - Estimated 87TB needed for TEMP - suggestions?  

On 01/27/2015 09:11 AM, Chris Taylor wrote:

Env; 11.2.0.2 Linux 5.8

Machine:

4 CPU (8 Core, 16 Threads E5-4620)

256 GB RAM   This is a "data warehouse" DB but doesn't use DIM/FACT tables - more like a data dump database.  

I have a query against 3 tables (2 of which are partitioned) and the estimated TEMP space used for this query is going to be 81 TB. I'm looking for suggestions on how to get this data set. For the partitioned tables, 1 table accesses 1 partition, and the other table uses 2 partitions. I've included the explain plan below and any help/suggestions are appreciated. (I've updated the table stats yesterday after I kept blowing out my 150GB temp space and the estimated temp space used was 22GB. After updating stats, it tells me the estimated Temp Space is going to be 81 TB).  

PGA Agg Target is 20 GB right now.  

I've explained the query using both the default HASH GROUP BY and a SORT GROUP BY and both estimate using 81 T of temp space.  

Here's the query and the plan (let me know if the fixed formatting doesn't show up correctly and I'll use pastebin to format it if needed)  

  SELECT a.STS_PERF_PERIOD,

         a.PART_ACCT_NO,

         a.PWN_PART_AKA_NME,

         a.PWN_P_W_IND,

         a.AFL_SOCIETY_NME,

         b.sts_CHANNEL_TYPE,

         c.stp_s03_program_cd--TOTAL Music Minutes CALCULATION:

         ,

           SUM (               CASE

                 WHEN STG_STATUS = 'DP'

                 THEN

                    0

                 ELSE

                      a.perf_count

                    * ( (  a.BMI_FEATURE


+ a.BMI_BG

+ a.BMI_LOGO

+ a.BMI_THEME

+ a.BMI_BV

+ a.ASCAP_FEATURE

+ a.ASCAP_BG

+ a.ASCAP_LOGO

+ a.ASCAP_THEME

+ a.ASCAP_BV

+ a.FOREIGN_FEATURE

+ a.FOREIGN_BG

+ a.FOREIGN_LOGO

+ a.FOREIGN_THEME

+ a.FOREIGN_BV

+ a.PD_FEATURE

+ a.PD_BG

+ a.PD_LOGO

+ a.PD_THEME

+ a.PD_BV

+ a.SESAC_FEATURE

+ a.SESAC_BG

+ a.SESAC_LOGO

+ a.SESAC_THEME

+ a.SESAC_BV

+ a.NA_FEATURE

+ a.NA_BG

+ a.NA_LOGO

+ a.NA_THEME

+ a.NA_BV))
END) / 60 AS "TOT_MUSIC_MINS"--BMI Music Minutes CALCULATION: , SUM ( CASE WHEN STG_STATUS = 'DP' THEN 0 ELSE a.perf_count * ( ( a.BMI_FEATURE
+ a.BMI_BG

+ a.BMI_LOGO

+ a.BMI_THEME

+ a.BMI_BV))
END) / 60 AS "BMI_MUSIC_MINS"

    FROM PRDM.COWS_PURPLE_PART a, --partition(P20141)

         prdm.cows_TCUE_STS_SCHEDULE b, --partition(P20141,P20142) ,

         prdm.cows_tcue_stp_program c

   WHERE A.STS_PERF_PERIOD = b.sts_perf_period

         AND A.STS_DPS_TYPE = b.sts_dps_type

         AND A.STG_GROUP_NO = b.sts_group_no

         AND a.stg_group_seq_no = b.sts_group_seq_no

         AND b.sts_provider = c.stp_provider

         AND b.sts_program_no = c.stp_program_no

         AND a.STS_PERF_PERIOD = '20141'

         AND b.sts_perf_source = 'CB'

         AND b.COWS_DIST_PERIOD in ('20141','20142')

         AND a.PWN_P_W_IND = 'W'

         AND c.stp_s03_program_cd <> '36'

GROUP BY a.STS_PERF_PERIOD,

         a.PART_ACCT_NO,

         a.PWN_PART_AKA_NME,

         a.PWN_P_W_IND,

         a.AFL_SOCIETY_NME,

         b.sts_CHANNEL_TYPE,

         c.stp_s03_program_cd

/    

Plan hash value: 2657901994  


| Id  | Operation                               | Name                      | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | Pstart| Pstop |

----------------------------------------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT                        |                           |    289G|    57T|       |   451M  (9)|501:19:35 |       |       |

|   1 |  HASH GROUP BY                          |                           |    289G|    57T|    81T|   451M  (9)|501:19:35 |       |       |

|*  2 |   HASH JOIN                             |                           |    289G|    57T|    42M|  2176K (91)| 02:25:07 |       |       |

|   3 |    PART JOIN FILTER CREATE              | :BF0000                   |    680K|    35M|       | 39040   (2)| 00:02:37 |       |       |

|*  4 |     HASH JOIN                           |                           |    680K|    35M|  4304K| 39040   (2)| 00:02:37 |       |       |

|   5 |      INLIST ITERATOR                    |                           |        |       |       |            |          |       |       |

|   6 |       PARTITION RANGE ITERATOR          |                           |  80046 |  3361K|       |  3063   (1)| 00:00:13 |KEY(I) |KEY(I) |

|   7 |        TABLE ACCESS BY LOCAL INDEX ROWID| COWS_TCUE_STS_SCHEDULE    |  80046 |  3361K|       |  3063   (1)| 00:00:13 |KEY(I) |KEY(I) |

|*  8 |         INDEX RANGE SCAN                | COWS_TCUE_STS_SCHED_IDX01 |   9052 |       |       |   147   (1)| 00:00:01 |KEY(I) |KEY(I) |

|*  9 |      INDEX FAST FULL SCAN               | COWS_TCUE_STP_PROG_IDX01  |     24M|   256M|       | 12602   (3)| 00:00:51 |       |       |

|  10 |    PARTITION RANGE SINGLE               |                           |     10M|  1582M|       | 96246   (2)| 00:06:25 |KEY(AP)|KEY(AP)|

|* 11 |     TABLE ACCESS FULL                   | COWS_PURPLE_PART          |     10M|  1582M|       | 96246   (2)| 00:06:25 |    21 |    21 |

----------------------------------------------------------------------------------------------------------------------------------------------

 

Query Block Name / Object Alias (identified by operation id):


 

   1 - SEL$1    7 - SEL$1 / B_at_SEL$1

   8 - SEL$1 / B_at_SEL$1

   9 - SEL$1 / C_at_SEL$1

  11 - SEL$1 / A_at_SEL$1  

Outline Data


 

  /*+

      BEGIN_OUTLINE_DATA       USE_HASH_AGGREGATION(_at_"SEL$1")

      USE_HASH(_at_"SEL$1" "A"_at_"SEL$1")

      USE_HASH(_at_"SEL$1" "C"_at_"SEL$1")

      LEADING(_at_"SEL$1" "B"_at_"SEL$1" "C"@"SEL$1" "A"@"SEL$1")

      FULL(_at_"SEL$1" "A"_at_"SEL$1")

      INDEX_FFS(_at_"SEL$1" "C"_at_"SEL$1" ("COWS_TCUE_STP_PROGRAM"."STP_S03_PROGRAM_CD" "COWS_TCUE_STP_PROGRAM"."STP_PROVIDER"

              "COWS_TCUE_STP_PROGRAM"."STP_PROGRAM_NO"))       INDEX_RS_ASC(_at_"SEL$1" "B"_at_"SEL$1" ("COWS_TCUE_STS_SCHEDULE"."COWS_DIST_PERIOD" "COWS_TCUE_STS_SCHEDULE"."STS_PERF_PERIOD"

              "COWS_TCUE_STS_SCHEDULE"."STS_PERF_SOURCE"))       OUTLINE_LEAF(_at_"SEL$1")

      NO_PARALLEL       DB_VERSION('11.2.0.2')       OPTIMIZER_FEATURES_ENABLE('11.2.0.2')       IGNORE_OPTIM_EMBEDDED_HINTS       END_OUTLINE_DATA   */  

Predicate Information (identified by operation id):


 

   2 - access("A"."STS_PERF_PERIOD"="B"."STS_PERF_PERIOD" AND "A"."STS_DPS_TYPE"="B"."STS_DPS_TYPE" AND

              "A"."STG_GROUP_NO"="B"."STS_GROUP_NO" AND "A"."STG_GROUP_SEQ_NO"="B"."STS_GROUP_SEQ_NO")    4 - access("B"."STS_PROVIDER"="C"."STP_PROVIDER" AND "B"."STS_PROGRAM_NO"="C"."STP_PROGRAM_NO")

   8 - access(("B"."COWS_DIST_PERIOD"='20141' OR "B"."COWS_DIST_PERIOD"='20142') AND "B"."STS_PERF_PERIOD"='20141' AND

              "B"."STS_PERF_SOURCE"='CB')    9 - filter("C"."STP_S03_PROGRAM_CD"<>'36')

  11 - filter("A"."PWN_P_W_IND"='W' AND "A"."STS_PERF_PERIOD"='20141')  

Column Projection Information (identified by operation id):


 

   1 - (#keys=7) "A"."STS_PERF_PERIOD"[VARCHAR2,5], "A"."PART_ACCT_NO"[NUMBER,22], "A"."PWN_PART_AKA_NME"[VARCHAR2,30],

       "A"."PWN_P_W_IND"[VARCHAR2,1], "A"."AFL_SOCIETY_NME"[VARCHAR2,8], "B"."STS_CHANNEL_TYPE"[VARCHAR2,7],        "C"."STP_S03_PROGRAM_CD"[VARCHAR2,2], SUM(CASE "STG_STATUS" WHEN 'DP' THEN 0 ELSE        "A"."PERF_COUNT"*("A"."BMI_FEATURE"+"A"."BMI_BG"+"A"."BMI_LOGO"+"A"."BMI_THEME"+"A"."BMI_BV") END )[22], SUM(CASE "STG_STATUS" WHEN        'DP' THEN 0 ELSE "A"."PERF_COUNT"*("A"."BMI_FEATURE"+"A"."BMI_BG"+"A"."BMI_LOGO"+"A"."BMI_THEME"+"A"."BMI_BV"+"A"."ASCAP_FEATURE"+"A".        "ASCAP_BG"+"A"."ASCAP_LOGO"+"A"."ASCAP_THEME"+"A"."ASCAP_BV"+"A"."FOREIGN_FEATURE"+"A"."FOREIGN_BG"+"A"."FOREIGN_LOGO"+"A"."FOREIGN_TH        EME"+"A"."FOREIGN_BV"+"A"."PD_FEATURE"+"A"."PD_BG"+"A"."PD_LOGO"+"A"."PD_THEME"+"A"."PD_BV"+"A"."SESAC_FEATURE"+"A"."SESAC_BG"+"A"."SE        SAC_LOGO"+"A"."SESAC_THEME"+"A"."SESAC_BV"+"A"."NA_FEATURE"+"A"."NA_BG"+"A"."NA_LOGO"+"A"."NA_THEME"+"A"."NA_BV") END )[22]    2 - (#keys=4) "A"."STS_PERF_PERIOD"[VARCHAR2,5], "C"."STP_S03_PROGRAM_CD"[VARCHAR2,2], "B"."STS_CHANNEL_TYPE"[VARCHAR2,7],

       "A"."FOREIGN_LOGO"[NUMBER,22], "A"."PD_LOGO"[NUMBER,22], "A"."PERF_COUNT"[NUMBER,22], "STG_STATUS"[VARCHAR2,2],        "A"."NA_LOGO"[NUMBER,22], "A"."SESAC_LOGO"[NUMBER,22], "A"."PART_ACCT_NO"[NUMBER,22], "A"."PWN_PART_AKA_NME"[VARCHAR2,30],        "A"."PWN_P_W_IND"[VARCHAR2,1], "A"."AFL_SOCIETY_NME"[VARCHAR2,8], "A"."ASCAP_BG"[NUMBER,22], "A"."BMI_BG"[NUMBER,22],        "A"."SESAC_BG"[NUMBER,22], "A"."NA_BG"[NUMBER,22], "A"."PD_BG"[NUMBER,22], "A"."FOREIGN_BG"[NUMBER,22], "A"."ASCAP_BV"[NUMBER,22],        "A"."BMI_BV"[NUMBER,22], "A"."SESAC_BV"[NUMBER,22], "A"."NA_BV"[NUMBER,22], "A"."PD_BV"[NUMBER,22], "A"."FOREIGN_BV"[NUMBER,22],        "A"."ASCAP_THEME"[NUMBER,22], "A"."BMI_THEME"[NUMBER,22], "A"."SESAC_THEME"[NUMBER,22], "A"."NA_THEME"[NUMBER,22],        "A"."PD_THEME"[NUMBER,22], "A"."FOREIGN_THEME"[NUMBER,22], "A"."ASCAP_FEATURE"[NUMBER,22], "A"."BMI_FEATURE"[NUMBER,22],        "A"."SESAC_FEATURE"[NUMBER,22], "A"."NA_FEATURE"[NUMBER,22], "A"."PD_FEATURE"[NUMBER,22], "A"."FOREIGN_FEATURE"[NUMBER,22],        "A"."ASCAP_LOGO"[NUMBER,22], "A"."BMI_LOGO"[NUMBER,22]    3 - "B"."STS_PERF_PERIOD"[VARCHAR2,5], "B"."STS_PERF_PERIOD"[VARCHAR2,5], "B"."STS_DPS_TYPE"[VARCHAR2,10],        "B"."STS_GROUP_NO"[NUMBER,22], "B"."STS_GROUP_SEQ_NO"[NUMBER,22], "C"."STP_S03_PROGRAM_CD"[VARCHAR2,2],        "B"."STS_CHANNEL_TYPE"[VARCHAR2,7]    4 - (#keys=2) "B"."STS_PERF_PERIOD"[VARCHAR2,5], "B"."STS_GROUP_NO"[NUMBER,22], "B"."STS_GROUP_SEQ_NO"[NUMBER,22],

       "B"."STS_CHANNEL_TYPE"[VARCHAR2,7], "B"."STS_DPS_TYPE"[VARCHAR2,10], "C"."STP_S03_PROGRAM_CD"[VARCHAR2,2]    5 - "B"."STS_PROGRAM_NO"[NUMBER,22], "B"."STS_PROVIDER"[VARCHAR2,1], "B"."STS_GROUP_NO"[NUMBER,22],        "B"."STS_GROUP_SEQ_NO"[NUMBER,22], "B"."STS_CHANNEL_TYPE"[VARCHAR2,7], "B"."STS_DPS_TYPE"[VARCHAR2,10],        "B"."STS_PERF_PERIOD"[VARCHAR2,5]    6 - "B"."STS_PROGRAM_NO"[NUMBER,22], "B"."STS_PROVIDER"[VARCHAR2,1], "B"."STS_GROUP_NO"[NUMBER,22],        "B"."STS_GROUP_SEQ_NO"[NUMBER,22], "B"."STS_CHANNEL_TYPE"[VARCHAR2,7], "B"."STS_DPS_TYPE"[VARCHAR2,10],        "B"."STS_PERF_PERIOD"[VARCHAR2,5]    7 - "B"."STS_PROGRAM_NO"[NUMBER,22], "B"."STS_PROVIDER"[VARCHAR2,1], "B"."STS_GROUP_NO"[NUMBER,22],        "B"."STS_GROUP_SEQ_NO"[NUMBER,22], "B"."STS_CHANNEL_TYPE"[VARCHAR2,7], "B"."STS_DPS_TYPE"[VARCHAR2,10],        "B"."STS_PERF_PERIOD"[VARCHAR2,5]    8 - "B".ROWID[ROWID,10], "B"."STS_PERF_PERIOD"[VARCHAR2,5]    9 - "C"."STP_PROGRAM_NO"[NUMBER,22], "C"."STP_PROVIDER"[VARCHAR2,1], "C"."STP_S03_PROGRAM_CD"[VARCHAR2,2]   10 - "A"."STS_PERF_PERIOD"[VARCHAR2,5], "A"."STS_DPS_TYPE"[VARCHAR2,10], "A"."PERF_COUNT"[NUMBER,22], "STG_STATUS"[VARCHAR2,2],        "A"."STG_GROUP_NO"[NUMBER,22], "A"."STG_GROUP_SEQ_NO"[NUMBER,22], "A"."PART_ACCT_NO"[NUMBER,22], "A"."PWN_PART_AKA_NME"[VARCHAR2,30],        "A"."PWN_P_W_IND"[VARCHAR2,1], "A"."AFL_SOCIETY_NME"[VARCHAR2,8], "A"."ASCAP_BG"[NUMBER,22], "A"."BMI_BG"[NUMBER,22],        "A"."SESAC_BG"[NUMBER,22], "A"."NA_BG"[NUMBER,22], "A"."PD_BG"[NUMBER,22], "A"."FOREIGN_BG"[NUMBER,22], "A"."ASCAP_BV"[NUMBER,22],        "A"."BMI_BV"[NUMBER,22], "A"."SESAC_BV"[NUMBER,22], "A"."NA_BV"[NUMBER,22], "A"."PD_BV"[NUMBER,22], "A"."FOREIGN_BV"[NUMBER,22],        "A"."ASCAP_THEME"[NUMBER,22], "A"."BMI_THEME"[NUMBER,22], "A"."SESAC_THEME"[NUMBER,22], "A"."NA_THEME"[NUMBER,22],        "A"."PD_THEME"[NUMBER,22], "A"."FOREIGN_THEME"[NUMBER,22], "A"."ASCAP_FEATURE"[NUMBER,22], "A"."BMI_FEATURE"[NUMBER,22],        "A"."SESAC_FEATURE"[NUMBER,22], "A"."NA_FEATURE"[NUMBER,22], "A"."PD_FEATURE"[NUMBER,22], "A"."FOREIGN_FEATURE"[NUMBER,22],        "A"."ASCAP_LOGO"[NUMBER,22], "A"."BMI_LOGO"[NUMBER,22], "A"."SESAC_LOGO"[NUMBER,22], "A"."NA_LOGO"[NUMBER,22],        "A"."PD_LOGO"[NUMBER,22], "A"."FOREIGN_LOGO"[NUMBER,22]   11 - "A"."STS_PERF_PERIOD"[VARCHAR2,5], "A"."STS_DPS_TYPE"[VARCHAR2,10], "A"."PERF_COUNT"[NUMBER,22], "STG_STATUS"[VARCHAR2,2],        "A"."STG_GROUP_NO"[NUMBER,22], "A"."STG_GROUP_SEQ_NO"[NUMBER,22], "A"."PART_ACCT_NO"[NUMBER,22], "A"."PWN_PART_AKA_NME"[VARCHAR2,30],        "A"."PWN_P_W_IND"[VARCHAR2,1], "A"."AFL_SOCIETY_NME"[VARCHAR2,8], "A"."ASCAP_BG"[NUMBER,22], "A"."BMI_BG"[NUMBER,22],        "A"."SESAC_BG"[NUMBER,22], "A"."NA_BG"[NUMBER,22], "A"."PD_BG"[NUMBER,22], "A"."FOREIGN_BG"[NUMBER,22], "A"."ASCAP_BV"[NUMBER,22],        "A"."BMI_BV"[NUMBER,22], "A"."SESAC_BV"[NUMBER,22], "A"."NA_BV"[NUMBER,22], "A"."PD_BV"[NUMBER,22], "A"."FOREIGN_BV"[NUMBER,22],        "A"."ASCAP_THEME"[NUMBER,22], "A"."BMI_THEME"[NUMBER,22], "A"."SESAC_THEME"[NUMBER,22], "A"."NA_THEME"[NUMBER,22],        "A"."PD_THEME"[NUMBER,22], "A"."FOREIGN_THEME"[NUMBER,22], "A"."ASCAP_FEATURE"[NUMBER,22], "A"."BMI_FEATURE"[NUMBER,22],        "A"."SESAC_FEATURE"[NUMBER,22], "A"."NA_FEATURE"[NUMBER,22], "A"."PD_FEATURE"[NUMBER,22], "A"."FOREIGN_FEATURE"[NUMBER,22],        "A"."ASCAP_LOGO"[NUMBER,22], "A"."BMI_LOGO"[NUMBER,22], "A"."SESAC_LOGO"[NUMBER,22], "A"."NA_LOGO"[NUMBER,22],        "A"."PD_LOGO"[NUMBER,22], "A"."FOREIGN_LOGO"[NUMBER,22]   Note


  • automatic DOP: skipped because of IO calibrate statistics are missing
  • Warning: basic plan statistics not available. These are only collected when:
    • hint 'gather_plan_statistics' is used for the statement or
    • parameter 'statistics_level' is set to 'ALL', at session or system level

This looks like an ordinary hash join, not a star transformation. Is there any way you could use bitmap indexes to achieve the star transformation, which really the only way that will make such a huge fact/detail query run reasonably fast. If only there was some kind of a physical contraption that could use Bloom filter to quickly eliminate unnecessary blocks from the fact table during the full table scan. And if it use some kind of compression that would lower the number of blocks to be read, that would be an additional boost.

-- 
Mladen Gogala
Oracle DBA
http://mgogala.freehostia.com


--
http://www.freelists.org/webpage/oracle-l
Received on Tue Jan 27 2015 - 18:52:39 CET

Original text of this message