Home » RDBMS Server » Performance Tuning » Performance tuning - Help
Performance tuning - Help [message #155207] Thu, 12 January 2006 16:53 Go to next message
rajendran01
Messages: 2
Registered: January 2006
Junior Member
Hi,
I have a business objects generated SQL that queries a one of the partitions of a partitioned fact table that has
approx. 100 million records. the fact table structure is below and it is range partitioned by YYMM_ID.


DEsc  per_month_ne2_fact
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------
 SSN                                       NOT NULL VARCHAR2(9)
 REC_ID                                    NOT NULL NUMBER
 YYMM_ID                                   NOT NULL NUMBER
 ELEMENT_ID                                NOT NULL NUMBER
 UNITS_COST                                         NUMBER
 COUNT                                              NUMBER
 ESC_YR1                                            NUMBER
 ESC_YR2                                            NUMBER
 ESC_YR3                                            NUMBER



 In Dev ( Oracle 10gR1 runnig on Tru64 unix ) this takes 18 minutes and in PROD (Oracle 9.2 on Sun Solaris) 
it takes 46 seconds . I have given below the details of running this SQL against DEV and PROD. In DEV it does 
400K+  physical reads  and in PROD it does 15K+ physical reads. 
Can you please give your valuable inputs ?.  Thanks for your help.


//=====================================================================================================//
SQLDEV> set autotrace traceonly
SQLDEV> SELECT
  2    PER_ELEMENT_DIMENSION.ELEMENT_LEVEL1|| '  ' || PER_ELEMENT_DIMENSION.ELEMENT_DESC1,
  3    PER_EMF_OMF_MONTH_DIMENSION_NE.PAY_GRADE,
  4    PER_EMF_OMF_MONTH_DIMENSION_NE.MILITARY_SERVICE,
  5    sum(PER_MONTH_NE2_FACT.UNITS_COST),
  6    count(distinct(PER_MONTH_NE2_FACT.SSN)),
  7    count(distinct(PER_MONTH_NE2_FACT.SSN))/12,
  8    PER_TIME_MONTH_DIMENSION.CY_MONTH
  9  FROM
 10    PER_ELEMENT_DIMENSION,
 11    PER_EMF_OMF_MONTH_DIMENSION  PER_EMF_OMF_MONTH_DIMENSION_NE,
 12    PER_MONTH_NE2_FACT,
 13    PER_TIME_MONTH_DIMENSION
 14  WHERE
 15    ( PER_ELEMENT_DIMENSION.ELEMENT_ID=PER_MONTH_NE2_FACT.ELEMENT_ID  )
 16    AND  ( PER_TIME_MONTH_DIMENSION.YYMM_ID=PER_MONTH_NE2_FACT.YYMM_ID  )
 17    AND  ( PER_EMF_OMF_MONTH_DIMENSION_NE.EMF_ID=PER_MONTH_NE2_FACT.REC_ID  )
 18    AND  ( PER_TIME_MONTH_DIMENSION.FILE_DATE=PER_EMF_OMF_MONTH_DIMENSION_NE.FILE_DATE  )
 19    AND  ( PER_MONTH_NE2_FACT.ELEMENT_ID not in ( 311, 341, 361, 371)  )
 20      AND  (
 21    PER_TIME_MONTH_DIMENSION.CY_MONTH  =  '2004/09'
 22    )
 23  GROUP BY
 24    PER_ELEMENT_DIMENSION.ELEMENT_LEVEL1|| '  ' || PER_ELEMENT_DIMENSION.ELEMENT_DESC1, 
 25    PER_EMF_OMF_MONTH_DIMENSION_NE.PAY_GRADE, 
 26    PER_EMF_OMF_MONTH_DIMENSION_NE.MILITARY_SERVICE, 
 27    PER_TIME_MONTH_DIMENSION.CY_MONTH
 28  /

27 rows selected.

Elapsed: 00:18:00.06

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=19524 Card=11 Bytes=
          891)

   1    0   PX COORDINATOR
   2    1     PX SEND* (QC (RANDOM)) OF ':TQ10005' (Cost=19524 Card=11 :Q1005
           Bytes=891)

   3    2       SORT* (GROUP BY) (Cost=19524 Card=11 Bytes=891)        :Q1005
   4    3         PX RECEIVE* (Cost=19524 Card=11 Bytes=891)           :Q1005
   5    4           PX SEND* (HASH) OF ':TQ10004' (Cost=19524 Card=11  :Q1004
          Bytes=891)

   6    5             SORT* (GROUP BY) (Cost=19524 Card=11 Bytes=891)  :Q1004
   7    6               PX RECEIVE* (Cost=19524 Card=11 Bytes=891)     :Q1004
   8    7                 PX SEND* (HASH) OF ':TQ10003' (Cost=19524 Ca :Q1003
          rd=11 Bytes=891)

   9    8                   SORT* (GROUP BY) (Cost=19524 Card=11 Bytes :Q1003
          =891)

  10    9                     HASH JOIN* (Cost=19522 Card=37970 Bytes= :Q1003
          3075570)

  11   10                       BUFFER* (SORT)                         :Q1003
  12   11                         PX RECEIVE* (Cost=2 Card=55 Bytes=10 :Q1003
          45)

  13   12                           PX SEND* (BROADCAST) OF ':TQ10001'
           (Cost=2 Card=55 Bytes=1045)

  14   13                             TABLE ACCESS (FULL) OF 'PER_ELEM
          ENT_DIMENSION' (TABLE) (Cost=2 Card=55 Bytes=1045)

  15   10                       HASH JOIN* (Cost=19519 Card=38590 Byte :Q1003
          s=2392580)

  16   15                         PX RECEIVE* (Cost=9901 Card=477601 B :Q1003
          ytes=17193636)

  17   16                           PX SEND* (BROADCAST LOCAL) OF ':TQ :Q1002
          10002' (Cost=9901 Card=477601 Bytes=17193636)

  18   17                             HASH JOIN* (Cost=9901 Card=47760 :Q1002
          1 Bytes=17193636)

  19   18                               BUFFER* (SORT)                 :Q1002
  20   19                                 PX RECEIVE* (Cost=2 Card=1 B :Q1002
          ytes=13)

  21   20                                   PX SEND* (PARTITION (KEY))
           OF ':TQ10000' (Cost=2 Card=1 Bytes=13)

  22   21                                     TABLE ACCESS (FULL) OF '
          PER_TIME_MONTH_DIMENSION' (TABLE) (Cost=2 Card=1 Bytes=13)

  23   18                               PX PARTITION RANGE* (ITERATOR) :Q1002
           (Cost=9768 Card=22924861 Bytes=527271803)

  24   23                                 TABLE ACCESS* (FULL) OF 'PER :Q1002
          _EMF_OMF_MONTH_DIMENSION' (TABLE) (Cost=9768 Card=22924861 B
          ytes=527271803)

  25   15                         PX BLOCK* (ITERATOR) (Cost=2712 Card :Q1003
          =22227822 Bytes=577923372)

  26   25                           TABLE ACCESS* (FULL) OF 'PER_MONTH :Q1003
          _NE2_FACT' (TABLE) (Cost=2712 Card=22227822 Bytes=577923372)



   2 PARALLEL_TO_SERIAL
   3 PARALLEL_COMBINED_WITH_PARENT
   4 PARALLEL_COMBINED_WITH_PARENT
   5 PARALLEL_TO_PARALLEL
   6 PARALLEL_COMBINED_WITH_PARENT
   7 PARALLEL_COMBINED_WITH_PARENT
   8 PARALLEL_TO_PARALLEL
   9 PARALLEL_COMBINED_WITH_PARENT
  10 PARALLEL_COMBINED_WITH_PARENT
  11 PARALLEL_COMBINED_WITH_CHILD
  12 PARALLEL_COMBINED_WITH_PARENT
  13 PARALLEL_FROM_SERIAL
  15 PARALLEL_COMBINED_WITH_PARENT
  16 PARALLEL_COMBINED_WITH_PARENT
  17 PARALLEL_TO_PARALLEL
  18 PARALLEL_COMBINED_WITH_PARENT
  19 PARALLEL_COMBINED_WITH_CHILD
  20 PARALLEL_COMBINED_WITH_PARENT
  21 PARALLEL_FROM_SERIAL
  23 PARALLEL_COMBINED_WITH_CHILD  SELECT distinct TBL$OR$IDX$PART$NUM("PER_EMF
                                   _OMF_MONTH_DIMENSION", 0, d#, p#, "F

  24 PARALLEL_COMBINED_WITH_PARENT
  25 PARALLEL_COMBINED_WITH_CHILD  SELECT distinct TBL$OR$IDX$PART$NUM("PER_MON
                                   TH_NE2_FACT", 0, d#, p#, "YYMM_ID")

  26 PARALLEL_COMBINED_WITH_PARENT


Statistics
----------------------------------------------------------
        366  recursive calls
       1390  db block gets
       8739  consistent gets
     409477  physical reads
          0  redo size
       2049  bytes sent via SQL*Net to client
        437  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
         37  sorts (memory)
          8  sorts (disk)
         27  rows processed

SQLdDEV>  

SQLDEV> sho sga

Total System Global Area  419430400 bytes
Fixed Size                  1302224 bytes
Variable Size             417784112 bytes
Database Buffers                  0 bytes
Redo Buffers                 344064 bytes

sort_area_size       8388608  

sort_area_retained_size  2097152


db_cache_size             0

db_2k_cache_size          0

db_4k_cache_size          0

db_8k_cache_size          0

db_16k_cache_size         0

db_32k_cache_size         0

db_keep_cache_size        0

db_recycle_cache_size     0


sga_max_size           419430400

pre_page_sga             FALSE

lock_sga                 FALSE

sga_target               0






//=====================================================================================================================/


SQLPROD> set autotrace traceonly
SQLPROD> SELECT
  2    PER_ELEMENT_DIMENSION.ELEMENT_LEVEL1|| '  ' || PER_ELEMENT_DIMENSION.ELEMENT_DESC1,
  3    PER_EMF_OMF_MONTH_DIMENSION_NE.PAY_GRADE,
  4    PER_EMF_OMF_MONTH_DIMENSION_NE.MILITARY_SERVICE,
  5    sum(PER_MONTH_NE2_FACT.UNITS_COST),
  6    count(distinct(PER_MONTH_NE2_FACT.SSN)),
  7    count(distinct(PER_MONTH_NE2_FACT.SSN))/12,
  8    PER_TIME_MONTH_DIMENSION.CY_MONTH
  9  FROM
 10    PER_ELEMENT_DIMENSION,
 11    PER_EMF_OMF_MONTH_DIMENSION  PER_EMF_OMF_MONTH_DIMENSION_NE,
 12    PER_MONTH_NE2_FACT,
 13    PER_TIME_MONTH_DIMENSION
 14  WHERE
 15    ( PER_ELEMENT_DIMENSION.ELEMENT_ID=PER_MONTH_NE2_FACT.ELEMENT_ID  )
 16    AND  ( PER_TIME_MONTH_DIMENSION.YYMM_ID=PER_MONTH_NE2_FACT.YYMM_ID  )
 17    AND  ( PER_EMF_OMF_MONTH_DIMENSION_NE.EMF_ID=PER_MONTH_NE2_FACT.REC_ID  )
 18    AND  ( PER_TIME_MONTH_DIMENSION.FILE_DATE=PER_EMF_OMF_MONTH_DIMENSION_NE.FILE_DATE  )
 19    AND  ( PER_MONTH_NE2_FACT.ELEMENT_ID not in ( 311, 341, 361, 371)  )
 20      AND  (
 21    PER_TIME_MONTH_DIMENSION.CY_MONTH  =  '2004/09'
 22    )
 23  GROUP BY
 24    PER_ELEMENT_DIMENSION.ELEMENT_LEVEL1|| '  ' || PER_ELEMENT_DIMENSION.ELEMENT_DESC1, 
 25    PER_EMF_OMF_MONTH_DIMENSION_NE.PAY_GRADE, 
 26    PER_EMF_OMF_MONTH_DIMENSION_NE.MILITARY_SERVICE, 
 27    PER_TIME_MONTH_DIMENSION.CY_MONTH
 28  /

27 rows selected.

Elapsed: 00:00:41.07

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=804 Card=11 Bytes=95
          7)

   1    0   SORT* (GROUP BY) (Cost=804 Card=11 Bytes=957)              :Q111030
                                                                       05

   2    1     SORT* (GROUP BY) (Cost=804 Card=11 Bytes=957)            :Q111030
                                                                       04

   3    2       SORT* (GROUP BY) (Cost=804 Card=11 Bytes=957)          :Q111030
                                                                       03

   4    3         HASH JOIN* (Cost=787 Card=50382 Bytes=4383234)       :Q111030
                                                                       03

   5    4           TABLE ACCESS* (FULL) OF 'PER_ELEMENT_DIMENSION' (C :Q111030
          ost=6 Card=55 Bytes=1045)                                    01

   6    4           HASH JOIN* (Cost=780 Card=51205 Bytes=3481940)     :Q111030
                                                                       03

   7    6             NESTED LOOPS* (Cost=65 Card=483127 Bytes=1739257 :Q111030
          2)                                                           02

   8    7               TABLE ACCESS* (FULL) OF 'PER_TIME_MONTH_DIMENS :Q111030
          ION' (Cost=6 Card=1 Bytes=13)                                00

   9    7               PARTITION RANGE* (ITERATOR)                    :Q111030
                                                                       02

  10    9                 TABLE ACCESS* (FULL) OF 'PER_EMF_OMF_MONTH_D :Q111030
          IMENSION' (Cost=59 Card=1 Bytes=23)                          02

  11    6             PARTITION RANGE* (ITERATOR)                      :Q111030
                                                                       03

  12   11               TABLE ACCESS* (FULL) OF 'PER_MONTH_NE2_FACT' ( :Q111030
          Cost=585 Card=22227822 Bytes=711290304)                      03



   1 PARALLEL_TO_SERIAL            SELECT /*+ CIV_GB */ A1.C0,A1.C1,A1.C2,A1.C3
                                   ,COUNT(DISTINCT SYS_OP_CSR(A1.C4,0))

   2 PARALLEL_TO_PARALLEL          SELECT /*+ TIV_GB */ A1.C0 C0,A1.C1 C1,A1.C2
                                    C2,A1.C3 C3,SYS_OP_MSR(COUNT(DISTIN

   3 PARALLEL_TO_PARALLEL          SELECT /*+ PIV_GB */ A1.C1||'  '||A1.C0 C0,A
                                   1.C2 C1,A1.C5 C2,A1.C4 C3,SYS_OP_MSR

   4 PARALLEL_COMBINED_WITH_PARENT
   5 PARALLEL_FROM_SERIAL
   6 PARALLEL_COMBINED_WITH_PARENT
   7 PARALLEL_TO_PARALLEL          SELECT /*+ ORDERED NO_EXPAND USE_NL(A2) */ A
                                   1.C0 C0,A2.C0 C1,A2.C3 C2,A1.C1 C3,A

   8 PARALLEL_FROM_SERIAL
   9 PARALLEL_COMBINED_WITH_PARENT
  10 PARALLEL_COMBINED_WITH_PARENT
  11 PARALLEL_COMBINED_WITH_PARENT SELECT distinct TBL$OR$IDX$PART$NUM("PER_MON
                                   TH_NE2_FACT", 0, d#, p#, "YYMM_ID")

  12 PARALLEL_COMBINED_WITH_PARENT


Statistics
----------------------------------------------------------
         10  recursive calls
         16  db block gets
       7655  consistent gets
      15060  physical reads
          0  redo size
       1741  bytes sent via SQL*Net to client
        282  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
         78  sorts (memory)
          0  sorts (disk)
         27  rows processed

SQLPROD> sho sga

Total System Global Area  753190408 bytes
Fixed Size                   730632 bytes
Variable Size             620756992 bytes
Database Buffers          131072000 bytes
Redo Buffers                 630784 bytes

sort_area_size       8388608  

sort_area_retained_size  2097152


db_cache_size             0

db_2k_cache_size          0

db_4k_cache_size          0

db_8k_cache_size          0

db_16k_cache_size         0

db_32k_cache_size         0

db_keep_cache_size        0

db_recycle_cache_size     0


sga_max_size           753190408

pre_page_sga             FALSE

lock_sga                 FALSE

sga_target               0



//======================================================================================================================/

[Updated on: Fri, 13 January 2006 07:32] by Moderator

Report message to a moderator

Re: Performance tuning - Help [message #155269 is a reply to message #155207] Fri, 13 January 2006 07:50 Go to previous messageGo to next message
Mahesh Rajendran
Messages: 10707
Registered: March 2002
Location: oracleDocoVille
Senior Member
Account Moderator
Your devlopment is 10g and production is 9i.
So, this is not a fair comparison. 10g CBO is very diffent from 9i.
sga_max_size is different.
How did you refresh the data from prod to dev?
How did you gather statistics in 10g and in 9i?
Re: Performance tuning - Help [message #155298 is a reply to message #155207] Fri, 13 January 2006 11:49 Go to previous messageGo to next message
smartin
Messages: 1803
Registered: March 2005
Location: Jacksonville, Florida
Senior Member
On top of what Mahesh said, also interesting is that consistent gets are very similar in your two autotrace results, it is only the physical that shows the dramatic change. Possible differences in hardware platforms? in system configuration? statistics and cpu costing? Parallel processing parameters and config?

Also why an sga target of ZERO in 10g?

Also for comparison purposes, you may want to try them both in serial mode, no parallel, at least to compare the execution plans. Plans are much more complicated when you involve parallel, and you may be able to spot some useful differences.
Re: Performance tuning - Help [message #155947 is a reply to message #155207] Thu, 19 January 2006 16:40 Go to previous messageGo to next message
SQLAREA
Messages: 21
Registered: January 2006
Location: Belgium
Junior Member
Hi

Since the results of both queries are equal and very likely the arraysize and the blocksize are equal it is rather normal the number of consistent gets are similar.

sga_target=0 and db_cache_size = 0
You must have configured the old fashionned db_block_buffers
Q1. Are they equal on both servers ?

On prod you have more sorts on disk
Q2 Are you running dedicated servers and do you work with workarea_size_policy=auto ?
If the answer is yes , is the pga_aggregate_target equally sized on both servers ?
If you are running dedicated servers and if you work with workarea_size_policy=manual, > is the hash_area_size the same on both platforms.
Your hash joins seems to be far more expensive on your prod server.

Q3. Since you face full table scans. Have you compared the the number of blocks in dba_tables for the tables concerned ?

Q4. Since you face full table scans is db_file_multiblock_read count equally sized on both platforms ?

Q5. I assume indexes are the same

Regards
Guy Lambregts
Re: Performance tuning - Help [message #156073 is a reply to message #155207] Fri, 20 January 2006 11:25 Go to previous messageGo to next message
rajendran01
Messages: 2
Registered: January 2006
Junior Member
Thanks all for your inputs.

In Dev the work_area_policy was set to auto .

I set work_area_policy = auto
sort_area_size = 16mb (was 8 mb)
hash_area_size =32 mb ( was 16 mb)

and ran the query. Now the physical reads (reduced to 8k+ from 400k +) and consistent gets are caomparable to the prod .

The query now takes 1 min 11 secs . it took 18 minutes before.

I greatly appreciate your help.

Raj
Re: Performance tuning - Help [message #156117 is a reply to message #155207] Sun, 22 January 2006 06:52 Go to previous message
SQLAREA
Messages: 21
Registered: January 2006
Location: Belgium
Junior Member
Raj,

Glad we were able to help you.

However let' s take a look to my question 2
...
Q2 Are you running dedicated servers and do you work with workarea_size_policy=auto ?
If the answer is yes , is the pga_aggregate_target equally sized on both servers ?
...
If you are running with <b>workarea_size_policy=auto</b> the sizing of the workarea' s for sorts, hash_joins, ... is done by the <b>pga_aggregate_target</b> and not by sort_area_size neither hash_area_size assuming users connect in <b>dedicated server</b> mode "select username,server from v$session".

If you are running with <b>workarea_size_policy=auto</b> -as far as I know- the sizing of the sort_area, hash_area is only done with sort_area_size, hash_area_size for users connect in <b>shared server</b> mode.

Regards
Guy Lambregts
Previous Topic: V$LibraryCache
Next Topic: Tuning: multiple table inner join query with partitioning
Goto Forum:
  


Current Time: Thu Apr 25 10:44:18 CDT 2024