Re: IO wait during backup

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Tue, 12 May 2020 13:46:36 -0400
Message-ID: <f078e440-2ba8-cad7-1bcb-a7c206803957_at_gmail.com>



Looks like too much parallelism. Your backup is waiting on both reads and writes but mostly on reads. How many channels are you using? Also, it looks like the array is VNX. How is it attached? Is it iSCSI or F/C? If it's iSCSI, do you have iSCSI HBA boards or just plain simple 10 Gb Ethernet? VNX is not an enterprise class array for DB servers. You should consider something like Pure (cheaper) or XTremIO (more expensive).

On 5/12/20 12:56 AM, manikandan wrote:
> Hi,
>
> Env is 2 node RAC (12.2 GI + RDBMS) on Linux.
>
> We are seeing significant amount of IO wait during level 0 backups
> (weekly once) and level 1 backups (daily). RMAN backup is running on
> Node A and Golden Gate is running on Node B. The below are some artifacts
>
> *iostat
> ------*
>
> *Node A*
>
> Device:         rrqm/s   wrqm/s     r/s   w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>
> sda               0.00     0.00    0.00   69.00     0.00 600.50  
>  17.41     0.00    0.03    0.00    0.03   0.03   0.20
>
> sdb               0.00     0.00    0.00    0.00     0.00 0.00     0.00
>     0.00    0.00    0.00    0.00   0.00   0.00
>
> sdc               0.00     0.00    7.00    0.00  3584.00 0.00  1024.00
>     1.29  165.57  165.57    0.00  71.86  50.30
>
> sdd               0.00     0.00    6.00    0.00  3088.00 0.00  1029.33
>     1.00  213.67  213.67    0.00  39.00  23.40
>
> sde               0.00     0.00   14.00    0.00  1541.50 0.00   220.21
>     0.91   49.71   49.71    0.00  33.93  47.50
>
> sdf               0.00     0.00    4.00    0.00  3072.00 0.00  1536.00
>     0.84  176.25  176.25    0.00 144.75  57.90
>
> sdg               0.00     0.00    5.00    0.00  2560.00 0.00  1024.00
>     0.25   70.40   70.40    0.00  40.80  20.40
>
> sdh               0.00     0.00    6.00    0.00  3072.00 0.00  1024.00
>     1.15  166.83  166.83    0.00  73.33  44.00
>
> sdi               0.00     0.00    7.00    0.00  3584.00 0.00  1024.00
>     1.08  129.43  129.43    0.00  62.57  43.80
>
> sdj               0.00     0.00    8.00    0.00  4096.00 0.00  1024.00
>     0.46  120.50  120.50    0.00  29.75  23.80
>
> sdk               0.00     0.00   10.00    0.00  5120.00 0.00  1024.00
>     0.97   86.20   86.20    0.00  58.40  58.40
>
> sdl               0.00     0.00   11.00    0.00  5120.00 0.00   930.91
>     1.25   83.91   83.91    0.00  48.09  52.90
>
> sdm               0.00     0.00    7.00    0.00  3080.00 0.00   880.00
>     0.89  198.00  198.00    0.00  65.00  45.50
>
> sdn               0.00     0.00   10.00    0.00  4096.50 0.00   819.30
>     0.65   76.20   76.20    0.00  41.30  41.30
>
> sdo               0.00     0.00   16.00    0.00  7168.00 0.00   896.00
>     1.21   69.94   69.94    0.00  35.00  56.00
>
> ………….
>
> ………….
>
> emcpowerc         0.00     0.00   50.00    1.00 24608.00  16.00  
> 965.65     3.28   64.35   64.90   37.00  10.39  53.00
>
> emcpowerf         0.00     0.00   52.00    0.00 24632.00 0.00   947.38
>     2.86   55.02   55.02    0.00   9.60  49.90
>
> emcpowerd         0.00     0.00  140.00    0.00 28714.00 0.00   410.20
>     6.93   49.21   49.21    0.00   7.10  99.40
>
> emcpowerb         0.00     0.00   64.00    0.00 32768.00 0.00  1024.00
>     3.56   55.62   55.62    0.00   5.81  37.20
>
> emcpowere         0.00     0.00   51.00    0.00 24624.00 0.00   965.65
>     2.66   52.14   52.14    0.00   7.73  39.40
>
> emcpowerg         0.00     0.00   64.00    0.00 32768.00 0.00  1024.00
>     3.42   56.77   56.77    0.00   5.77  36.90
>
> emcpowerbw        0.00     0.00   59.00    0.00 28696.00 0.00   972.75
>     3.83   56.97   56.97    0.00   9.34  55.10
>
> emcpowerh         0.00     0.00   57.00    0.00 28688.00 0.00  1006.60
>     3.15   52.25   52.25    0.00   6.16  35.10
>
> emcpowerbv        0.00     0.00   56.00    0.00 28672.00 0.00  1024.00
>     3.49   56.66   56.66    0.00   8.18  45.80
>
> emcpowerj         0.00     0.00   65.00    0.00 32776.00 0.00  1008.49
>     3.64   56.15   56.15    0.00   6.46  42.00
>
> emcpowerbu        0.00     0.00   64.00    0.00 32768.00 0.00  1024.00
>     3.38   56.25   56.25    0.00   5.58  35.70
>
> emcpoweri         0.00     0.00   64.00    0.00 32768.00 0.00  1024.00
>     3.81   56.34   56.34    0.00   7.50  48.00
>
> emcpowerbt        0.00     0.00   57.00    0.00 28680.00 0.00  1006.32
>     3.10   54.84   54.84    0.00   7.21  41.10
>
> emcpowerl         0.00     0.00   56.00    0.00 28672.00 0.00  1024.00
>     2.91   51.95   51.95    0.00   7.25  40.60
>
> emcpowerbs        0.00     0.00   67.00    0.00 34304.00 0.00  1024.00
>     3.76   57.18   57.18    0.00   5.25  35.20
>
>
>
> *Node B
> ------*
>
> Device:         rrqm/s   wrqm/s     r/s   w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>
> sda               0.00     0.00    0.00    0.00     0.00 0.00     0.00
>     0.00    0.00    0.00    0.00   0.00   0.00
>
> sdez              0.00     0.00    0.00    0.00     0.00 0.00     0.00
>     0.00    0.00    0.00    0.00   0.00   0.00
>
> sdfa              0.00     0.00   18.00    0.00   128.00  24.00  
>  16.89     1.24   82.00   82.00    0.00  39.00  70.20
>
> sdfb              0.00     0.00   17.00    1.00   120.00  16.00  
>  15.11     1.41   73.50   73.00   82.00  45.33  81.60
>
> sdfc              0.00     0.00    8.00    1.00    40.00  16.00  
>  12.44     0.90   76.22   75.75   80.00  56.22  50.60
>
> sdfd              0.00     0.00   10.00    1.00    80.00  16.00  
>  17.45     0.69   71.36   70.80   77.00  31.36  34.50
>
> sdfe              0.00     0.00   12.00    3.00    96.00  24.00  
>  16.00     0.53   70.00   70.08   69.67  22.93  34.40
>
> sdff              0.00     0.00   12.00    0.00    88.00 4.00    15.33
>     0.84   93.50   93.50    0.00  42.25  50.70
>
> sdfg              0.00     0.00    8.00    2.00    64.00 136.00  
>  40.00     0.61   66.40   66.12   67.50  30.80  30.80
>
> sdfh              0.00     0.00   20.00    1.00   120.00 8.00    12.19
>     1.62   73.19   74.05   56.00  39.29  82.50
>
> sdfi              0.00     0.00    9.00    0.00    64.00 0.00    14.22
>     0.54   92.67   92.67    0.00  42.89  38.60
>
> …………
>
> ………..
>
> emcpoweraw        0.00     0.00   14.00    1.00   112.00  16.00  
>  17.07     3.69  291.93  292.36  286.00  61.40  92.10
>
> emcpowerav        0.00     0.00    9.00    0.00    72.00 0.00    16.00
>     2.58  291.78  291.78    0.00  94.44  85.00
>
> emcpowerax        0.00     0.00    5.00    2.00    40.00  16.00  
>  16.00     2.66  288.71  291.60  281.50 131.57  92.10
>
> emcpowerbi        0.00     0.00    5.00    1.00    40.00 8.00    16.00
>     1.96  297.17  298.80  289.00 139.67  83.80
>
> emcpowerbh        0.00     0.00    8.00    0.00    64.00 0.00    16.00
>     2.43  293.88  293.88    0.00 115.25  92.20
>
> emcpowerbg        0.00     0.00    5.00    0.00    40.00 0.00    16.00
>     1.60  289.40  289.40    0.00 179.60  89.80
>
> emcpowerbf        0.00     0.00    5.00    0.00    40.00 0.00    16.00
>     2.07  293.40  293.40    0.00 184.00  92.00
>
> emcpowerbe        0.00     0.00   11.00    0.00    88.00 0.00    16.00
>     3.12  294.18  294.18    0.00  89.27  98.20
>
> emcpowerbd        0.00     0.00    4.00    1.00    32.00 8.00    16.00
>     0.85  283.40  279.75  298.00 132.80  66.40
>
> emcpowerbc        0.00     0.00    6.00    1.00    48.00 8.00    16.00
>     2.29  288.71  289.67  283.00 128.43  89.90
>
> emcpowerbb        0.00     0.00   10.00    0.00    80.00 0.00    16.00
>     3.02  291.80  291.80    0.00 100.00 100.00
>
> emcpowerba        0.00     0.00   10.00    0.00    80.00 0.00    16.00
>     2.83  292.80  292.80    0.00  84.20  84.20
>
> emcpoweraz        0.00     0.00    8.00    1.00    64.00 8.00    16.00
>     2.26  291.89  291.75  293.00 111.11 100.00
>
> emcpoweray        0.00     0.00    5.00    1.00    40.00 8.00    16.00
>     1.25  288.67  286.80  298.00  94.50  56.70
>
> *vmstat
> ------*
>
> *Node A*
>
> SNAP_INTERVAL 30
>
> CPU_CORES 18
>
> VCPUS 72
>
> OSWBB_ARCHIVE_DEST /ora45/dbworkspace/OSWATCHER/oswbb/archive
>
> zzz ***Sat May 9 01:00:15 EDT 2020
>
> procs -----------memory---------- ---swap-- -----io---- -system--
> ------cpu-----
>
> r  b   swpd   free   buff  cache   si   so    bi    bo   in cs us sy
> id wa st
>
> 6 123      0 32364372 303780 40037240    0    0   561   153  0    0  3
>  1 94  3  0
>
> 1 125      0 32371232 303780 40037456    0    0 41317 30810 35987
> 45850  4  1 27 67  0
>
> 5 123      0 32371060 303780 40037644    0    0 30972 23008 43202
> 56629  3  1 27 69  0
>
> zzz ***Sat May 9 01:00:45 EDT 2020
>
> procs -----------memory---------- ---swap-- -----io---- -system--
> ------cpu-----
>
> r  b   swpd   free   buff  cache   si   so    bi    bo   in cs us sy
> id wa st
>
> 4 119      0 32338792 303780 40038420    0    0   561   153  0    0  3
>  1 94  3  0
>
> 2 124      0 32329172 303780 40038840    0    0 12392  9241 18276
> 20114  3  1 30 66  0
>
> 3 118      0 32334116 303780 40038836    0    0 16690 11341 17579
> 19183  3  2 32 64  0
>
> *Node B*
>
> SNAP_INTERVAL 30
>
> CPU_CORES 18
>
> VCPUS 72
>
> OSWBB_ARCHIVE_DEST /ora45/dbworkspace/OSWATCHER/oswbb/archive
>
> zzz ***Sat May 9 01:00:15 EDT 2020
>
> procs -----------memory---------- ---swap-- -----io---- -system--
> ------cpu-----
>
> r  b   swpd   free   buff  cache   si   so    bi    bo   in cs us sy
> id wa st
>
> 6 123      0 32364372 303780 40037240    0    0   561   153  0    0  3
>  1 94  3  0
>
> 1 125      0 32371232 303780 40037456    0    0 41317 30810 35987
> 45850  4  1 27 67  0
>
> 5 123      0 32371060 303780 40037644    0    0 30972 23008 43202
> 56629  3  1 27 69  0
>
> zzz ***Sat May 9 01:00:45 EDT 2020
>
> procs -----------memory---------- ---swap-- -----io---- -system--
> ------cpu-----
>
> r  b   swpd   free   buff  cache   si   so    bi    bo   in cs us sy
> id wa st
>
> 4 119      0 32338792 303780 40038420    0    0   561   153  0    0  3
>  1 94  3  0
>
> 2 124      0 32329172 303780 40038840    0    0 12392  9241 18276
> 20114  3  1 30 66  0
>
> 3 118      0 32334116 303780 40038836    0    0 16690 11341 17579
> 19183  3  2 32 64  0
>
>
> *From AWR during Level 0 backup*
>
> *Top 10 Foreground Events by Total Wait Time*
>
> *Event*
>
>
>
> *Waits*
>
>
>
> *Total Wait Time (sec)*
>
>
>
> *Avg Wait*
>
>
>
> *% DB time*
>
>
>
> *Wait Class*
>
> db file sequential read
>
>
>
> 124,916
>
>
>
> 8451.4
>
>
>
> 67.66ms
>
>
>
> 83.9
>
>
>
> User I/O
>
> control file sequential read
>
>
>
> 25,283
>
>
>
> 142.3
>
>
>
> 5.63ms
>
>
>
> 1.4
>
>
>
> System I/O
>
> DB CPU
>
>
>
>
>
> 87.3
>
>
>
>
>
> .9
>
>
>
> log file sync
>
>
>
> 158
>
>
>
> 41
>
>
>
> 259.41ms
>
>
>
> .4
>
>
>
> Commit
>
> gc current block 2-way
>
>
>
> 131,697
>
>
>
> 13.7
>
>
>
> 104.37us
>
>
>
> .1
>
>
>
> Cluster
>
> name-service call wait
>
>
>
> 82
>
>
>
> 12.4
>
>
>
> 150.77ms
>
>
>
> .1
>
>
>
> Other
>
> gc cr grant 2-way
>
>
>
> 62,345
>
>
>
> 7.3
>
>
>
> 117.49us
>
>
>
> .1
>
>
>
> Cluster
>
> cursor: pin S wait on X
>
>
>
> 617
>
>
>
> 5.8
>
>
>
> 9.40ms
>
>
>
> .1
>
>
>
> Concurrency
>
> Sync ASM rebalance
>
>
>
> 9
>
>
>
> 4.9
>
>
>
> 545.95ms
>
>
>
> .0
>
>
>
> Other
>
> library cache lock
>
>
>
> 446
>
>
>
> 3.8
>
>
>
> 8.54ms
>
>
>
> .0
>
>
>
> Concurrency
>
>
>
> I tried to collect MML wait during Level 1 backup window as I could
> not collect the same during Level 0 backup.
>
> *Before starting Level 0*
>
>
>
> Backup: MML write backup piece 301955.357264
>
> Backup: MML initialization             0.001998
>
> Backup: MML delete backup piece        79.890461
>
> Backup: MML extended initialization    218.460778
>
> Backup: MML commit backup piece        452.491904
>
> Backup: MML shutdown                   642.291437
>
> Backup: MML create a backup piece      52.15988
>
> Backup: MML query backup piece         159.223808
>
>
>
> Total                                  303559.87753
>
>
>
> *After*
>
>
>
> Backup: MML write backup piece 304953.207321
>
> Backup: MML initialization             0.002076
>
> Backup: MML delete backup piece        79.890461
>
> Backup: MML extended initialization    227.604029
>
> Backup: MML commit backup piece        480.266509
>
> Backup: MML shutdown                   657.297723
>
> Backup: MML create a backup piece      83.378908
>
> Backup: MML query backup piece         159.562624
>
>
>
> Total                                  306650.213124
>
>
>
>
>
> 3090.34             MML           (306650.213124 - 303559.87753)
>
> 4392                Elapsed       (2540 + 1852)
>
> 1301.66             RMAN         (Elapsed - MML)
>
> 70.36               Percent spent on MML            70% time spent on MML
>
>
>
> Archive log backup also started during ID backup, so elapsed time
> calculated as sum of ID and Archive.
>
>
>
> 5/10/2020 11:01:55 PM     5/10/2020 11:44:15 PM     3374.81      
> 313.02 DB INCR      2540
>
> 5/10/2020 11:13:04 PM     5/10/2020 11:43:56 PM     105.58    105.59  
>      ARCHIVELOG   1852
>
>
> We only see this high load on the server when the backups are running.
>
> I am trying to isolate where is the IO bottleneck before i talk to
> system admin, is it on Server side or storage side ? . Also is there
> any utilities i can use to monitor IO waits as a non-root user. Please
> suggest.
>
> Thanks,
> Mani

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217


--
http://www.freelists.org/webpage/oracle-l
Received on Tue May 12 2020 - 19:46:36 CEST

Original text of this message