Home -> Community -> Usenet -> c.d.o.server -> Forall performance
Hello all.
I am working on a batch data interface. The batches average about 150,000 rows, and it is important that each record be evaluated separately. As such, I wrote code that fetches a single row from a temp table, performs validation and a little scrubbing, then inserts the row, repeating for all 150k. I do this to trap errors at the row level.
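For anyone curious, the row-by-row approach looks roughly like this (a minimal sketch; the staging table name and columns are placeholders, not my real ones, and only two of the 15 columns are shown):

    DECLARE
      CURSOR c_stage IS SELECT * FROM temp_stage;  -- hypothetical staging table
    BEGIN
      FOR r IN c_stage LOOP
        BEGIN
          -- validate / scrub r here, then insert the single row
          INSERT INTO sometable (id, ma_id) VALUES (r.id, r.ma_id);
        EXCEPTION
          WHEN OTHERS THEN
            NULL;  -- error trapped at the row level; in practice, log it and continue
        END;
      END LOOP;
      COMMIT;
    END;

The per-row BEGIN/EXCEPTION block is what lets one bad row fail without aborting the other 149,999.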
Configuration:
Oracle 8.1.7
Linux
SGA :
Shared: 278MB
Buffer Cache: 156MB
Large Pool: 1MB
Wanting to improve performance, I started testing an approach that builds PL/SQL tables and uses the FORALL statement. Everything remained the same except that instead of inserting each row into the db table, I appended it to PL/SQL tables (I have to use many of them, as 8.1.7 does not allow FORALL with tables of records). The FORALL now resides at the end of the procedure.
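In outline, the restructured version looks like this (a sketch only; staging table and column names are placeholders, and just two of the 15 parallel collections are shown):

    DECLARE
      TYPE t_num IS TABLE OF NUMBER INDEX BY BINARY_INTEGER;
      temp_id    t_num;   -- one scalar collection per column, 15 in all,
      temp_ma_id t_num;   -- since 8.1.7 FORALL cannot bind a table of records
      i PLS_INTEGER := 0;
    BEGIN
      FOR r IN (SELECT * FROM temp_stage) LOOP  -- hypothetical staging table
        -- validate / scrub r, then append to each collection
        i := i + 1;
        temp_id(i)    := r.id;
        temp_ma_id(i) := r.ma_id;
      END LOOP;
      -- single bulk bind at the end of the procedure
      FORALL j IN 1..i
        INSERT INTO sometable (id, ma_id) VALUES (temp_id(j), temp_ma_id(j));
      COMMIT;
    END;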
What I observe is no performance gain or an insignificant difference. So I thought extensively on what could be the problem. All the examples posted here, Ask Tom, etc. show significant gains, but all the examples use simple one or two column insert statements. My FORALL has 15 columns and looks like this:
FORALL i IN p_starting_point..temp_ma_id.COUNT
  INSERT /*+ APPEND */ INTO sometable VALUES (
    temp_id(i),     -- number(18), values are 10 digits
    temp_ma_id(i),  -- number(18), values are 10 digits
    temp_amt(i),    -- number(9,2)
    temp_cdm_id(i), -- number(18), values are 10 digits
    temp_hcpcs(i),  -- varchar2(5)
    temp_mod1(i),   -- varchar2(2)
    temp_mod2(i),   -- varchar2(2)
    temp_mod3(i),   -- varchar2(2)
    temp_mod4(i),   -- varchar2(2)
    temp_post(i),   -- date
    temp_qty(i),    -- number(4)
    temp_svcdt(i),  -- date
    temp_revcd(i),  -- varchar2(3)
    temp_datec(i),  -- date
    NULL );
The target table has about 9 million rows, a single-column primary key index, and two single-column foreign key indexes. It is growing by roughly 1 million rows per month.
I then tweaked the code a bit to fire the FORALL insert after every x rows (then empty the arrays) and saw a significant improvement in performance (about 40% better: 150k rows now takes about 3.2 minutes). I varied x from 1,000 all the way to 50,000 to find the optimal value, and found that the smallest, 1,000, performs best. I haven't tried values below that.
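The batched variant differs only in flushing the collections periodically instead of holding all 150k rows. A sketch (flush size and the local procedure name are mine; again only two columns shown):

    DECLARE
      TYPE t_num IS TABLE OF NUMBER INDEX BY BINARY_INTEGER;
      temp_id      t_num;
      temp_ma_id   t_num;
      c_flush_size CONSTANT PLS_INTEGER := 1000;  -- the x that tested best

      PROCEDURE flush IS
      BEGIN
        FORALL j IN 1..temp_id.COUNT
          INSERT INTO sometable (id, ma_id) VALUES (temp_id(j), temp_ma_id(j));
        temp_id.DELETE;     -- empty the arrays so COUNT restarts at 0
        temp_ma_id.DELETE;
      END;
    BEGIN
      FOR r IN (SELECT * FROM temp_stage) LOOP  -- hypothetical staging table
        temp_id(temp_id.COUNT + 1)       := r.id;
        temp_ma_id(temp_ma_id.COUNT + 1) := r.ma_id;
        IF temp_id.COUNT >= c_flush_size THEN
          flush;
        END IF;
      END LOOP;
      flush;  -- insert the final partial batch
      COMMIT;
    END;

Keeping the arrays small bounds the PGA memory each batch consumes, which presumably explains why the smaller flush size won.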
I was somewhat surprised that this would be necessary, although there is an obvious tradeoff of memory at stake.
Does anyone else have similar results? Are there other considerations for FORALL usage? Received on Thu Feb 13 2003 - 12:10:18 CST