Re: Performance of Multiple Inserts

From: <Will>
Date: 1995/08/11
Message-ID: <40eee7$9il@data.interserv.net>


> droberts@seas.gwu.edu (David Roberts) writes:
> A friend of mine has written a simple Oracle program that he thinks
> should run in a couple of hours, that now appears to need 48 days to
> complete! I offered to ask for advice here....
>
> The program is a simple load utility, loading a lot of payroll data into
> a database. The program is generating test data, so it just does
> inserts. At present, it COMMITs after every insert. In an effort to
> speed things up, the program was run simultaneously from four clients
> accessing the same server, and there was about a 3x speedup, but we are
> still talking about 16 days to build our test database!
>
> A couple of questions--
>
> 1. I suspected that the program might run faster if it inserted about a
> whole pageful of data and then COMMITted, rather than issuing a COMMIT
> after each INSERT. What do you think?
>
> 2. Would that same strategy help for the multiple-client case? Are the
> multiple clients each going to work with the same page of data, hence
> have to wait on each other, or will Oracle give each of them a separate page?
>
> Thanks
>
> Dave Roberts
>

Here are a couple of ideas:

If you are inserting only (i.e. not updating rows), use SQL*Loader, preferably with the direct-path option. I know of a site that loaded 60 million rows in 45 minutes using this option.
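
To give the flavor of it, a minimal control file might look like this (the file, table, and column names are made up to match the pseudocode below):

   -- payroll.ctl
   LOAD DATA
   INFILE 'payroll.dat'
   APPEND
   INTO TABLE payroll
   FIELDS TERMINATED BY ','
   (pk_col, col1, col2)

and you would run it with something like this (the executable name and exact options vary a little by platform and Oracle version):

   sqlldr userid=scott/tiger control=payroll.ctl direct=true

The direct path bypasses most of the SQL layer and writes formatted blocks straight into the datafiles, which is where the big speedup comes from.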

If you need to update rows that already exist and insert the ones that don't, use Pro*C with embedded PL/SQL and array processing. Here's some pseudocode:

while( not done )
{
   /* read up to 500 records into the host arrays
      :col1, :col2, ..., :pk_col */

   exec sql execute

      begin
         for i in 1 .. 500 loop
            update table
            set    col1 = :col1(i),
                   col2 = :col2(i),
                   ...
            where  pk_col = :pk_col(i);

            if sql%rowcount = 0 then
                insert into table
                   (col1, col2, ...)
                values
                   (:col1(i), :col2(i), ...);
            end if;
         end loop;

         commit;
      end;

   end-exec;
}
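
(One detail for real code: the last batch will usually come up short, so read the actual number of rows into another host variable and loop to that instead of a hard-coded 500.)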

Array processing helps performance a lot. I've seen it cut execution time in a networked environment by a factor of 2-3.

Whatever you do, do not commit after each insert. You should see a significant improvement just from "batching" commits.

You can drop the indexes, load the data, and recreate the indexes, but this will only improve performance a little. If you drop the indexes, there's always a chance they won't get recreated. I'd trade a couple of percent of performance for peace of mind and simplicity any day.
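
If you do try it, the mechanics are simple (the index name and column are made up):

   -- before the load
   drop index payroll_idx;

   -- after the load completes
   create index payroll_idx on payroll (pk_col);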

A BIG mistake to avoid is delete/insert. I've seen a lot of code that deletes the row it wants to insert (in case it already exists) and then inserts it. This hurts performance a lot. If you have existing data in the table, it's much faster to update the row and fall back to an insert when the row isn't found, as in the example above.
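
For contrast, here is the pattern to avoid, using the same placeholder names as the pseudocode above:

   -- slow: every row pays for a delete (index maintenance, rollback)
   -- plus an insert, even when the row didn't exist yet
   delete from table where pk_col = :pk_col(i);

   insert into table (col1, col2, ...)
   values (:col1(i), :col2(i), ...);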

Another thing to check is indexes. If you are doing an UPDATE ... WHERE, make sure the WHERE clause hits an index so each update is an indexed search rather than a full-table scan.
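
One way to check is EXPLAIN PLAN (this assumes you have created PLAN_TABLE by running the utlxplan.sql script that ships with Oracle; the statement ID and names are just examples):

   explain plan set statement_id = 'upd1' for
      update payroll set col1 = 'x' where pk_col = 12345;

   select operation, options, object_name
   from   plan_table
   where  statement_id = 'upd1';

If the plan shows TABLE ACCESS FULL instead of an index scan, the update is not using your index.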

Hope this all makes sense.

If you have any questions, feel free to email me.

Will Kooiman,
Computer Systems Authority.