Re: Inserting billions of rows faster

From: Lok P <loknath.73_at_gmail.com>
Date: Sun, 28 Mar 2021 11:14:02 +0530
Message-ID: <CAKna9VbxaRjTPKvsevSdtvr6DTpaJhVkx=SuP42QkcbOC+YCkQ_at_mail.gmail.com>



The delete we are currently using to remove the duplicates from the target is shown below, but it is taking a lot of time. Is there any faster way to achieve this?

DECLARE
    type rowidArray is table of rowid index by binary_integer;

    type emRec is record
        (rowid rowidArray);

    l_record     emRec;
    l_array_size number := 10000;
    l_done       boolean;
    l_cnt        number := 1;
    l_processed  number := 0;

    cursor c is
        select rp.rowid
          from dp, rp
         where .....................;
BEGIN
    dbms_application_info.set_module('Deletes ', 'Starting...');

    open c;
    loop
        dbms_application_info.set_action('Processing '||l_cnt||' thru '||(l_cnt + l_array_size - 1)||' ('||l_processed||')');

        -- fetch the next batch of rowids
        fetch c bulk collect into l_record.rowid LIMIT l_array_size;
        l_done := c%notfound;

        -- delete the batch in a single bulk-bound statement
        forall i in 1 .. l_record.rowid.count
            delete from ...........
             where rowid = l_record.rowid(i);

        l_processed := l_processed + SQL%ROWCOUNT;
        commit;

        exit when (l_done);
        l_cnt := l_cnt + l_array_size;
    end loop;
    close c;

    dbms_application_info.set_action('Processed '||l_processed||' deletes FROM .......');
end;
/
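
One alternative we have been wondering about, instead of the row-by-row fetch loop above, is a single set-based delete run as parallel DML. The following is only a sketch: it assumes the delete target is RP, reuses the same (elided) join condition, and assumes parallel DML is permitted on the table; the degree of 8 is arbitrary, and a single transaction of that size would of course need enough undo to hold it:

alter session enable parallel dml;

-- one statement instead of the cursor loop; rowids from the same join
delete /*+ parallel(8) */ from rp
 where rowid in ( select rp.rowid
                    from dp, rp
                   where ..................... );

commit;

Not sure whether that is workable given the undo limits we already hit on the insert side, hence the question.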

On Sun, Mar 28, 2021 at 11:07 AM Lok P <loknath.73_at_gmail.com> wrote:

> Hi Listers, we have a process that loads ~50 million rows daily (using
> direct path insert, i.e. INSERT /*+ APPEND */) into a target table which
> holds ~20 billion rows (~1TB in size) and is list-hash composite
> partitioned, with ~2 list partitions and ~2048 hash subpartitions. It has
> 4 indexes, one of which is a composite primary key comprising 5 columns.
> That data load job had been failing unnoticed for ~15-20 days, so roughly
> a billion rows accumulated, and the load is now failing with ORA-01628
> (max # extents (32765) reached for rollback segment).
>
> So we tried running it as a bulk collect load (APPEND_VALUES hint),
> committing in chunks of ~1 million rows. With that approach the load was
> significantly slower, inserting ~1 million rows in ~40 minutes. We could
> not pin down the reason, but the wait events were all "cell single block
> physical read" and the object was the target table, i.e. it was the
> INSERT itself that was struggling. We have now ended up with partial data
> loaded to the target, around ~40-50 million rows. As the table has a
> primary key, we have to delete those ~40-50 million rows from the target
> table and then reload them.
>
> I was thinking we could do the delete with bulk collect in chunks of
> ~1 million rows, but that would run in a single thread and be
> significantly slow. So what is the best way to delete those ~40-50
> million rows from the target table (which holds ~20 billion rows in
> total)?
>
> I also considered making the indexes unusable and then doing the delete
> and the data load (which would generate almost zero undo in the absence
> of the indexes), but the DELETE would need the PK index to fetch the
> rowids, so we cannot get rid of the PK index. What is the best approach
> for the delete and the data load here without breaking data consistency
> in the target table?
>
> So I wanted to understand: what is the best/fastest approach for the
> delete and the data load in this situation?
>
> Regards
> Lok
>
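
For anyone skimming, the two load variants described above are roughly of the following shape; the table names, column lists and chunk size here are placeholders rather than our real objects:

-- the regular daily load: a single-statement direct-path insert
insert /*+ append */ into target_tab
select ..........
  from source_tab;
commit;

-- the retried chunked load: bulk collect plus the APPEND_VALUES hint,
-- committing after every chunk
declare
    cursor c is select .......... from source_tab;
    type t_rows is table of c%rowtype;
    l_rows t_rows;
begin
    open c;
    loop
        fetch c bulk collect into l_rows limit 1000000;
        exit when l_rows.count = 0;
        forall i in 1 .. l_rows.count
            insert /*+ append_values */ into target_tab values l_rows(i);
        -- a direct-path insert must be committed before the next batch
        commit;
    end loop;
    close c;
end;
/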

--
http://www.freelists.org/webpage/oracle-l
Received on Sun Mar 28 2021 - 07:44:02 CEST
