Re: Inserting billions of rows faster

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Mon, 29 Mar 2021 23:22:36 -0400
Message-ID: <ba062424-d966-5b31-c2c0-05632981cb0b_at_gmail.com>


You need some major hardware to do that: lots of cores and extremely fast storage, probably XtremIO X2 or Fujitsu ETERNUS, something that approaches 20,000 IOPS. As much as I hate throwing hardware at problems, ingesting huge quantities of data still requires serious hardware. I am not talking about Exadata; Exadata is a DW machine. You may also try loading into a separate table and doing a partition exchange.
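
A rough sketch of that load-and-exchange pattern, purely illustrative and with made-up names (tgt_big for the list-hash table described in the quoted message, stg_daily for the staging table, cust_id for the hash key, daily_feed for the source, p_region_a for the exchanged partition). For a list-hash target the staging table has to be hash partitioned the same way as the subpartitions, and the exchange only fits cleanly when the incoming data makes up the whole partition:

-- Staging table hash-partitioned the same way as the target's
-- 2048 hash subpartitions, with the same column layout.
CREATE TABLE stg_daily
  PARTITION BY HASH (cust_id) PARTITIONS 2048
  AS SELECT * FROM tgt_big WHERE 1 = 0;

-- Direct-path load into the staging table: no contention on
-- tgt_big, minimal undo, indexes built afterwards in one pass.
INSERT /*+ APPEND */ INTO stg_daily
SELECT * FROM daily_feed;
COMMIT;

-- Create the same local indexes / primary key on stg_daily,
-- then swap it in as a single dictionary operation:
ALTER TABLE tgt_big
  EXCHANGE PARTITION p_region_a WITH TABLE stg_daily
  INCLUDING INDEXES WITHOUT VALIDATION;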

On 3/28/21 1:37 AM, Lok P wrote:
> Hi Listers, we have one process in which we load ~50 million rows
> daily (using direct-path insert, i.e. INSERT /*+ APPEND */) into a
> target table which holds ~20 billion rows (~1TB in size) and is
> list-hash composite partitioned: 2 list partitions and 2048 hash
> subpartitions. It has 4 indexes, one of which is a composite primary
> key comprising 5 columns. That data load job had been failing
> unnoticed for ~15-20 days, so a backlog of roughly a billion rows
> accumulated, and the load process is now failing with ORA-01628
> (max # extents 32765 reached for rollback segment).
>
> So we thought of running it with a bulk-collect approach
> (APPEND_VALUES hint), committing in chunks of ~1 million rows. But
> with that approach the load turned out to be significantly slower,
> inserting ~1 million rows in ~40 minutes. We could not work out the
> reason, but the wait events all showed "cell single block physical
> read" on the target table, meaning it was the INSERT itself that was
> struggling. So we have now ended up with partial data in the target,
> around ~40-50 million rows, and since the table has a primary key we
> have to delete those ~40-50 million rows from the target table and
> then reload them.
>
> I was thinking we could do the delete in bulk with a chunk size of
> 1 million, but that would again run in a single thread and be
> significantly slow. So what is the best way to delete those ~40-50
> million rows from the target table (which holds ~20 billion rows in
> total)?
>
> I was also thinking we could make the indexes unusable, perform the
> delete, and then do the data load (which would generate almost zero
> undo in the absence of indexes), but in that case it seems the DELETE
> would need the PK index to fetch the rowids, so we can't get rid of
> the PK index. So what is the best approach for the delete and the
> data load here without breaking data consistency in the target
> table?
>
> So we wanted to understand the best/fastest approach for the delete
> and the data load in this situation.
>
> Regards
> Lok
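
For what it's worth, the chunked APPEND_VALUES load described above usually looks something like the PL/SQL below. This is only a minimal sketch with hypothetical names (stage_rows as a source table with the same column layout as the target, tgt_big as the target) and a 1,000,000-row batch size, not the poster's actual code:

DECLARE
  CURSOR c IS SELECT * FROM stage_rows;
  TYPE t_rows IS TABLE OF stage_rows%ROWTYPE;
  l_rows t_rows;
BEGIN
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO l_rows LIMIT 1000000;
    EXIT WHEN l_rows.COUNT = 0;

    -- Direct-path array insert; each batch must be committed before
    -- the target table can be touched again in the same transaction.
    FORALL i IN 1 .. l_rows.COUNT
      INSERT /*+ APPEND_VALUES */ INTO tgt_big VALUES l_rows(i);

    COMMIT;
  END LOOP;
  CLOSE c;
END;
/

With four indexes on a ~20 billion row table, the "cell single block physical read" waits described above most likely come from index maintenance and uniqueness checks rather than from the direct-path table write itself.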
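And a minimal sketch of the chunked delete, again with hypothetical names; it assumes the partially loaded rows can be identified by some column (load_date and the date literal are made up), and keeps undo per transaction small by committing every batch:

DECLARE
  l_deleted PLS_INTEGER;
BEGIN
  LOOP
    DELETE FROM tgt_big
     WHERE load_date = DATE '2021-03-27'   -- identifies the partial load
       AND ROWNUM <= 1000000;
    l_deleted := SQL%ROWCOUNT;             -- capture before COMMIT
    COMMIT;
    EXIT WHEN l_deleted = 0;
  END LOOP;
END;
/

If no such column exists, driving the delete from the 5-column primary key values of the rows that were loaded is another option.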

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Mar 30 2021 - 05:22:36 CEST
