Re: Inserting billions of rows faster

From: Lok P <loknath.73_at_gmail.com>
Date: Tue, 30 Mar 2021 00:28:51 +0530
Message-ID: <CAKna9VZ8EHO1-sXw4RTsiPqT=D+TipTE+k2sPpo7+n1V-_jMug_at_mail.gmail.com>



  No, it's an active-active configuration in which replication happens through GoldenGate. So yes, to answer your question, it's not a Data Guard configured environment.

On Tue, Mar 30, 2021 at 12:12 AM Laurentiu Oprea < laurentiu.oprea06_at_gmail.com> wrote:

> Out of curiosity, is the database in a Data Guard configuration?
>
> On Mon, Mar 29, 2021, 21:30 Lothar Flatz <l.flatz_at_bluewin.ch> wrote:
>
>> Lok,
>>
>> I am afraid exchange partition will hardly help in your case.
>> Of course, from the point of view of manageability, a table of this
>> size should be range-partitioned by date.
>> In addition, I rather wonder about the usefulness of the list
>> partitioning if one partition holds ~18 billion rows out of a total
>> of ~20 billion.
>> 700 million rows, however, are not that much. When I was an RWPG
>> member I once had the opportunity to load 2 billion rows on standard
>> hardware. It took 15 minutes.
>> As Tim mentioned, it was likely not a direct-path load in your case.
>> Can you share an execution plan of the insert?
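>>
>> (e.g. via DBMS_XPLAN; the SQL_ID placeholder is for you to fill in:
>>
>>   SELECT * FROM TABLE(
>>     DBMS_XPLAN.DISPLAY_CURSOR('<sql_id>', NULL, 'TYPICAL'));
>>
>> A true direct-path insert should show LOAD AS SELECT in the plan,
>> rather than LOAD TABLE CONVENTIONAL.)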
>>
>> Thanks
>>
>> Lothar
>>
>>
>> Am 29.03.2021 um 20:11 schrieb Lok P:
>>
>> Thank you Tim. Actually we have the target table holding ~20 billion
>> rows, which is list-hash composite partitioned. It has two list
>> partitions, and each list partition has ~2048 hash subpartitions. We
>> want to insert around ~700+ million rows into that target table,
>> which is failing with ORA-01628.
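>>
>> (For concreteness, a table of the shape described would look roughly
>> like this; all names and columns here are hypothetical:
>>
>>   CREATE TABLE target_tab (
>>     id         NUMBER,
>>     list_key   VARCHAR2(10),
>>     hash_key   NUMBER,
>>     load_date  DATE
>>   )
>>   PARTITION BY LIST (list_key)
>>   SUBPARTITION BY HASH (hash_key) SUBPARTITIONS 2048
>>   ( PARTITION p_a VALUES ('A'),
>>     PARTITION p_b VALUES ('B') );
>> )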
>>
>> Now I am unable to fully understand how we can utilize the partition
>> exchange method here to load those ~700 million rows, because even
>> though all of those rows are meant for one of the list partitions
>> (which itself holds ~18 billion rows out of the total ~20 billion),
>> they are spread across ~2048 hash subpartitions. So do you mean we
>> should load the ~18 billion rows into a stage table of the same
>> structure (same list-hash composite partitioning, without indexes),
>> also load the additional ~700 million rows that are missing (which
>> should succeed, since without indexes a direct-path APPEND insert
>> generates near-zero UNDO), and then truncate the list partition and
>> do the partition exchange with the stage table?
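>>
>> (That is, something roughly along these lines; names are hypothetical
>> and this is only a sketch of the staging step:
>>
>>   CREATE TABLE stage_tab NOLOGGING
>>   PARTITION BY HASH (hash_key) PARTITIONS 2048
>>   AS SELECT * FROM target_tab WHERE 1 = 0;
>> )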
>>
>> On Mon, Mar 29, 2021 at 2:45 AM Tim Gorman <tim.evdbt_at_gmail.com> wrote:
>>
>>> Lok,
>>>
>>> >> Ora-01628 max extent 32765 reached for rollback segment
>>>
>>> If you are running out of space on your rollback segments, then you are
>>> not actually using direct path insert, APPEND or APPEND_VALUES hint
>>> notwithstanding. The only undo you should be generating is for the DDL,
>>> not the DML, and that should not max out anything.
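>>>
>>> (One way to verify, as a sketch: while the insert is running, check
>>> how much undo the transaction is generating. A true direct-path load
>>> should show comparatively little here:
>>>
>>>   SELECT t.used_ublk, t.used_urec
>>>     FROM v$transaction t
>>>     JOIN v$session s ON s.taddr = t.addr
>>>    WHERE s.sid = :sid;  -- SID of the loading session
>>> )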
>>>
>>> Long story short: you really cannot effectively perform direct-path
>>> APPEND insert directly into that "live" table or partition with
>>> indexes in place. In your case, you need to perform the direct-path
>>> insert into a hash-partitioned table of the same logical "shape" as
>>> the list partition you are targeting, then build indexes to match
>>> the main table when you are done loading, and then finally use ALTER
>>> TABLE ... EXCHANGE PARTITION to swap it all with the "live"
>>> partition(s) in the main table. If it helps, HERE
>>> <http://evdbt.com/download/presentation-scaling-to-infinity-partitioning-data-warehouses-on-oracle-database/>
>>> is a presentation on this technique and HERE
>>> <http://evdbt.com/download/paper-scaling-to-infinity-partitioning-data-warehouses-on-oracle-database-2/>
>>> is a corresponding white paper on the technique. Also, HERE
>>> <http://evdbt.com/download/exchpart-sql/> is a PL/SQL package called
>>> EXCHPART which contains procedures to automate the moves mentioned
>>> in the white paper and presentation. Please don't consider EXCHPART
>>> as anything more than a template; it is not intended as working
>>> code, particularly when it must be adapted for your local coding
>>> culture and standards. Hopefully it is a good start?
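>>>
>>> (The final swap is a single DDL statement; a minimal sketch with
>>> hypothetical names, where stage_tab is hash-partitioned to match the
>>> subpartitioning of partition p_list1:
>>>
>>>   ALTER TABLE target_tab
>>>     EXCHANGE PARTITION p_list1
>>>     WITH TABLE stage_tab
>>>     INCLUDING INDEXES
>>>     WITHOUT VALIDATION;
>>> )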
>>>
>>> Oh also, there is no such thing as a NOLOGGING hint; NOLOGGING is an
>>> attribute of the table or partition which is effective only with
>>> direct-path insert operations, never with conventional-path INSERT,
>>> UPDATE, or DELETE operations.
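>>>
>>> (That is, it is set as an attribute; a sketch with hypothetical
>>> names:
>>>
>>>   ALTER TABLE stage_tab NOLOGGING;
>>>   INSERT /*+ APPEND */ INTO stage_tab
>>>   SELECT * FROM source_tab;  -- direct-path, minimal redo
>>> )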
>>>
>>> Hope this helps...
>>>
>>> Thanks!
>>>
>>> -Tim
>>>
>>>
>>> On 3/27/2021 10:37 PM, Lok P wrote:
>>>
>>> Hi Listers, we have one process in which we load ~50 million rows
>>> daily (using direct-path insert, INSERT /*+ APPEND */) into a target
>>> table which holds ~20 billion rows (~1TB in size) and is list-hash
>>> composite partitioned: ~2 list partitions and ~2048 hash
>>> subpartitions. It has 4 indexes, one of which is a composite primary
>>> key comprising 5 columns. That data load job had been failing for
>>> ~15-20 days without notice, and it thus accumulated close to a
>>> billion rows, and the load process is now failing with ORA-01628
>>> (max # extents 32765 reached for rollback segment).
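>>>
>>> (The daily load is of this general form; names are hypothetical:
>>>
>>>   INSERT /*+ APPEND */ INTO target_tab
>>>   SELECT * FROM daily_feed;
>>>   COMMIT;
>>> )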
>>>
>>> So we thought of running it with a bulk-collect method
>>> (APPEND_VALUES hint) and committing in chunks of ~1 million. But
>>> during that process we ended up seeing the data load run
>>> significantly slower; it was inserting ~1 million rows in ~40
>>> minutes. We were not able to understand the reason, but the wait
>>> events were all showing "cell single block physical read" and the
>>> object was the target load object, meaning it was the INSERT itself
>>> that was struggling. So we now end up having partial data loaded to
>>> the target, i.e. around ~40-50 million rows. And as it has a primary
>>> key, we have to delete those ~40-50 million rows from the target
>>> table and then reload them.
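>>>
>>> (For reference, the chunked load we attempted looks roughly like
>>> this; table names are hypothetical and this is only a sketch:
>>>
>>>   DECLARE
>>>     CURSOR c IS SELECT * FROM daily_feed;
>>>     TYPE t_rows IS TABLE OF daily_feed%ROWTYPE;
>>>     l_rows t_rows;
>>>   BEGIN
>>>     OPEN c;
>>>     LOOP
>>>       FETCH c BULK COLLECT INTO l_rows LIMIT 1000000;
>>>       EXIT WHEN l_rows.COUNT = 0;
>>>       FORALL i IN 1 .. l_rows.COUNT
>>>         INSERT /*+ APPEND_VALUES */ INTO target_tab
>>>         VALUES l_rows(i);
>>>       COMMIT;  -- APPEND_VALUES requires a commit before further DML
>>>     END LOOP;
>>>     CLOSE c;
>>>   END;
>>>   /
>>> )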
>>>
>>> I was thinking we could do the delete in a bulk method with a chunk
>>> size of ~1 million, but again that will run in a single thread and
>>> will be significantly slow. So what is the best way to have those
>>> ~40-50 million rows deleted from the target table (which holds a
>>> total of ~20 billion rows)?
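>>>
>>> (i.e. something along these lines; the predicate identifying the
>>> partially loaded rows is hypothetical:
>>>
>>>   BEGIN
>>>     LOOP
>>>       DELETE FROM target_tab
>>>        WHERE load_date = DATE '2021-03-27'  -- hypothetical marker
>>>          AND ROWNUM <= 1000000;
>>>       EXIT WHEN SQL%ROWCOUNT = 0;
>>>       COMMIT;
>>>     END LOOP;
>>>     COMMIT;
>>>   END;
>>>   /
>>> )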
>>>
>>> And then I was also thinking, if we could make the indexes unusable
>>> and perform the delete and then the data load (which would happen
>>> with almost zero UNDO in the absence of indexes), but in that case
>>> it seems that the DELETE will need the PK index to fetch the rowids,
>>> so we can't get rid of the PK index. So what is the best approach
>>> for the delete and data load here without breaking data consistency
>>> in the target table?
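>>>
>>> (i.e. keeping the PK but disabling the secondary indexes, roughly;
>>> index and partition names are hypothetical:
>>>
>>>   ALTER INDEX ix_target_sec1 UNUSABLE;
>>>   -- ... perform the delete and the reload ...
>>>   ALTER TABLE target_tab MODIFY PARTITION p_list1
>>>     REBUILD UNUSABLE LOCAL INDEXES;
>>> )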
>>>
>>> So I wanted to understand: what is the best/fastest approach for the
>>> delete and data load in this situation?
>>>
>>> Regards
>>> Lok
>>>
>>>
>>>
>>

--
http://www.freelists.org/webpage/oracle-l