Re: Inserting with billion of rows faster

From: Lok P <loknath.73_at_gmail.com>
Date: Sat, 3 Apr 2021 23:52:43 +0530
Message-ID: <CAKna9VajV_a53KAgOvLoV5GPuDo-uP01hPC03anhuKa9_Mvh+g_at_mail.gmail.com>



Thank you Jonathan.

Yes, the initial requirement was to fix data issues by moving ~800 million rows to the target table in one shot. We were hitting ORA-01628 (max # extents 32765 reached for rollback segment), but as you suggested, that went fine after shrinking the specific rollback segments that had a lot of small extents and were getting close to the ~32K limit.

But this question is out of curiosity. Through this process we had almost decided to build a stage table (without indexes) that would hold all the data from the target table partition (i.e. ~18 billion rows, ~1TB+), then load the additional/missing ~800 million rows into it, and finally do a partition exchange. For that, a few teammates were suggesting we populate the stage table in chunks, using something like "Insert /*+append*/ into .. select .. from main table where create_date between sysdate-365 and sysdate-335" and looping until we reach sysdate. Their point was that moving such a large volume (~1TB+) in one shot, whether with INSERT APPEND or CTAS, might break things.

I was a bit confused, because in either case the UNDO will be almost zero, so doing it in one shot or in chunks should not make any difference regardless of the data size. Correct me if I'm wrong.
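
Just for illustration (the table and column names here are placeholders, not our real objects), the chunked loop they were proposing is roughly:

DECLARE
  l_start DATE := TRUNC(SYSDATE) - 365;   -- arbitrary illustrative starting point
BEGIN
  WHILE l_start < TRUNC(SYSDATE) LOOP
    -- direct-path load of one ~30-day slice into the index-free stage table
    INSERT /*+ append */ INTO stage_tab
    SELECT *
      FROM main_tab
     WHERE create_date >= l_start
       AND create_date <  l_start + 30;
    COMMIT;   -- a commit is needed before the next direct-path insert in the same session
    l_start := l_start + 30;
  END LOOP;
END;
/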

Regards
Lok

On Sat, Apr 3, 2021 at 11:20 PM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

>
> I thought your requirement was to insert hundreds of millions of rows per
> day into an existing table. Now you're talking about creating 18 billion
> rows as a CTAS.
> Was the original request because you had decided to create the table bit
> by bit at a few hundred million rows per day? Did you start off by telling
> us you wanted to do this - did I miss that?
>
>
> If your options are:
> a) create table as select, create indexes
> b) create empty table, insert append, create indexes
> then there shouldn't be much difference between the two BUT
>
> a) You're on 11.2.0.4: it's probably past the point where CTAS and insert
> as append were inconsistent, but there were versions where their arithmetic
> was slightly different and one could get an extra row per block: so I would
> test that with (say) 10M rows (a quick rows-per-block check is sketched
> below, after point c).
>
> b) You're using Exadata, and there have been various tweaks and patches to
> handle chained rows (especially intra block chained rows due to >255
> columns). If this might apply to you I'd test that carefully to see what
> happens, as there was a trade-off between saving space (allowing
> multi-piece rows to chain across blocks) and reducing query workload
> (forcing all the pieces of a row to be in the same block and potentially
> wasting space).
>
> c) If you do "insert append" and the target is a partitioned table, you may
> be able to use dbms_parallel_execute to define a set of tasks such that
> each task selects only the data that will go into a specific partition.
> This will then allow you to run as many concurrent tasks as the machine
> will bear: (a) they won't be contending on space and (b) you won't need
> two sets of parallel slaves distributing rows by partition key.
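>
> A minimal sketch of that dbms_parallel_execute approach (only a sketch: the
> table names, the partition_boundaries driver table, and the assumption that
> the partition key can be expressed as a number, e.g. a YYYYMMDD key, are all
> placeholders, not tested code):
>
> DECLARE
>   l_task VARCHAR2(30) := 'stage_load_by_partition';
> BEGIN
>   DBMS_PARALLEL_EXECUTE.create_task(l_task);
>
>   -- one chunk per target partition: the query must return the numeric
>   -- lower/upper boundary of each partition as start_id / end_id
>   DBMS_PARALLEL_EXECUTE.create_chunks_by_sql(
>     task_name => l_task,
>     sql_stmt  => 'select lo_key as start_id, hi_key as end_id
>                     from partition_boundaries',
>     by_rowid  => FALSE);
>
>   -- each task loads only the rows belonging to its own partition; note
>   -- that concurrent direct-path inserts into the same table will serialize
>   -- on the table lock unless the statement is built with partition-extended
>   -- syntax, so treat the append hint here as optional
>   DBMS_PARALLEL_EXECUTE.run_task(
>     task_name      => l_task,
>     sql_stmt       => 'insert /*+ append */ into stage_tab
>                        select * from source_tab
>                         where part_key >= :start_id
>                           and part_key <= :end_id',
>     language_flag  => DBMS_SQL.native,
>     parallel_level => 8);
>
>   DBMS_PARALLEL_EXECUTE.drop_task(l_task);   -- in real use, check task status first
> END;
> /
>
> And for the rows-per-block comparison in point (a), a quick illustrative
> check after loading the same 10M-row sample both ways would be something
> like (test_tab is a placeholder name):
>
> select avg(cnt) as avg_rows_per_block
>   from (
>         select count(*) as cnt
>           from test_tab
>          group by dbms_rowid.rowid_relative_fno(rowid),
>                   dbms_rowid.rowid_block_number(rowid)
>        );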
>
> Regards
> Jonathan Lewis
>
>
>
>
> On Sat, 3 Apr 2021 at 12:38, Lok P <loknath.73_at_gmail.com> wrote:
>
>> Basically my thought was: in a normal data load scenario where we have to
>> move TBs of data (say 1TB+), and considering that post-load indexing will
>> take the same amount of time in either case, is it advisable to load the
>> data in chunks with INSERT APPEND, or to do a one-shot data load using
>> CTAS? In both cases there will be almost zero UNDO, so I was wondering
>> whether there is any limit for a one-shot data load, such that beyond a
>> certain data size we should not go for the CTAS approach but have to do it
>> in chunks only (maybe using the INSERT .. APPEND method)?
>>
>> Regards
>> Lok
>>
>> On Thu, Apr 1, 2021 at 6:49 PM Lok P <loknath.73_at_gmail.com> wrote:
>>
>>> Thanks much, Jonathan. It worked after shrinking the top segments.
>>> Thanks again.
>>>
>>> Out of curiosity, since we had almost decided to attempt this one-time
>>> load: considering ~18 billion rows (~1TB+ in size) to be moved in one
>>> shot, is it sensible to go for the "insert append into stage table ...
>>> select .. from source table" approach, or for "CTAS stage table .. select
>>> from source table"? I believe in both cases UNDO will be almost zero, as
>>> there is no index on the stage table, so I'm wondering what other DB
>>> resources it will consume?
>>>
>>> On Thu, Apr 1, 2021 at 6:19 PM Lok P <loknath.73_at_gmail.com> wrote:
>>>
>>>> Thank you so much Jonathan. As you rightly mentioned, shrinking the
>>>> specific rollback segment seems to be the best workaround in this
>>>> situation. I'm going to try it now. Thanks.
>>>>
>>>> On Thu, Apr 1, 2021 at 5:07 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> Follow-up to my previous comments about dropping or affecting the size
>>>>> of undo segments. Read MOS note 1580182.1.
>>>>> The basic comment is that you can hit this problem if you have very
>>>>> long running transactions and a very large undo retention - and it probably
>>>>> needs a reasonable level of concurrency as well, but that's not stressed.
>>>>>
>>>>> It is possible to set a debug parameter that allows you to shrink undo
>>>>> segments by name - details in the note.
>>>>> It seems to work.
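>>>>>
>>>>> In outline it looks like this (only a sketch: the segment name below is
>>>>> a placeholder - check dba_rollback_segs for the real names, and verify
>>>>> the parameter value against the note before using it):
>>>>>
>>>>> alter system set "_smu_debug_mode" = 4;     -- allows manual shrink of system-managed undo segments
>>>>> alter rollback segment "_SYSSMU9$" shrink;  -- placeholder name: use the real segment name
>>>>> alter system set "_smu_debug_mode" = 0;     -- revert the debug setting afterwards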
>>>>>
>>>>>
>>>>> Regards
>>>>> Jonathan Lewis
>>>>>
>>>>>
>>>>>
>>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Sat Apr 03 2021 - 20:22:43 CEST