Re: Inserting with billion of rows faster

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Sat, 3 Apr 2021 18:50:03 +0100
Message-ID: <CAGtsp8n+ngyZkE1ikpXbTbQRSfONnxnp0YKGTuqca=bfeW0n_w_at_mail.gmail.com>



I thought your requirement was to insert hundreds of millions of rows per day into an existing table. Now you're talking about creating 18 billion rows as a CTAS.
Was the original request because you had decided to create the table bit by bit at a few hundred million rows per day? Did you start off by telling us you wanted to do this - did I miss that?

If your options are:
a) create table as select, create indexes
b) create empty table, insert append, create indexes
then there shouldn't be much difference between the two, BUT:

  1. You're on 11.2.0.4: it's probably past the point where CTAS and insert as append were inconsistent, but there were versions where their arithmetic was slightly different and one could get an extra row per block: so I would test that with (say) 10M rows.
  2. You're using Exadata, and there have been various tweaks and patches to handle chained rows (especially intra block chained rows due to >255 columns). If this might apply to you I'd test that carefully to see what happens, as there was a trade-off between saving space (allowing multi-piece rows to chain across blocks) and reducing query workload (forcing all the pieces of a row to be in the same block and potentially wasting space).
  3. If you do "insert append" and the target is a partitioned table, you may be able to use dbms_parallel_execute to define a set of tasks such that each task selects only the data that will go into a specific partition. This will then allow you to run as many concurrent tasks as the machine will bear: (a) they won't be contending on space and (b) you won't need two sets of parallel slaves distributing rows by partition key.
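
A minimal sketch of that third approach (the task name, table names, partition-key column, and the driving query are all assumptions for illustration, not from this thread):

```sql
-- Hedged sketch only: assumes a target table T_STAGE range-partitioned on
-- TXN_ID and a source table T_SOURCE. All names here are illustrative.
begin
  dbms_parallel_execute.create_task(task_name => 'LOAD_T_STAGE');

  -- One chunk per target partition: each row of the driving query supplies
  -- the :start_id / :end_id pair covering one partition's key range.
  dbms_parallel_execute.create_chunks_by_sql(
    task_name => 'LOAD_T_STAGE',
    sql_stmt  => 'select lo_txn_id, hi_txn_id from partition_ranges',
    by_rowid  => false);

  -- Each chunk runs a direct-path insert covering exactly one partition,
  -- so concurrent chunks are not contending for space in the same segment.
  dbms_parallel_execute.run_task(
    task_name      => 'LOAD_T_STAGE',
    sql_stmt       => 'insert /*+ append */ into t_stage
                       select * from t_source
                       where  txn_id between :start_id and :end_id',
    language_flag  => dbms_sql.native,
    parallel_level => 8);
end;
/
```

The point of the per-partition chunking is that run_task simply fires as many concurrent jobs as parallel_level allows, with no need for a PQ distribution step by partition key.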

Regards
Jonathan Lewis

On Sat, 3 Apr 2021 at 12:38, Lok P <loknath.73_at_gmail.com> wrote:

> Basically my thought was, In a normal data load scenario where we have to
> move TB's of data(say 1TB+), considering post data load indexing will take
> the same amount of time in any case. Whether it's suggested to go for
> INSERT APPEND method of data load in chunks OR doing one shot data load
> using CTAS? In both cases there will be almost zero UNDO, so I was
> wondering if any limit is there for one shot dataload, so that we should
> not go for CTAS approach after a certain size of data but have to do that
> in chunks only(may be using INSERT.. APPEND method only)?
>
> Regards
> Lok
>
> On Thu, Apr 1, 2021 at 6:49 PM Lok P <loknath.73_at_gmail.com> wrote:
>
>> Thanks much, Jonathan. It worked after shrinking the top segments. Thanks
>> again.
>>
>> Out of curiosity, and as we had almost decided to attempt this
>> one-time load, considering ~18 billion rows (worth ~1TB+ in size) to be
>> moved at one shot , whether it's sensible to go for "Insert append into
>> stage table ... select.. from source table" approach OR "CTAS stage
>> table.. select from source table"? I believe in both cases UNDO will be
>> almost zero, as there is no index on them, so I'm wondering what other DB
>> resources it will consume?
>>
>> On Thu, Apr 1, 2021 at 6:19 PM Lok P <loknath.73_at_gmail.com> wrote:
>>
>>> Thank you so much Jonathan. As you rightly mentioned, the shrink of a
>>> specific rollback segment seems to be the best workaround in
>>> this situation. I'm going to try it now. Thanks.
>>>
>>> On Thu, Apr 1, 2021 at 5:07 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> Follow-up to my previous comments about dropping or affecting the size
>>>> of undo segments. Read MOS note 1580182.1.
>>>> The basic comment is that you can hit this problem if you have very
>>>> long running transactions and a very large undo retention - and it probably
>>>> needs a reasonable level of concurrency as well, but that's not stressed.
>>>>
>>>> It is possible to set a debug parameter that allows you to shrink undo
>>>> segments by name - details in the note.
>>>> It seems to work.
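>>>>
>>>> The workaround is roughly of this shape - but the exact underscore
>>>> parameter, its value, and the segment name must come from the note
>>>> itself and from your own data dictionary; what follows is only a
>>>> placeholder sketch:
>>>>
>>>> ```sql
>>>> -- Sketch only: parameter details are in MOS note 1580182.1; the
>>>> -- segment name is whatever v$rollname reports for the big segment.
>>>> alter session set "_smu_debug_mode" = 4;
>>>> alter rollback segment "_SYSSMU123_456789$" shrink;
>>>> ```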
>>>>
>>>>
>>>> Regards
>>>> Jonathan Lewis
>>>>
>>>>
>>>>
>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Sat Apr 03 2021 - 19:50:03 CEST