RE: Optimisation of mass INSERTs

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Sun, 8 Nov 2020 08:32:38 -0500
Message-ID: <34e101d6b5d3$a2acd150$e80673f0$_at_rsiz.com>


I'm curious whether decoupling the transaction data from the insert operation is possible for you, by designating a temporary file at the database server from which to insert batches.

Apart from that, if your insert destination has indexes and constraints, there *MAY* be a batch size into a just-the-data table (possibly a GTT, if you don't need multi-session persistence) such that doing your remote inserts there, followed by batch insert-append sweeps into the destination, provides the optimal solution.

That may raise complications in handling errors compared with individual row insertions, but you're already doing batches, so I suspect you have that handled.
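
As a rough illustration only (the staging table name, its creation, and the sweep cadence are all invented for the sketch), the shape of that approach from the .NET client side might look something like this:

   // Sketch only: assumes a staging table created once in advance, e.g. a GTT
   // with the same columns as the target and no indexes or constraints:
   //
   //   CREATE GLOBAL TEMPORARY TABLE TESTTAB_STAGE
   //     ON COMMIT PRESERVE ROWS
   //     AS SELECT * FROM TESTTAB WHERE 1 = 0;
   //
   // The client keeps doing its cheap batched inserts, but into TESTTAB_STAGE;
   // a periodic sweep then moves the rows across in one set-based statement.
   using Oracle.ManagedDataAccess.Client;

   public static class StagingSweep
   {
       public static void SweepIntoTarget(OracleConnection conn)
       {
           using (OracleTransaction tx = conn.BeginTransaction())
           using (var sweep = new OracleCommand(
               "INSERT /*+ APPEND */ INTO TESTTAB SELECT * FROM TESTTAB_STAGE", conn))
           using (var clear = new OracleCommand("DELETE FROM TESTTAB_STAGE", conn))
           {
               sweep.ExecuteNonQuery();   // one direct-path statement per sweep
               clear.ExecuteNonQuery();   // empty the staging table for the next interval
               tx.Commit();
           }
       }
   }

Whether the direct-path hint, the sweep interval, or a GTT at all makes sense here depends on the indexes, constraints, and recoverability requirements on the real table.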

Good luck. Dribbling data in has always been difficult across the network.

For a *long* time the old sqlplus "copy" command handled all the optimization with one or two parameters, but it did NOT support all the more modern data types last time I looked. Of course that raises the question of where the table you are copying would be, which in turn raises the question of a local, minimal-feature Oracle database on the client. If the source of the insert data is one server, or a manageable number of servers acting in the client role for these inserts, that would be less of a problem than, say, 10 million web clients you've possibly never seen before.

I do suspect a topology change to the way this is being handled has a *CHANCE* of order-of-magnitude improvement, versus nibbling at the edges for marginal gains. But I also understand that such a change may not be a reasonable option.

In which case, I think JL probably already covered all the best leads.

mwf

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Anton Shepelev
Sent: Friday, November 06, 2020 4:45 PM
To: oracle-l_at_freelists.org
Subject: Optimisation of mass INSERTs

Hello, all.

I am working on a program that should perform transactional uploads of massive amounts of data into Oracle tables using the managed ODP.NET driver. After initial experiments, I have settled on transferring the data in a series of "batches", each of which is a parametrised INSERT ALL statement with a customisable number of rows, of the form:

   INSERT ALL
      INTO TESTTAB( TEXTDATA1, TEXTDATA2, DECIMALDATA, DOUBLE )
         VALUES( :val_0_0, :val_0_1, :val_0_2, :val_0_3 )
      INTO TESTTAB( TEXTDATA1, TEXTDATA2, DECIMALDATA, DOUBLE )
         VALUES( :val_1_0, :val_1_1, :val_1_2, :val_1_3 )
      INTO TESTTAB( TEXTDATA1, TEXTDATA2, DECIMALDATA, DOUBLE )
         VALUES( :val_2_0, :val_2_1, :val_2_2, :val_2_3 )
      INTO TESTTAB( TEXTDATA1, TEXTDATA2, DECIMALDATA, DOUBLE )
         VALUES( :val_3_0, :val_3_1, :val_3_2, :val_3_3 )
   SELECT * FROM DUAL
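
A simplified sketch of how such a batch can be built and bound through the managed driver (apart from TESTTAB and its columns, all names and the row type are purely illustrative):

   using System.Collections.Generic;
   using System.Text;
   using Oracle.ManagedDataAccess.Client;

   // Illustrative row type and helper; the real data structures differ.
   public sealed record Row(string Text1, string Text2, decimal Dec, double Dbl);

   public static class BatchSender
   {
       public static void InsertBatch(OracleConnection conn, IReadOnlyList<Row> rows)
       {
           var sql = new StringBuilder("INSERT ALL\n");
           using var cmd = new OracleCommand { Connection = conn, BindByName = true };

           for (int r = 0; r < rows.Count; r++)
           {
               sql.Append("  INTO TESTTAB( TEXTDATA1, TEXTDATA2, DECIMALDATA, DOUBLE )\n");
               sql.Append($"    VALUES( :val_{r}_0, :val_{r}_1, :val_{r}_2, :val_{r}_3 )\n");

               cmd.Parameters.Add($"val_{r}_0", OracleDbType.Varchar2).Value = rows[r].Text1;
               cmd.Parameters.Add($"val_{r}_1", OracleDbType.Varchar2).Value = rows[r].Text2;
               cmd.Parameters.Add($"val_{r}_2", OracleDbType.Decimal ).Value = rows[r].Dec;
               cmd.Parameters.Add($"val_{r}_3", OracleDbType.Double  ).Value = rows[r].Dbl;
           }
           sql.Append("SELECT * FROM DUAL");

           cmd.CommandText = sql.ToString();
           cmd.ExecuteNonQuery();   // the whole batch goes to the server in one call
       }
   }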

My tests show that it is at least 1.5 times more efficient than a batch of equivalent INSERT statements inside a BEGIN..END block. The performance of this method at the client's site exhibits two unusual tendencies which I cannot explain. I suppose they are somehow connected with the length of the network route between client and server, which takes 36 ms to ping, but I am still at a loss as to the exact explanation.

The first anomaly is the effect of batch size. For each target table and dataset, there appears to be a single optimal number of rows per batch, such that both smaller and larger batches are considerably slower. Performance measurements for two tables with various batch sizes are shown below, with the respective optima in brackets (view in a fixed-width font):

           Table A                      Table B

   +------------------+         +------------------+
   | Rows: 100 000    |         | Rows: 6000       |
   | Cols: 20         |         | Cols: 60         |
   +------------------+         +------------------+
   | batch    time, s |         | batch    time, s |
   |    16        252 |         |     8         54 |
   |    64         65 |         |    16         36 |
   |   256         30 |         |  [32]         18 |
   | [384]         26 |         |    64         19 |
   |   512         31 |         |   128         43 |
   |  1024         90 |         |   256        318 |
   +------------------+         +------------------+

I can understand why smaller batches are inefficient: it must be due to the overhead of frequent call-and-response round trips between client and server. (Indeed, for Table A at 36 ms per round trip, 100 000 rows in batches of 16 cost 100 000 / 16 * 0.036 s = 225 s in latency alone, close to the measured 252 s.) But why does performance start to degrade again as the batch size grows past the optimum?

The second anomaly occurs when the network becomes unstable. During these intervals, which occur once or twice a week, `ping' shows that a considerable portion of packets are lost and the percentage of lost packets grows with packet size.
At these times, the general pattern with a single optimum remains, but the optimal batch size is much lower than when the network is healthy. Again, I have no idea what may be causing this.

I conjecture that it is a case of leaky abstractions[1]: since the ODP.NET driver cannot entirely abstract the ADO.NET interface from the gritty details of network data transfer, batch size affects both the size and the number of network packets required to send one batch, which in turn has a tremendous effect on throughput. But what can I do about it? If my conjecture is right, I have to optimise the serialised size of the batch rather than the number of rows in it, and I cannot reliably do that because the serialisation algorithm is not officially documented.

I have considered prepared statements, which should save about 60% of the traffic, but Oracle have chosen not to implement the .Prepare() method provided by ADO.NET. The official documentation describes the implementation as a no-op:

   ODP.NET: OracleCommand Public Methods
   https://docs.oracle.com/database/121/ODPNT/OracleCommandClass.htm#g1099930
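
A possibly relevant alternative in the managed driver is array binding: a single-row INSERT whose parameter values are arrays, executed once per batch with ArrayBindCount set, so that the SQL text does not grow with the number of rows. A minimal sketch, with column types assumed from the example above:

   using Oracle.ManagedDataAccess.Client;

   // Sketch of ODP.NET array binding: one short INSERT text, array-valued
   // parameters, a single round trip per batch.
   public static class ArrayBoundInsert
   {
       public static void InsertBatch(OracleConnection conn,
           string[] text1, string[] text2, decimal[] dec, double[] dbl)
       {
           using (var cmd = new OracleCommand(
               "INSERT INTO TESTTAB (TEXTDATA1, TEXTDATA2, DECIMALDATA, DOUBLE) " +
               "VALUES (:1, :2, :3, :4)", conn))
           {
               cmd.ArrayBindCount = text1.Length;   // number of rows in this batch
               cmd.Parameters.Add(":1", OracleDbType.Varchar2).Value = text1;
               cmd.Parameters.Add(":2", OracleDbType.Varchar2).Value = text2;
               cmd.Parameters.Add(":3", OracleDbType.Decimal ).Value = dec;
               cmd.Parameters.Add(":4", OracleDbType.Double  ).Value = dbl;
               cmd.ExecuteNonQuery();
           }
       }
   }

Whether this changes the serialised size of a row on the wire is a separate question; it merely avoids resending a large SQL text without requiring a server-side PREPARE.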

It is noteworthy that JDBC, on the contrary, does support prepared statements:

   JDBC: Using Prepared Statements
   https://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html

In spite of what some experts say about their obsolescence due to improved server-side caching of compiled statements, prepared statements remain superior because they save network traffic by letting the client send the statement id instead of the entire SQL text. I have not yet tried preparing them manually as described here:

   EXECUTE and PREPARE:
   https://docs.oracle.com/cd/B10501_01/appdev.920/a97269/pc_13dyn.htm#865

but in any case it can only improve performance by a constant factor and cannot remove the need to optimise the batch size for each table, data structure, and network condition.

I have also tried a dynamic optimiser that measured performance in real time and adjusted the batch size accordingly, but it was not very effective because I do not know how to predict the serialised batch size, in terms of the bytes actually sent over the network.
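
In outline, such an optimiser can be as simple as a hill-climber on measured rows per second; a skeletal sketch, with every constant invented and no smoothing against network jitter:

   using System;
   using System.Diagnostics;

   // Skeletal adaptive batch sizing: time each batch and keep nudging the size
   // in whichever direction last improved the measured throughput.
   public sealed class BatchSizeTuner
   {
       int _size = 64, _step = 32;
       double _lastRate;

       public int CurrentSize => _size;

       public void Record(int rowsSent, TimeSpan elapsed)
       {
           double rate = rowsSent / elapsed.TotalSeconds;
           if (rate < _lastRate)
               _step = -_step;                       // got worse: reverse direction
           _lastRate = rate;
           _size = Math.Clamp(_size + _step, 8, 1024);
       }
   }

   // Usage around each batch:
   //   var sw = Stopwatch.StartNew();
   //   SendBatch(tuner.CurrentSize);                // however the batch is sent
   //   tuner.Record(tuner.CurrentSize, sw.Elapsed);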

While reading performance-related documentation, I came upon the TCP.NODELAY setting:

   TCP.NODELAY
   https://docs.oracle.com/cd/E18283_01/network.112/e10835/sqlnet.htm#CIHCAJGA

Is that article correct that it is `yes' by default? That would seem strange, because all the commentators on this setting discuss enabling it as something very non-standard; e.g. Don Burleson writes:

,----[http://www.dba-oracle.com/art_builder_tns.htm:]

| Oracle Net, by default, waits until the buffer is filled before 
| transmitting data. Therefore, requests aren't always sent immediately 
| to their destinations. This is most common when large amounts of data 
| are streamed from one end to another, and Oracle Net does not transmit 
| the packet until the buffer is full.
`---------------------------------

Who is correct, and how can I test the actual default value? Do you think TCP.NODELAY could optimise massive data transfers? I am asking because I cannot easily test at the client's site owing to bureaucracy, while local testing is nearly useless on our fast local network.
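
(For reference, the document linked above places the parameter in sqlnet.ora on the client and/or the database server, so an experiment would only need something like:)

   # sqlnet.ora on the client and/or the database server
   # (set to NO for the comparison run)
   TCP.NODELAY = YES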

Another article, about TCP/IP optimisation in general, mentions the TCP_QUICKACK setting in connexion with TCP_NODELAY:

   Best Practices for TCP Optimization in 2019
   https://www.extrahop.com/company/blog/2016/tcp-nodelay-nagle-quickack-best-practices/

Can it, in your opinion, affect massive data transfers, and if it can, how do I set it for Oracle connections?

For your reference, I have already asked this question on DBA StackExchange, but it has drawn no replies and does not seem likely to:

   Anomalous performance of mass INSERT in Oracle
   https://dba.stackexchange.com/questions/278185/anomalous-performance-of-mass-insert-in-oracle

Can you please help me diagnose and explain these anomalies, and possibly solve the problem of unstable performance of mass insertions in Oracle? What monitoring tools can I use to determine the bottleneck?


  1. https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/
--
()  ascii ribbon campaign -- against html e-mail
/\  http://preview.tinyurl.com/qcy6mjc