Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.misc -> Re: Insert statement taking a very long time to complete
On Mar 20, 11:16 am, "gor..._at_panix.com" <gor..._at_panix.com> wrote:
> [I recently posted this on .server; I apologize for the double-
> posting.]
>
> I'm having some trouble with an SQL insert statement taking waaay too
> long to run. It's been suggested that maybe I'm using inefficient
> SQL, and that there might be a faster way to achieve the results.
>
> I have two tables with identical structure, and I want to copy all
> records that exist in table A but not in table B, into table B.
>
> Assuming that the structure is primary key "pk" and data fields "f1",
> "f2" and "f3", here is the SQL statement I'm using:
>
> insert into tableb(pk, f1, f2, f3)
> (select * from tablea where tablea.pk not in (select pk from
> tableb))
>
> This statement has worked well for small amounts of data, but when I
> tried on a table with approximately 200k records, it took forever to
> finish.
>
> A coworker with some SQL experience thinks that the "where" clause is
> being executed for *every* insertion. If so, this could certainly
> account for much of the slowdown.
>
> Anyone have any suggestions?
>
> --
> John Gordon
Mark offers several good ideas. Keep in mind that Oracle can silently rewrite your SQL statements, which will affect execution time. A quick example with 10053 and translated 10046 STAT line output (for some reason, the elapsed times reported by the two traces differ - 10046 trace is correct).
CREATE TABLE TA (
PK NUMBER(12), F1 NUMBER(12), F2 NUMBER(12), F3 NUMBER(12),
PK NUMBER(12), F1 NUMBER(12), F2 NUMBER(12), F3 NUMBER(12),
TRUNC(DBMS_RANDOM.VALUE(1,100))+1, TRUNC(DBMS_RANDOM.VALUE(1,100))+1, TRUNC(DBMS_RANDOM.VALUE(1,100))+1
500000 rows created.
INSERT INTO
TB
SELECT
ROWNUM*5,
TRUNC(DBMS_RANDOM.VALUE(1,100))+1, TRUNC(DBMS_RANDOM.VALUE(1,100))+1, TRUNC(DBMS_RANDOM.VALUE(1,100))+1
500000 rows created.
COMMIT;
Now that we have the tables with data, gather statistics:
EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TABLE_OWNER_HERE',TABNAME=>'TA',CASCADE=>TRUE);
EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TABLE_OWNER_HERE',TABNAME=>'TB',CASCADE=>TRUE);
Set up for the first test:
ALTER SESSION SET TRACEFILE_IDENTIFIER='TESTTD1';
ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8';
ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1';
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='10.2.0.3';
INSERT INTO
TB
SELECT
TA.PK, TA.F1, TA.F2, TA.F3
10053 plan:
+-----------------------------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | -------------------------------------------- +-----------------------------------+ | 0 | INSERT STATEMENT | | | | 2774 | | | 1 | HASH JOIN RIGHT ANTI | | 1 | 19 | 2774 |00:00:43 |
+-----------------------------------+
10046 STAT lines
CPU Time 4.383628,Elapsed Time 4.450739,Rows Affected 400000,Blks from
Buff 5881+CU(23455),Blks from Disk 0,Goal=ALL_ROWS
(Rows 400000) HASH JOIN RIGHT ANTI (cr=3793 pr=0 pw=0 time=3345121
us)
(Rows 500000) INDEX FAST FULL SCAN SYS_C007848 (cr=2468 pr=0 pw=0
time=1500148 us)
(Rows 500000) TABLE ACCESS FULL TA (cr=1325 pr=0 pw=0
time=1512393 us)
ROLLBACK;
Set up for test #2:
ALTER SESSION SET TRACEFILE_IDENTIFIER='TESTTD2';
INSERT INTO
TB
SELECT
*
FROM
TA
WHERE
PK NOT IN (
SELECT
PK
FROM
TB);
400000 rows created.
+-----------------------------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | -------------------------------------------- +-----------------------------------+ | 0 | INSERT STATEMENT | | | | 2774 | | | 1 | HASH JOIN RIGHT ANTI | | 1 | 19 | 2774 |00:00:43 |
+-----------------------------------+
CPU Time 4.399228,Elapsed Time 4.427972,Rows Affected 400000,Blks from
Buff 5904+CU(23503),Blks from Disk 0,Goal=ALL_ROWS
(Rows 400000) HASH JOIN RIGHT ANTI (cr=3793 pr=0 pw=0 time=3745658
us)
(Rows 500000) INDEX FAST FULL SCAN SYS_C007848 (cr=2468 pr=0 pw=0
time=1500112 us)
(Rows 500000) TABLE ACCESS FULL TA (cr=1325 pr=0 pw=0
time=1511650 us)
ROLLBACK; Note that the execution times and plans are the same - Oracle rewrote the SQL statement.
Set up for test #3, pretend that we are running Oracle 8.1.7:
ALTER SESSION SET TRACEFILE_IDENTIFIER='TESTTD3';
ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='8.1.7';
ALTER SYSTEM FLUSH SHARED_POOL;
INSERT INTO
TB
SELECT
TA.PK, TA.F1, TA.F2, TA.F3
+-----------------------------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | ------------------------------------------ +-----------------------------------+ | 0 | INSERT STATEMENT | | | | 489K | | | 1 | FILTER | | | | | | | 2 | NESTED LOOPS OUTER | | 489K | 9287K | 489K | | | 3 | TABLE ACCESS FULL | TA | 489K | 6354K | 133 | | | 4 | INDEX UNIQUE SCAN | SYS_C007848| 1 | 6 | 1 | | ------------------------------------------ +-----------------------------------+
CPU Time 11.372473,Elapsed Time 11.474917,Rows Affected 0,Blks from Buff 1001327,Blks from Disk 0,Goal=ALL_ROWS
(Rows 0) FILTER (cr=1001327 pr=0 pw=0 time=11474851 us)
(Rows 500000) NESTED LOOPS OUTER (cr=1001327 pr=0 pw=0
time=15500078 us)
(Rows 500000) TABLE ACCESS FULL TA (cr=1325 pr=0 pw=0
time=2000052 us)
(Rows 500000) INDEX UNIQUE SCAN SYS_C007848 (cr=1000002 pr=0
pw=0 time=8231171 us)
That execution took 3-4 times longer than it did when the optimizer was set to 10.2.0.3. Execution plan looks a bit different too.
ROLLBACK;
Set up for test #4:
ALTER SESSION SET TRACEFILE_IDENTIFIER='TESTTD4';
INSERT INTO
TB
SELECT
*
FROM
TA
WHERE
PK NOT IN (
SELECT
PK
FROM
TB);
+-----------------------------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | ----------------------------------------- +-----------------------------------+ | 0 | INSERT STATEMENT | | | | 133 | | | 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | TA | 24K | 318K | 133 | | | 3 | INDEX UNIQUE SCAN | SYS_C007848| 1 | 6 | 2 | | ----------------------------------------- +-----------------------------------+
CPU Time 13.166484,Elapsed Time 13.281404,Rows Affected 400000,Blks from Buff 1506667+CU(23499),Blks from Disk 0,Goal=ALL_ROWS
(Rows 400000) FILTER (cr=1504548 pr=0 pw=0 time=11600102 us)
(Rows 500000) TABLE ACCESS FULL TA (cr=1325 pr=0 pw=0
time=2000058 us)
(Rows 100000) INDEX UNIQUE SCAN SYS_C007848 (cr=1503223 pr=0 pw=0
time=6801430 us)
This method took two seconds longer than the other method.
ROLLBACK;
Set up for test #5:
ALTER SESSION SET TRACEFILE_IDENTIFIER='TESTTD5';
INSERT INTO
TB
SELECT
*
FROM
TA
WHERE
NOT EXISTS (
SELECT
TB.PK
FROM
TB
WHERE
TA.PK=TB.PK);
+-----------------------------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | ----------------------------------------- +-----------------------------------+ | 0 | INSERT STATEMENT | | | | 133 | | | 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | TA | 24K | 318K | 133 | | | 3 | INDEX UNIQUE SCAN | SYS_C007848| 1 | 6 | 2 | | ----------------------------------------- +-----------------------------------+
CPU Time 13.041684,Elapsed Time 13.134901,Rows Affected 400000,Blks
from Buff 1506656+CU(23499),Blks from Disk 0,Goal=ALL_ROWS
(Rows 400000) FILTER (cr=1504548 pr=0 pw=0 time=11200115 us)
(Rows 500000) TABLE ACCESS FULL TA (cr=1325 pr=0 pw=0
time=2000064 us)
(Rows 100000) INDEX UNIQUE SCAN SYS_C007848 (cr=1503223 pr=0 pw=0
time=6757477 us)
Test #4 and #5 executed in about the same amount of time.
I suggest that you try tracing the execution using a 10046 or 10053 trace to determine the fastest execution syntax.
Charles Hooper
PC Support Specialist
K&M Machine-Fabricating, Inc.
Received on Tue Mar 20 2007 - 12:51:00 CDT