Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.misc -> Re: Insert statement taking a very long time to complete

Re: Insert statement taking a very long time to complete

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: 20 Mar 2007 10:51:00 -0700
Message-ID: <1174413060.590685.115430@p15g2000hsd.googlegroups.com>


On Mar 20, 11:16 am, "gor..._at_panix.com" <gor..._at_panix.com> wrote:
> [I recently posted this on .server; I apologize for the double-
> posting.]
>
> I'm having some trouble with an SQL insert statement taking waaay too
> long to run. It's been suggested that maybe I'm using inefficient
> SQL, and that there might be a faster way to achieve the results.
>
> I have two tables with identical structure, and I want to copy all
> records that exist in table A but not in table B, into table B.
>
> Assuming that the structure is primary key "pk" and data fields "f1",
> "f2" and "f3", here is the SQL statement I'm using:
>
> insert into tableb(pk, f1, f2, f3)
> (select * from tablea where tablea.pk not in (select pk from
> tableb))
>
> This statement has worked well for small amounts of data, but when I
> tried on a table with approximately 200k records, it took forever to
> finish.
>
> A coworker with some SQL experience thinks that the "where" clause is
> being executed for *every* insertion. If so, this could certainly
> account for much of the slowdown.
>
> Anyone have any suggestions?
>
> --
> John Gordon

Mark offers several good ideas. Keep in mind that Oracle can silently rewrite your SQL statements, which will affect execution time. A quick example with 10053 and translated 10046 STAT line output (for some reason, the elapsed times reported by the two traces differ - 10046 trace is correct).

CREATE TABLE TA (

  PK NUMBER(12),
  F1 NUMBER(12),
  F2 NUMBER(12),
  F3 NUMBER(12),

  PRIMARY KEY(PK)); CREATE TABLE TB (
  PK NUMBER(12),
  F1 NUMBER(12),
  F2 NUMBER(12),
  F3 NUMBER(12),

  PRIMARY KEY(PK)); INSERT INTO
  TA
SELECT
  ROWNUM*2,
  TRUNC(DBMS_RANDOM.VALUE(1,100))+1,
  TRUNC(DBMS_RANDOM.VALUE(1,100))+1,
  TRUNC(DBMS_RANDOM.VALUE(1,100))+1

FROM
  SOME_BIG_TABLE
WHERE
  ROWNUM<=500000;

500000 rows created.

INSERT INTO
  TB
SELECT
  ROWNUM*5,

  TRUNC(DBMS_RANDOM.VALUE(1,100))+1,
  TRUNC(DBMS_RANDOM.VALUE(1,100))+1,
  TRUNC(DBMS_RANDOM.VALUE(1,100))+1

FROM
  SOME_BIG_TABLE
WHERE
  ROWNUM<=500000;

500000 rows created.

COMMIT; Now that we have the tables with data, gather statistics: EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TABLE_OWNER_HERE',TABNAME=>'TA',CASCADE=>TRUE); EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TABLE_OWNER_HERE',TABNAME=>'TB',CASCADE=>TRUE); Set up for the first test:
ALTER SESSION SET TRACEFILE_IDENTIFIER='TESTTD1'; ALTER SESSION SET EVENTS '10046 TRACE NAME CONTEXT FOREVER, LEVEL 8'; ALTER SESSION SET EVENTS '10053 TRACE NAME CONTEXT FOREVER, LEVEL 1'; ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='10.2.0.3'; INSERT INTO
  TB
SELECT

  TA.PK,
  TA.F1,
  TA.F2,
  TA.F3

FROM
  TA,
  TB
WHERE
  TA.PK=TB.PK(+)
  AND TB.PK IS NULL; 400000 rows created.

10053 plan:


+-----------------------------------+
| Id  | Operation              | Name       | Rows  | Bytes | Cost  |
Time      |
--------------------------------------------
+-----------------------------------+
| 0   | INSERT STATEMENT       |            |       |       |  2774
|           |
| 1   |  HASH JOIN RIGHT ANTI  |            |     1 |    19 |  2774 |
00:00:43 |
| 2 | INDEX FAST FULL SCAN | SYS_C007848| 491K | 2947K | 384 | 00:00:06 |
| 3 | TABLE ACCESS FULL | TA | 489K | 6354K | 530 | 00:00:09 |
+-----------------------------------+

Predicate Information:

1 - access("TA"."PK"="TB"."PK")

10046 STAT lines
CPU Time 4.383628,Elapsed Time 4.450739,Rows Affected 400000,Blks from Buff 5881+CU(23455),Blks from Disk 0,Goal=ALL_ROWS   (Rows 400000) HASH JOIN RIGHT ANTI (cr=3793 pr=0 pw=0 time=3345121 us)
  (Rows 500000) INDEX FAST FULL SCAN SYS_C007848 (cr=2468 pr=0 pw=0 time=1500148 us)
  (Rows 500000) TABLE ACCESS FULL TA (cr=1325 pr=0 pw=0 time=1512393 us)

ROLLBACK; Set up for test #2:
ALTER SESSION SET TRACEFILE_IDENTIFIER='TESTTD2'; INSERT INTO
  TB
SELECT
  *
FROM
  TA
WHERE
  PK NOT IN (
    SELECT
      PK
    FROM
      TB);

400000 rows created.


+-----------------------------------+
| Id  | Operation              | Name       | Rows  | Bytes | Cost  |
Time      |
--------------------------------------------
+-----------------------------------+
| 0   | INSERT STATEMENT       |            |       |       |  2774
|           |
| 1   |  HASH JOIN RIGHT ANTI  |            |     1 |    19 |  2774 |
00:00:43 |
| 2 | INDEX FAST FULL SCAN | SYS_C007848| 491K | 2947K | 384 | 00:00:06 |
| 3 | TABLE ACCESS FULL | TA | 489K | 6354K | 530 | 00:00:09 |
+-----------------------------------+

Predicate Information:

1 - access("PK"="PK")

CPU Time 4.399228,Elapsed Time 4.427972,Rows Affected 400000,Blks from Buff 5904+CU(23503),Blks from Disk 0,Goal=ALL_ROWS   (Rows 400000) HASH JOIN RIGHT ANTI (cr=3793 pr=0 pw=0 time=3745658 us)
  (Rows 500000) INDEX FAST FULL SCAN SYS_C007848 (cr=2468 pr=0 pw=0 time=1500112 us)
  (Rows 500000) TABLE ACCESS FULL TA (cr=1325 pr=0 pw=0 time=1511650 us)

ROLLBACK; Note that the execution times and plans are the same - Oracle rewrote the SQL statement.

Set up for test #3, pretend that we are running Oracle 8.1.7: ALTER SESSION SET TRACEFILE_IDENTIFIER='TESTTD3'; ALTER SESSION SET OPTIMIZER_FEATURES_ENABLE='8.1.7'; ALTER SYSTEM FLUSH SHARED_POOL; INSERT INTO
  TB
SELECT

  TA.PK,
  TA.F1,
  TA.F2,
  TA.F3

FROM
  TA,
  TB
WHERE
  TA.PK=TB.PK(+)
  AND TB.PK IS NULL;
+-----------------------------------+
| Id  | Operation            | Name       | Rows  | Bytes | Cost  |
Time      |
------------------------------------------
+-----------------------------------+
| 0   | INSERT STATEMENT     |            |       |       |  489K
|           |
| 1   |  FILTER              |            |       |       |
|           |
| 2   |   NESTED LOOPS OUTER |            |  489K | 9287K |  489K
|           |
| 3   |    TABLE ACCESS FULL | TA         |  489K | 6354K |   133
|           |
| 4   |    INDEX UNIQUE SCAN | SYS_C007848|     1 |     6 |     1
|           |
------------------------------------------
+-----------------------------------+

Predicate Information:

1 - filter("TB"."PK" IS NULL)
4 - access("TA"."PK"="TB"."PK")

CPU Time 11.372473,Elapsed Time 11.474917,Rows Affected 0,Blks from Buff 1001327,Blks from Disk 0,Goal=ALL_ROWS

       (Rows 0) FILTER (cr=1001327 pr=0 pw=0 time=11474851 us)   (Rows 500000) NESTED LOOPS OUTER (cr=1001327 pr=0 pw=0 time=15500078 us)
  (Rows 500000) TABLE ACCESS FULL TA (cr=1325 pr=0 pw=0 time=2000052 us)
  (Rows 500000) INDEX UNIQUE SCAN SYS_C007848 (cr=1000002 pr=0 pw=0 time=8231171 us)

That execution took 3-4 times longer than it did when the optimizer was set to 10.2.0.3. Execution plan looks a bit different too.

ROLLBACK; Set up for test #4:
ALTER SESSION SET TRACEFILE_IDENTIFIER='TESTTD4'; INSERT INTO
  TB
SELECT
  *
FROM
  TA
WHERE
  PK NOT IN (
    SELECT
      PK
    FROM
      TB);


+-----------------------------------+
| Id  | Operation           | Name       | Rows  | Bytes | Cost  |
Time      |
-----------------------------------------
+-----------------------------------+
| 0   | INSERT STATEMENT    |            |       |       |   133
|           |
| 1   |  FILTER             |            |       |       |
|           |
| 2   |   TABLE ACCESS FULL | TA         |   24K |  318K |   133
|           |
| 3   |   INDEX UNIQUE SCAN | SYS_C007848|     1 |     6 |     2
|           |
-----------------------------------------
+-----------------------------------+

Predicate Information:

1 - filter( IS NULL)
3 - access("PK"=:B1)

CPU Time 13.166484,Elapsed Time 13.281404,Rows Affected 400000,Blks from Buff 1506667+CU(23499),Blks from Disk 0,Goal=ALL_ROWS

  (Rows 400000) FILTER (cr=1504548 pr=0 pw=0 time=11600102 us)   (Rows 500000) TABLE ACCESS FULL TA (cr=1325 pr=0 pw=0 time=2000058 us)
  (Rows 100000) INDEX UNIQUE SCAN SYS_C007848 (cr=1503223 pr=0 pw=0 time=6801430 us)

This method took two seconds longer than the other method.

ROLLBACK; Set up for test #5:
ALTER SESSION SET TRACEFILE_IDENTIFIER='TESTTD5'; INSERT INTO
  TB
SELECT
  *
FROM
  TA
WHERE
  NOT EXISTS (
    SELECT
      TB.PK
    FROM
      TB
    WHERE
      TA.PK=TB.PK);


+-----------------------------------+
| Id  | Operation           | Name       | Rows  | Bytes | Cost  |
Time      |
-----------------------------------------
+-----------------------------------+
| 0   | INSERT STATEMENT    |            |       |       |   133
|           |
| 1   |  FILTER             |            |       |       |
|           |
| 2   |   TABLE ACCESS FULL | TA         |   24K |  318K |   133
|           |
| 3   |   INDEX UNIQUE SCAN | SYS_C007848|     1 |     6 |     2
|           |
-----------------------------------------
+-----------------------------------+

Predicate Information:

1 - filter( IS NULL)
3 - access("TB"."PK"=:B1)

CPU Time 13.041684,Elapsed Time 13.134901,Rows Affected 400000,Blks from Buff 1506656+CU(23499),Blks from Disk 0,Goal=ALL_ROWS   (Rows 400000) FILTER (cr=1504548 pr=0 pw=0 time=11200115 us)   (Rows 500000) TABLE ACCESS FULL TA (cr=1325 pr=0 pw=0 time=2000064 us)
  (Rows 100000) INDEX UNIQUE SCAN SYS_C007848 (cr=1503223 pr=0 pw=0 time=6757477 us)

Test #4 and #5 executed in about the same amount of time.

I suggest that you try tracing the execution using a 10046 or 10053 trace to determine the fastest execution syntax.

Charles Hooper
PC Support Specialist
K&M Machine-Fabricating, Inc. Received on Tue Mar 20 2007 - 12:51:00 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US