Home » SQL & PL/SQL » SQL & PL/SQL » joining huge tables
joining huge tables [message #203277] Tue, 14 November 2006 10:06 Go to next message
hgha
Messages: 3
Registered: November 2006
Junior Member
Hello,

I have two huge tables that I want to join and retrieve a column from each and use these columns to insert a record into a third relatively small (400k Rec). It takes several
hours to be completed which is not acceptable for the time frame that I have. in the following example EITEM has 60 million and EOP has 30 million records. EITEM.EOP_ID and EOP.EOP_ID are indexed. Even without the NOT IN clause db does a lot to accomplish this join. Is there anything that I can do to speed up this query?


INSERT INTO ITEMSTORE (ITEM_CODE, PLANT_ID)
SELECT DISTINCT EITEM.ITEM_CODE, EOP.PLANT_ID FROM EITEM, EOP
WHERE EOP.EOP_ID = EITEM.EOP_ID
AND (EOP.PLANT_ID, EITEM.ITEM_CODE)
NOT IN (SELECT IP.PLANT_ID, IP.ITEM_CODE FROM ITEMSTORE IP);
Re: joining huge tables [message #203301 is a reply to message #203277] Tue, 14 November 2006 12:22 Go to previous messageGo to next message
Littlefoot
Messages: 20901
Registered: June 2005
Location: Croatia, Europe
Senior Member
Account Moderator
You could try with something like this; I really wouldn't know will it run any faster than your current query (actually, it may be slower).

Could you try both solutions using SET AUTOTRACE ON EXPLAIN and compare execution plans?
INSERT INTO itemstore
            (item_code, plant_id)
   SELECT DISTINCT eitem.item_code, eop.plant_id
              FROM eitem, eop
             WHERE eop.eop_id = eitem.eop_id
   MINUS
   SELECT item_code, plant_id
     FROM itemstore;
Re: joining huge tables [message #203307 is a reply to message #203301] Tue, 14 November 2006 13:11 Go to previous messageGo to next message
hgha
Messages: 3
Registered: November 2006
Junior Member
Hello,

thank you for your response. I did compare the plans and your
query . almost the same. The main problem is the select
statement and the join

SELECT DISTINCT eitem.item_code, eop.plant_id
FROM eitem, eop
WHERE eop.eop_id = eitem.eop_id


SQL> INSERT INTO ITEMSTORE
2 (ITEM_CODE, PLANT_ID)
3 SELECT DISTINCT EITEM.ITEM_CODE, EOP.PLANT_ID
4 FROM EITEM, EOP
5 WHERE EOP.EOP_ID = EITEM.EOP_ID
6 MINUS
7 SELECT ITEM_CODE, PLANT_ID
8 FROM ITEMSTORE;

0 ROWS CREATED.


EXECUTION PLAN
----------------------------------------------------------
PLAN HASH VALUE: 996802580

----------------------------------------------------------------------------------------------------
| ID | OPERATION | NAME | ROWS | BYTES |TEMPSPC| COST (%CPU)| TIME |
----------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 514 | 431K| | 7267 (4)| 00:01:28 |
| 1 | MINUS | | | | | | |
| 2 | SORT UNIQUE | | 514 | 17476 | | 7118 (2)| 00:01:26 |
|* 3 | HASH JOIN | | 783K| 25M| 7640K| 7074 (1)| 00:01:25 |
| 4 | VIEW | INDEX$_JOIN$_003 | 390K| 3053K| | 3989 (1)| 00:00:48 |
|* 5 | HASH JOIN | | | | | | |
| 6 | INDEX FAST FULL SCAN| EOP_IDX10 | 390K| 3053K| | 1082 (1)| 00:00:13 |
| 7 | INDEX FAST FULL SCAN| EOP_IDX0 | 390K| 3053K| | 1479 (1)| 00:00:18 |
| 8 | INDEX FAST FULL SCAN | EITEM_IDX1 | 766K| 19M| | 1064 (1)| 00:00:13 |
| 9 | SORT UNIQUE | | 17703 | 414K| 1128K| 149 (2)| 00:00:02 |
| 10 | INDEX FAST FULL SCAN | ITEMSTORE_IDX0 | 17703 | 414K| | 20 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

PREDICATE INFORMATION (IDENTIFIED BY OPERATION ID):
---------------------------------------------------

3 - ACCESS("EOP"."EOP_ID"="EITEM"."EOP_ID")
5 - ACCESS(ROWID=ROWID)


STATISTICS
----------------------------------------------------------
52 RECURSIVE CALLS
0 DB BLOCK GETS
8118 CONSISTENT GETS
6386 PHYSICAL READS
0 REDO SIZE
823 BYTES SENT VIA SQL*NET TO CLIENT
1051 BYTES RECEIVED VIA SQL*NET FROM CLIENT
6 SQL*NET ROUNDTRIPS TO/FROM CLIENT
3 SORTS (MEMORY)
0 SORTS (DISK)
0 ROWS PROCESSED

SQL> INSERT INTO ITEMSTORE (ITEM_CODE, PLANT_ID)
3 SELECT DISTINCT EITEM.ITEM_CODE, EOP.PLANT_ID FROM EITEM, EOP
4 WHERE EOP.EOP_ID=EITEM.EOP_ID
5 AND (EOP.PLANT_ID, EITEM.ITEM_CODE)
6 NOT IN (SELECT IP.PLANT_ID, IP.ITEM_CODE FROM ITEMSTORE IP);

0 ROWS CREATED.


EXECUTION PLAN
----------------------------------------------------------
PLAN HASH VALUE: 2274046690

----------------------------------------------------------------------------------------------------
| ID | OPERATION | NAME | ROWS | BYTES |TEMPSPC| COST (%CPU)| TIME |
----------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | 58 | | 7471 (6)| 00:01:30 |
| 1 | SORT UNIQUE | | 1 | 58 | | 7471 (6)| 00:01:30 |
| 2 | NESTED LOOPS ANTI | | 1 | 58 | | 7470 (6)| 00:01:30 |
|* 3 | HASH JOIN | | 783K| 25M| 7640K| 7074 (1)| 00:01:25 |
| 4 | VIEW | INDEX$_JOIN$_003 | 390K| 3053K| | 3989 (1)| 00:00:48 |
|* 5 | HASH JOIN | | | | | | |
| 6 | INDEX FAST FULL SCAN| EOP_IDX10 | 390K| 3053K| | 1082 (1)| 00:00:13 |
| 7 | INDEX FAST FULL SCAN| EOP_IDX0 | 390K| 3053K| | 1479 (1)| 00:00:18 |
| 8 | INDEX FAST FULL SCAN | EITEM_IDX1 | 766K| 19M| | 1064 (1)| 00:00:13 |
|* 9 | INDEX UNIQUE SCAN | ITEMSTORE_IDX0 | 17703 | 414K| | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

PREDICATE INFORMATION (IDENTIFIED BY OPERATION ID):
---------------------------------------------------

3 - ACCESS("EOP"."EOP_ID"="EITEM"."EOP_ID")
5 - ACCESS(ROWID=ROWID)
9 - ACCESS("EOP"."PLANT_ID"="IP"."PLANT_ID" AND
"EITEM"."ITEM_CODE"="IP"."ITEM_CODE")


STATISTICS
----------------------------------------------------------
218 RECURSIVE CALLS
0 DB BLOCK GETS
41585 CONSISTENT GETS
5910 PHYSICAL READS
0 REDO SIZE
824 BYTES SENT VIA SQL*NET TO CLIENT
1148 BYTES RECEIVED VIA SQL*NET FROM CLIENT
6 SQL*NET ROUNDTRIPS TO/FROM CLIENT
2 SORTS (MEMORY)
0 SORTS (DISK)
0 ROWS PROCESSED
Re: joining huge tables [message #203350 is a reply to message #203307] Tue, 14 November 2006 19:43 Go to previous messageGo to next message
rleishman
Messages: 3724
Registered: October 2005
Location: Melbourne, Australia
Senior Member
I assume EOP_ID is unique in the EOP table, right? If not, then the join will be a mini-cartesian.

Other ideas:
  1. You could add PLANT_ID to the EOP_ID index, that way you wouldn't have to join the two indexes (steps 6 and 7 in your plan below).
  2. You could hash-partition the two tables on EOP_ID, and take advantage of partition-wise joins.
  3. You could use a FAST REFRESH materialized view to represent the join of the two tables (without the NOT IN, I don't think you can do that with FAST REFRESH).
  4. You could store both tables in an INDEX CLUSTER.


The last three options will affecte the performance of your database in other ways. Don't even consider doing them until you discuss with your DBA.

Ross Leishman
Re: joining huge tables [message #203603 is a reply to message #203350] Wed, 15 November 2006 14:07 Go to previous message
hgha
Messages: 3
Registered: November 2006
Junior Member
The last three options are not applicable for us but I tested the index and it imporoves a lot. I will create that index first and drop it after use. Thank you for the suggestion.

hgha
Previous Topic: case or if clause in where statement
Next Topic: Use of TEMP segment
Goto Forum:
  


Current Time: Fri Dec 09 14:01:17 CST 2016

Total time taken to generate the page: 0.07747 seconds