Re: Bulk DML question ...

From: Karen <karen.abgarian_at_fmr.com>
Date: Thu, 09 Nov 2000 10:10:59 -0500
Message-ID: <3A0ABE83.EAEAD4D9@fmr.com>

The temp table the query is populating could be populated from a trigger on TABLE1. The trigger could maintain sums of columns for all different groups. The need to join the tables could be placed on that trigger which could use an index to look up individual GROUP_CODEs.

I realize that it may not be feasible even to create triggers on that big table due to insert performance requirements. Of couse, I dont say you should give up on trying make the query run faster. The thing is it may not be possilble to speed it up.

This is my 2 cents...
Karen

sergey_s_at_my-deja.com wrote:

> Let's say I have two tables:
>
> CREATE TABLE1
> (
> TIME_KEY NUMBER,
> ID NUMBER,
> NAME VARCHAR2(20),
> AMOUNT NUMBER(8,2),
> COUNT NUMBER,
> ...
> );
>
> CREATE TABLE2
> (
> ID NUMBER,
> GROUP_CODE NUMBER,
> ...
> );
>
> TABLE1 has approx. 3 billion rows (almost 200 Gig.). TABLE2 has approx.
> 3 million rows. TABLE1 is partitioned by TIME_KEY (partitions have
> anywhere from 50 to 100 million rows each). I looked into making
> partitions smaller based on additional criteria, but the next candidate
> column has only about 500 unique values.
>
> Every row in TABLE1 has a match in TABLE2. In other words, I have no
> choice but to do full table scans. I already looked into business
> process, and there is no time or money to change it at this point.
>
> So, here's the query I need to run:
>
> CREATE TABLE TEMP_TABLE PARALLEL (DEGREE 8 INSTANCES 1) NOLOGGING AS
> SELECT /*+ ORDERED FULL(A) FULL(B) USE_HASH(B) */
> A.GROUP_CODE, B.NAME, SUM(B.AMOUNT), SUM(B.COUNT)
> FROM TABLE2 A, TABLE1 B
> WHERE A.ID = B.ID
> GROUP BY A.GROUP_CODE, B.NAME;
>
> This query finishes in about 30 hours when the machine is pretty much
> dedicated to my process (needless to say, it's too long for just the
> first step). I am looking at PL/SQL and bulk DML, but I am not sure yet
> how I could apply it in my case. Maybe, there are ways besides bulk DML
> that I could try. If you have any suggestions/thoughts etc., please let
> me know.
>
> I have Oracle 8.0.6.0.0 With Partitioning and Objects Options;
> SunOS 5.7 on 8 processors (400 Mz each), 80 Gig Memory, RAID 5 (66
> disks in total I think, but not sure).
>
> Here's some db settings:
>
> parallel_transaction_resource_timeout 300
> shared_pool_size 209715200
> shared_pool_reserved_size 37847040
> shared_pool_reserved_min_alloc 5K
> large_pool_size 0
> large_pool_min_alloc 16K
> disk_asynch_io TRUE
> dbwr_io_slaves 0
> lgwr_io_slaves 4
> arch_io_slaves 0
> db_file_direct_io_count 64
> db_block_buffers 48000
> db_block_checksum FALSE
> db_block_size 16384
> db_block_checkpoint_batch 8
> db_block_lru_statistics FALSE
> db_block_lru_extended_statistics 0
> db_writer_processes 4
> db_block_lru_latches 8
> db_block_max_dirty_target 4294967294
> db_file_simultaneous_writes 4
> db_file_multiblock_read_count 64
> mts_servers 1
> mts_max_servers 10
> mts_max_dispatchers 10
> local_listener
> mts_listener_address (ADDRESS=(PROTOCOL=ipc)(KEY=PNPKEY))
> mts_multiple_listeners FALSE
> sort_area_size 10485760
> sort_area_retained_size 65536
> sort_direct_writes TRUE
> sort_write_buffers 4
> sort_write_buffer_size 32768
> sort_spacemap_size 512
> sort_read_fac 5
> parallel_min_percent 0
> parallel_default_max_instances 0
> parallel_min_servers 7
> parallel_max_servers 32
> parallel_server_idle_time 3
> parallel_execution_message_size 2148
> parallel_min_message_pool 225540
> hash_join_enabled TRUE
> hash_area_size 13107200
> hash_multiblock_io_count 4
>
> Thank you very much!
> Sergey
>
> Sent via Deja.com http://www.deja.com/
> Before you buy.
Received on Thu Nov 09 2000 - 09:10:59 CST