Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Bulk DML question ...

Re: Bulk DML question ...

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Tue, 7 Nov 2000 21:31:27 -0000
Message-ID: <973628946.28797.3.nnrp-12.9e984b29@news.demon.co.uk>

I should also have asked:

    If two rows have the same name
    do they have the same ID.

Still, assuming they do, this should
give you the same result, and may
be quicker because the sort/group
is on an narrower set, and the join
is between fewer rows:

(e and o.e. in the text)

CREATE TABLE TEMP_TABLE PARALLEL (DEGREE 8 INSTANCES 1) NOLOGGING AS
select

    a.group_code,

    il.name
    il.sum_amount,
    il.sum_count

    (
    select /*+ no_merge */
        b.id,
        b.name,
        sum(b.amount)    sum_amount,
        sum(b.count)        sum_count            --  how did you get COUNT
as a column name ?

    from

        table1 b
    group by b.id, b.name
    ) il,
    table2 a
where

    a.id = il.id
;

--

Jonathan Lewis
Yet another Oracle-related web site:  http://www.jlcomp.demon.co.uk

Practical Oracle 8i:  Building Efficient Databases
Publishers:  Addison Wesley Longman
Book bound date now 8th Dec 2000

sergey_s_at_my-deja.com wrote in message <8u7u1j$igr$1_at_nnrp1.deja.com>...

>
>> A coupld of quick question for this query though, is ID
>> unique on the smaller table ?
>
>Yes, it is.
>
>>Is the drop in row count
>> largely due to repeated group_codes or repeated names ?
>
>Due to repeated names.
>
>
>Thank you!
>Sergey
>
>
>In article <973373984.28182.0.nnrp-02.9e984b29_at_news.demon.co.uk>,
> "Jonathan Lewis" <jonathan_at_jlcomp.demon.co.uk> wrote:
>>
>> Aggregating 3Billion rows down to 1.2B rows
>> is bound to be very slow. It is possible that
>> your strategy of breaking the aggregation into
>> steps needs to be reviewed so that you optimize
>> a step which has a higher density of aggregation.
>>
>> A coupld of quick question for this query though, is ID
>> unique on the smaller table ? Is the drop in row count
>> largely due to repeated group_codes or repeated names ?
>>
>> --
>>
>> Jonathan Lewis
>> Yet another Oracle-related web site: http://www.jlcomp.demon.co.uk
>>
>> Practical Oracle 8i: Building Efficient Databases
>> Publishers: Addison Wesley Longman
>> Book bound date now 1st Dec 2000
>>
>> sergey_s_at_my-deja.com wrote in message <8tv6in$ta4$1_at_nnrp1.deja.com>...
>> >Yes, the explain plan shows hash join and every step is
>> >parallel-to-parallel (except for the collection at the end, of course
-
>> >that one's parallel-to-serial). The query creates a "processing
table"
>> >(approx. 1.2 billion rows less than the base table). This processing
>> >table is later queried to create more temporary tables which in turn
are
>> >used to create the final aggregate tables. Basically, the aggregation
>> >process was broken down into smaller steps for performance reasons
(we
>> >achieved 30% reduction in time).
>> >
>> >Thank you!
>> >Sergey
>> >
>> >
>> >In article <8tuua4$lf8$1_at_nnrp1.deja.com>,
>> > jdarrah_co_at_my-deja.com wrote:
>> >> Does the explain plan show that its using hash and that the
statement
>> >> is running in parrallel? Also, what's the next step after you
create
>> >> the table?
>> >>
>> >> In article <8tqg61$15b$1_at_nnrp1.deja.com>,
>> >> sergey_s_at_my-deja.com wrote:
>> >> > Let's say I have two tables:
>> >> >
>> >> > CREATE TABLE1
>> >> > (
>> >> > TIME_KEY NUMBER,
>> >> > ID NUMBER,
>> >> > NAME VARCHAR2(20),
>> >> > AMOUNT NUMBER(8,2),
>> >> > COUNT NUMBER,
>> >> > ...
>> >> > );
>> >> >
>> >> > CREATE TABLE2
>> >> > (
>> >> > ID NUMBER,
>> >> > GROUP_CODE NUMBER,
>> >> > ...
>> >> > );
>> >> >
>> >> > TABLE1 has approx. 3 billion rows (almost 200 Gig.). TABLE2 has
approx.
>> >> > 3 million rows. TABLE1 is partitioned by TIME_KEY (partitions
have
>> >> > anywhere from 50 to 100 million rows each). I looked into making
>> >> > partitions smaller based on additional criteria, but the next
candidate
>> >> > column has only about 500 unique values.
>> >> >
>> >> > Every row in TABLE1 has a match in TABLE2. In other words, I have
no
>> >> > choice but to do full table scans. I already looked into business
>> >> > process, and there is no time or money to change it at this
point.
>> >> >
>> >> > So, here's the query I need to run:
>> >> >
>> >> > CREATE TABLE TEMP_TABLE PARALLEL (DEGREE 8 INSTANCES 1) NOLOGGING
AS
>> >> > SELECT /*+ ORDERED FULL(A) FULL(B) USE_HASH(B) */
>> >> > A.GROUP_CODE, B.NAME, SUM(B.AMOUNT), SUM(B.COUNT)
>> >> > FROM TABLE2 A, TABLE1 B
>> >> > WHERE A.ID = B.ID
>> >> > GROUP BY A.GROUP_CODE, B.NAME;
>> >> >
>> >> > This query finishes in about 30 hours when the machine is pretty
much
>> >> > dedicated to my process (needless to say, it's too long for just
the
>> >> > first step). I am looking at PL/SQL and bulk DML, but I am not
sure yet
>> >> > how I could apply it in my case. Maybe, there are ways besides
bulk DML
>> >> > that I could try. If you have any suggestions/thoughts etc.,
please let
>> >> > me know.
>> >> >
>> >> > I have Oracle 8.0.6.0.0 With Partitioning and Objects Options;
>> >> > SunOS 5.7 on 8 processors (400 Mz each), 80 Gig Memory, RAID 5
(66
>> >> > disks in total I think, but not sure).
>> >> >
>> >> > Here's some db settings:
>> >> >
>> >> > parallel_transaction_resource_timeout 300
>> >> > shared_pool_size 209715200
>> >> > shared_pool_reserved_size 37847040
>> >> > shared_pool_reserved_min_alloc 5K
>> >> > large_pool_size 0
>> >> > large_pool_min_alloc 16K
>> >> > disk_asynch_io TRUE
>> >> > dbwr_io_slaves 0
>> >> > lgwr_io_slaves 4
>> >> > arch_io_slaves 0
>> >> > db_file_direct_io_count 64
>> >> > db_block_buffers 48000
>> >> > db_block_checksum FALSE
>> >> > db_block_size 16384
>> >> > db_block_checkpoint_batch 8
>> >> > db_block_lru_statistics FALSE
>> >> > db_block_lru_extended_statistics 0
>> >> > db_writer_processes 4
>> >> > db_block_lru_latches 8
>> >> > db_block_max_dirty_target 4294967294
>> >> > db_file_simultaneous_writes 4
>> >> > db_file_multiblock_read_count 64
>> >> > mts_servers 1
>> >> > mts_max_servers 10
>> >> > mts_max_dispatchers 10
>> >> > local_listener
>> >> > mts_listener_address (ADDRESS=(PROTOCOL=ipc)(KEY=PNPKEY))
>> >> > mts_multiple_listeners FALSE
>> >> > sort_area_size 10485760
>> >> > sort_area_retained_size 65536
>> >> > sort_direct_writes TRUE
>> >> > sort_write_buffers 4
>> >> > sort_write_buffer_size 32768
>> >> > sort_spacemap_size 512
>> >> > sort_read_fac 5
>> >> > parallel_min_percent 0
>> >> > parallel_default_max_instances 0
>> >> > parallel_min_servers 7
>> >> > parallel_max_servers 32
>> >> > parallel_server_idle_time 3
>> >> > parallel_execution_message_size 2148
>> >> > parallel_min_message_pool 225540
>> >> > hash_join_enabled TRUE
>> >> > hash_area_size 13107200
>> >> > hash_multiblock_io_count 4
>> >> >
>> >> > Thank you very much!
>> >> > Sergey
>> >> >
>> >> > Sent via Deja.com http://www.deja.com/
>> >> > Before you buy.
>> >> >
>> >>
>> >> Sent via Deja.com http://www.deja.com/
>> >> Before you buy.
>> >>
>> >
>> >
>> >Sent via Deja.com http://www.deja.com/
>> >Before you buy.
>>
>>
>
>
>Sent via Deja.com http://www.deja.com/
>Before you buy.
Received on Tue Nov 07 2000 - 15:31:27 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US