Re: Table update is taking too much time

From: Lothar Flatz <l.flatz_at_bluewin.ch>
Date: Tue, 31 May 2016 20:12:59 +0200
Message-ID: <6ec4ea6e-75e1-812c-75a7-0108d854a261_at_bluewin.ch>



Hi Mark,

Very good reply. I agree on working at the partition level (what else?). I disagree a bit on the DOP: if the partition is big enough (probably true for the child tables), parallel works fine, but with a lower degree. However, you could probably do more than one partition at a time using dbms_parallel_execute.
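
A minimal sketch of the dbms_parallel_execute approach, for the archives (the task name, chunk size, and parallel level are illustrative assumptions, not tested against these tables):

  BEGIN
    DBMS_PARALLEL_EXECUTE.CREATE_TASK('fix_utid');
    -- Split the big child table into rowid-range chunks of ~10000 rows each
    DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
      task_name   => 'fix_utid',
      table_owner => USER,
      table_name  => 'CHILD_TABLE_1',
      by_row      => TRUE,
      chunk_size  => 10000);
    -- Run the update over the chunks with a modest degree of concurrency
    -- (the real statement would also need the ut_id range and facility predicates)
    DBMS_PARALLEL_EXECUTE.RUN_TASK(
      task_name      => 'fix_utid',
      sql_stmt       => 'UPDATE child_table_1 SET ut_id = ut_id + 9000000000
                          WHERE rowid BETWEEN :start_id AND :end_id',
      language_flag  => DBMS_SQL.NATIVE,
      parallel_level => 4);
    DBMS_PARALLEL_EXECUTE.DROP_TASK('fix_utid');
  END;
  /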

regards

Lothar

On 31.05.2016 19:42, Mark W. Farnham wrote:
>
> Search up Tim Gorman’s paper on the virtuous cycle and scaling to
> infinity and read the bits about partition exchange.
>
> Search up my paper about when physical ordering makes sense.
>
> Now apart from those considerations, since you have 3 times 85
> partitions you can probably come up with a number of partitions to be
> working on concurrently.
>
> You’ll be creating copies of each partition, not the whole table. I
> cannot recommend details of minimizing time to revert if you make a
> horrible error without knowing more things like the relationship of
> your partitions to tablespaces. (Some organization with a relationship
> between age of the data in a partition and tablespaces is usually
> useful for the life cycle management of tablespaces and whether you
> can make them read only and get a few safe backups of them to reload
> with only a tiny amount of recovery as files.)
>
> Dual partitioning of time, account range (id in your case, I think)
> would probably be useful for query speed in the resulting image of
> this information. That would require significant thought and knowledge
> of the queries that dominate your use of these tables. But if your
> queries and data are like the preponderance I see, and you partition
> by some variety of time and account and physically order each
> partition as it becomes a frozen record of history by account and
> time (not time and account that is the partition prune, but rather
> account then time within each partition so that you minimize the
> number of blocks that need to be visited to get everything about a
> particular account) then probably you’ll get something at least on the
> order of an optimal solution. (Some variety of clusters might also fit
> the bill, but that is quite a departure from what you’re already
> doing, so I’ll let others weigh in on that.)
>
> Now, regarding Lothar’s advice about DOP, if you are doing several
> partitions at once, then you probably want to do each of them at DOP
> 1. (His advice is notably NOT WRONG, but I’m pretty sure he was
> referencing the challenge as three monoliths, not partitioned
> tables.) DOP 1 skips the reassembly work between the coordinator and
> the workers, and since you’ll choose enough partitions to process
> concurrently to keep the machine very busy, that concurrency replaces
> DOP as the way of utilizing the hardware. IF storage limitations mean
> you cannot do a sufficient number of partitions in parallel to keep
> the machine busy, then DO USE Lothar’s advice about DOP.
>
> You may choose to create a new destination with dual partitioning,
> which would be a slight wrinkle on the partition exchange bit.
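>
> As a sketch of the exchange step itself (the partition and staging
> table names here are illustrative, not from this thread), assuming the
> corrected rows for one quarterly partition have been loaded into a
> staging table of identical shape:
>
> -- Staging table must match the partitioned table's column layout exactly
> CREATE TABLE child_table_1_stage NOLOGGING AS
>   SELECT * FROM child_table_1 WHERE 1 = 0;
>
> -- (load the corrected rows for partition p2016q1 into the staging
> --  table, then build indexes on it to match the local indexes)
>
> ALTER TABLE child_table_1
>   EXCHANGE PARTITION p2016q1 WITH TABLE child_table_1_stage
>   INCLUDING INDEXES WITHOUT VALIDATION;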
>
> Good luck,
>
> mwf
>
> *From:*oracle-l-bounce_at_freelists.org
> [mailto:oracle-l-bounce_at_freelists.org] *On Behalf Of *Ashoke Mandal
> *Sent:* Tuesday, May 31, 2016 12:35 PM
> *To:* Deas, Scott
> *Cc:* ORACLE-L
> *Subject:* Re: Table update is taking too much time
>
> Hi Scott, Lothar and Brent,
>
> Thanks for your tips.
>
> I have some questions before I try the insert method.
>
> 1. All three tables are partitioned by quarter on test_date. We have
> data from 1992. From 1992 through 1996 the partitions were yearly (one
> partition per year), but from 1997 through 2016 we have one partition
> for every quarter, so there are 85 partitions on each of these tables.
> Both child tables contain multiple records corresponding to each
> record in the master table.
>
> 2. For some reason even a simple select statement takes a long time
> on each of these tables.
>
> 2.1 select count(*) from master_table; takes 11 min => records to be
> updated in master_table: 2,437,461 (total records in this table:
> 222,760,096)
>
> 2.2 select count(*) from child_table_1; takes 1 hour => records to be
> updated in child_table_1: 370,569,916 (total records in this table:
> 20,355,000,000)
>
> 2.3 select count(*) from child_table_2; takes 3 hours => records to be
> updated in child_table_2: 4,594,843 (total records in this table:
> 1,843,018,199)
>
> 3. If I create a temporary copy of each of these tables using CTAS, it
> would create a regular table, not a partitioned table. So I could
> generate the create script for each of these tables using TOAD or SQL
> Developer, then expand the 255 (85*3) tablespaces to double their size
> to host two copies of these tables. I am somewhat worried about
> touching these partitioned tables; if something goes wrong, recovery
> of this 3 TB database will be time consuming.
>
> Do you think there is some issue with my tables that is causing the
> simple queries to take so long (as mentioned above)?
>
> Could I speed up the update process using my PL/SQL code? That seems
> safer, as I don't need to touch the entire data set and don't need to
> rebuild the indexes on the biggest table, which is 1 TB.
>
> Ashoke
>
> On Sun, May 29, 2016 at 5:17 PM, Deas, Scott <Scott.Deas_at_lfg.com
> <mailto:Scott.Deas_at_lfg.com>> wrote:
>
> Ashoke,
>
> I assume you have some sort of maintenance window allocated for this
> work, so my recommendation would be to not do the update in place, but
> rather to do something like the following:
>
> 1. CTAS for each table you’re changing to create an empty copy of the
> table with the nologging option
>
> 2. Create a lookup table where you store the original ut_id and new ut_id
>
> 3. Parallel, direct path insert from the original table (joining to your
> lookup table to get the new ut_id) into the new tables
>
> 4. Build all indexes/constraints
>
> 5. Verify data
>
> 6. Enable logging on tables/indexes
>
> 7. Rename objects so the new tables now have the original table names
>
> All of this assumes the structure of these objects isn’t changing.
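>
> As a rough sketch of steps 1-3 (the lookup-table and column names are
> assumptions for illustration; the remaining columns are elided):
>
> CREATE TABLE master_table_new NOLOGGING AS
>   SELECT * FROM master_table WHERE 1 = 0;
>
> -- Lookup table mapping original ut_id to new ut_id
> CREATE TABLE utid_map (old_ut_id NUMBER PRIMARY KEY,
>                        new_ut_id NUMBER NOT NULL);
>
> -- Parallel, direct-path insert, remapping ut_id via the lookup table
> INSERT /*+ APPEND PARALLEL(t, 8) */ INTO master_table_new t
>   SELECT /*+ PARALLEL(m, 8) */
>          NVL(x.new_ut_id, m.ut_id) /* , ... remaining columns ... */
>     FROM master_table m
>     LEFT JOIN utid_map x ON x.old_ut_id = m.ut_id;
> COMMIT;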
>
> You may also want to look at this whitepaper from Bryn Llewellyn that
> discusses using PL/SQL to transform one table into another, when it’s
> changing.
> https://blogs.oracle.com/plsql-and-ebr/resource/Transforming_One_Table_to_Another_Sql_or_Plsql_Whitepaper.pdf
>
> Thanks,
>
> Scott
>
> *From:*oracle-l-bounce_at_freelists.org
> <mailto:oracle-l-bounce_at_freelists.org>
> [mailto:oracle-l-bounce_at_freelists.org
> <mailto:oracle-l-bounce_at_freelists.org>] *On Behalf Of *Ashoke Mandal
> *Sent:* Sunday, May 29, 2016 5:42 PM
> *To:* ORACLE-L <oracle-l_at_freelists.org <mailto:oracle-l_at_freelists.org>>
> *Subject:* Table update is taking too much time
>
> Dear All, please let me know if you have any suggestions to make this
> update faster, as six days is too long.
>
> Background: Due to some unexpected reason, we have the same set of
> test_ids in the table for two different testing facilities. We are
> trying to change the ut_ids of one facility to resolve the duplication
> issue using the following code. It works, but it takes *6* days to
> complete the update, and I am looking for tips to make it faster. The
> ut_id is part of the primary key in all three tables (master_table,
> child_table_1 and child_table_2).
>
> Records to be updated in master_table: 2,437,461 (total records in
> this table: 222,760,096)
>
> Records to be updated in child_table_1: 370,569,916
>
> Records to be updated in child_table_2: 4,594,843
>
> I am using the following UNIX shell script and a stored procedure:
>
> *=> Calling shell script code:*
>
> export PATH=$PATH:/opt/bin:/usr/ccs/bin:/usr/sbin:/var/opt/oracle
> export DMAS_ROOT=/med_oracle/dmas/production
> export start_utid=$1
> export end_utid=$2
> export ORACLE_SID=$3
>
> # Set the Oracle environment for the specified database ($3)
> export ORAENV_ASK=NO
> . /usr/local/bin/oraenv
>
> sqlplus user/password <<EOF1
> -- Disable the foreign key constraints on the child tables
> ALTER TABLE CHILD_TABLE_1 DISABLE CONSTRAINT CHILD_TABLE_1_FK_UT_ID_TEST_DATE;
> ALTER TABLE CHILD_TABLE_2 DISABLE CONSTRAINT CHILD_TABLE_2_FK_UT_ID_TEST_DATE;
>
> -- Stage the impacted master rows in the temporary working table
> TRUNCATE TABLE master_table_temp;
> INSERT INTO master_table_temp
>   SELECT * FROM master_table
>    WHERE ut_id >= ${start_utid} AND ut_id <= ${end_utid}
>      AND test_facility_code <> 'E';
>
> -- Call the procedure to update the UT_IDs
> EXECUTE proc_update_utid ('$1','$2');
>
> -- Re-enable the foreign key constraints
> ALTER TABLE CHILD_TABLE_1 ENABLE NOVALIDATE CONSTRAINT CHILD_TABLE_1_FK_UT_ID_TEST_DATE;
> ALTER TABLE CHILD_TABLE_2 ENABLE NOVALIDATE CONSTRAINT CHILD_TABLE_2_FK_UT_ID_TEST_DATE;
> EXIT;
> EOF1
>
> *=> Called stored procedure*
>
> create or replace PROCEDURE proc_update_utid(start_utid number,
>                                              end_utid   number) AS
>   lv_rec_count     number := 0;
>   lv_update_number number := 0;
>   lv_total_rec_cnt number := 0;
>   lv_last_utid     number;
>   lv_logfile       varchar2(40);
>   lv_db_name       varchar2(8);
>   output_file      UTL_FILE.FILE_TYPE;
>
>   -- Cursor over the impacted records staged in the temporary working table
>   cursor ut_cur is select * from master_table_temp order by ut_id;
> begin
>   dbms_output.enable(10000000); -- set up test output
>
>   -- Build the log file name from the database name
>   select name into lv_db_name from v\$database;
>   lv_logfile  := lv_db_name || '_' || 'update.log';
>   output_file := UTL_FILE.FOPEN('/med_oracle/dmas/production/log',
>                                 lv_logfile, 'A');
>   utl_file.put_line(output_file, 'Start Time: ' ||
>                     to_char(sysdate, 'DD-MON-YYYY HH24:MI:SS'));
>
>   -- Count the duplicate UT_IDs to be updated
>   select count(*) into lv_total_rec_cnt from master_table
>    where ut_id >= start_utid and ut_id <= end_utid
>      and test_facility_code <> 'E';
>   utl_file.put_line(output_file, 'Total number of duplicate UT_IDs ' ||
>                     'before the update: ' || lv_total_rec_cnt);
>
>   for ur_rec in ut_cur loop
>     update master_table set ut_id = ut_id + 9000000000
>      where ut_id = ur_rec.ut_id and test_facility_code <> 'E';
>
>     update child_table_1 set ut_id = ut_id + 9000000000
>      where ut_id = ur_rec.ut_id and test_date = ur_rec.test_date;
>
>     update child_table_2 set ut_id = ut_id + 9000000000
>      where ut_id = ur_rec.ut_id and test_date = ur_rec.test_date;
>
>     lv_rec_count := lv_rec_count + 1;
>     lv_last_utid := ur_rec.ut_id;
>
>     -- Commit every 1000 records and log progress to update_status
>     if lv_rec_count = 1000 then
>       lv_update_number := lv_update_number + 1;
>       insert into update_status
>         values (lv_update_number, ur_rec.ut_id, sysdate);
>       commit;
>       lv_rec_count := 0;
>     end if;
>   end loop;
>
>   -- Handle the final batch of fewer than 1000 rows
>   if lv_rec_count > 0 then
>     lv_update_number := lv_update_number + 1;
>     insert into update_status
>       values (lv_update_number, lv_last_utid, sysdate);
>     commit;
>   end if;
>
>   -- Verify that all UT_IDs in the range with facility code <> 'E' have
>   -- been reassigned; if so, the following count should be zero.
>   select count(*) into lv_total_rec_cnt from master_table
>    where ut_id >= start_utid and ut_id <= end_utid
>      and test_facility_code <> 'E';
>   dbms_output.put_line('The total duplicate UT_IDs in master_table is ' ||
>                        lv_total_rec_cnt);
>
>   utl_file.put_line(output_file, 'Total number of duplicate UT_IDs ' ||
>                     'after the update: ' || lv_total_rec_cnt);
>   utl_file.put_line(output_file, 'End Time: ' ||
>                     to_char(sysdate, 'DD-MON-YYYY HH24:MI:SS'));
>   utl_file.fclose(output_file);
> end proc_update_utid;
>
> Thanks,
>
> Ashoke
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue May 31 2016 - 20:12:59 CEST
