RE: Slow create table statement

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Fri, 10 Jul 2020 14:21:36 -0400
Message-ID: <267601d656e6$f322f6b0$d968e410$_at_rsiz.com>



A useful test is to discover whether or not things like continued rows, migrated rows, bubble space and all that is the problem or just that access is too slow.  

Now if the life pattern for a row is insert a small number of rows with key columns and then update the new rows sporadically for a while intermittent with other rows being inserted, but individual rows tend to be quiescent after a reasonably short period of time, then gains due to a rebuild MAY be semi-permanent.  

On the opposite polar extreme if row without correlation to keys or calendar time tend to have lengthening column values more or less endlessly, then any rebuild you do will be pretty perishable.  

A whole heckuva lot of business data is like the former. A whole heckuva log of scientific data is like the latter. Your mileage may vary.  

So if you do the test I propose and it make everything faster (for now), understand that may be perishable.  

The most polar case FOR the efficacy of rebuilding is when the primary partitioning is by something that correlates with “row created on date” and there is some elapsed time after creation after which updates to old rows are either forbidden or very sparse.  

That said, the whether it even works in the short term test is pretty simple:  

Pick a small enough test size to finish in a reasonable amount of time, copy the rows in the order of your most popular index order up to that size limit into a new place. The presumption is this will be slow per row.  

(Doing this test produces an actual maximum improvement metric. While it may be possible to assemble all the possible diagnostics of why retrieving rows has gotten “slow” and an analysis can be done to see if improvement is likely, the test usually takes less time and is an actual maximum.)  

Then run the select on the sample destination, forcing the index access if the existing plan uses the index.  

That, apart from the perishability of this POSSIBLE solution, will tell you if it is a POSSIBLE solution.  

IF that “works” and IF you judge your data update pattern is favorable to maintaining order, fewest blackness, and minimal row migration,

(read all those Ifs and someone please chime in if I missed an important one)  

THEN copying your data out periodically might be a solution.  

Further, if you are partitioned by something that correlates with row creation date and end of row updateability, then once you have all the old data copied into an efficient to access block storage, in the future you might maintain it by merely using the partition exchange strategy on partitions that recently have become quiescent.  

Find Tim Gorman’s paper or another good source on how to do this effectively. (I personally consider Tim’s “Scaling to Infinity” paper variants RDBMS Hall of Fame quality that should be in any “Thinking Clearly” library along with Cary’s stuff.)  

Reviewing:  

  1. It’s pretty easy to copy a usefully large but small enough to finish reasonably quickly subset to see if this can work at all.
  2. Some data insert and update patterns quickly destroy the temporary performance boost, making the original effort a waste.
  3. Some data insert and update patterns work nicely to preserve the performance boost.
  4. Even bad patterns IF constrained by time periods in some way AND coupled with matching partitioning can be repaired at low cost on a subset periodic basis such that the management overhead is minimal and worth the expense in query performance.

Good luck. This is one of several ultimate “It Depends” cases. Such efforts can be completely a waste of time and resources or they can be nearly magic.  

mwf      

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Ram Raman Sent: Friday, July 10, 2020 1:11 PM
To: Andy Sayer
Cc: ORACLE-L; Tim Gorman
Subject: Re: Slow create table statement  

Thanks Andy. I am not sure about a covering index as the table has around 50 columns, with 30 of them VARCHAR2(4000)! Not sure if creating a covering index on such a table with 2B rows a good idea.  

I did some tests and as you pointed out the count(*) comes back fast, but when i do a SELECT of all columns instead of count from the big table the query slows down to about 3 hours like the create table. I guess it is time for new hardware.    

On Wed, Jul 8, 2020 at 2:09 AM Andy Sayer <andysayer_at_gmail.com> wrote:

Hi Ram,  

Covering index means that it covers every column that’s required by the query (even the ones in your select). Judging by the cost in the plan, it’s unlikely the index used contained more than one column. Covering indexes remove the need for table access. You would need to weigh up the cost of creating an index the size of the table and the reward (this query gets faster).  

Thanks,

Andrew  

On Wed, 8 Jul 2020 at 03:25, Ram Raman <veeeraman_at_gmail.com> wrote:

Thank you Tim and Andy,  

" besides making a covering index" Andy, there is a covering index which makes the access of CLN_EVNT table by index, per the plan  

On Tue, Jul 7, 2020 at 5:39 PM Andy Sayer <andysayer_at_gmail.com> wrote:

Tim,  

"Load as select" means it's doing direct path load. The additional step of "OPTIMIZER STATISTICS GATHERING" also only appears for direct path load.

We can also tell that the insertion into the table is done in parallel as we are still using PCWP in the IN-OUT column, it only serializes about the load.  

Thanks,

Andy  

On Tue, 7 Jul 2020 at 23:25, Tim Gorman <tim.evdbt_at_gmail.com> wrote:

Ram,

Perhaps it is serializing on the INSERT phase of the CTAS, keeping in mind that the CREATE clause both creates the table and does the insert? Perhaps...

        ALTER SESSION ENABLE PARALLEL DDL;
        CREATE TABLE ... PARALLEL 8 SELECT /*+ full(ce) parallel(ce,8) */ CE.* FROM ... ;

Unless you are required to perform CTAS in one statement, could you perhaps split it into two steps, the first to create the table and the second to populate it? So as an alternative...

        CREATE TABLE .... SELECT * FROM IX.S_CLN_EVNT WHERE 1 = 0;
        ALTER SESSION ENABLE PARALLEL DML;
        INSERT /*+ APPEND PARALLEL(x,8) */ INTO REQ_27333_CLN_EVENT SELECT /*+ full(ce) parallel(ce,8) */ CE.* FROM ... ;

This way you can use the APPEND hint to force direct-path INSERTs which are far more efficient than conventional INSERTs; I'm not sure if LOAD AS SELECT (HYBRID TSM/HWMB)" is direct-path or conventional? At any rate, with an APPEND hint you can be certain. You'll also get to specify the PARALLEL hint on the INSERT statement as well as the SELECT statement, to make sure of PX pipelining. Also, not sure if your session is enabled for parallel DDL or parallel DML, but if you break it into two operations, you will only need parallel DML.

Hope this helps?

-Tim

On 7/7/2020 2:48 PM, Ram Raman wrote:

Hi,  

We are trying to create a table by pulling information from a big table, the create statement is taking too long. However when I run select count(*) with the same query it comes back in seconds. While the CREATE statement is running I see waits on parallel read almost all the time, however, I do not see the same waits during select count(*).

Tried a few variations with parallelism degree in the Create statement. Also tried inserting after truncating the table, used INSERT with APPEND hint, same slow results. v12c. FORCE_LOGGING is not on at DB level, inserted table in nologging mode. During the run, the IO on the data disks is about 50 to 70% util. 2 cores are consistently at 99-100% (most of them are %iowait at ~95%) with the other cores idle during this run the last time I checked - I only checked the individual core usage this last time it was run. I was checking average before. The wait events are the same in all cases.  

The attached link <https://drive.google.com/file/d/1N8re2eAWLl0FHYdGs0c9I93e5ruoSNN6/view?usp=sharing> has all the commands and the information. Any help is appreciated.  

Thanks,

Ram.  

--
     

--
     

--
   

--

http://www.freelists.org/webpage/oracle-l Received on Fri Jul 10 2020 - 20:21:36 CEST

Original text of this message