Re: Slow create table statement

From: Sayan Malakshinov <xt.and.r_at_gmail.com>
Date: Sat, 11 Jul 2020 11:43:16 +0300
Message-ID: <CAOVevU6=WjCWEzHaFuvGbZMbckwtoF8k33dq_-KPXoWNe3ry+Q_at_mail.gmail.com>



It's hard to advise without knowing the real physical distribution, but with old disks I'd also check one more approach:

create table REQ_27333_CLN_EVNT NOLOGGING as
with inc as (
  select /*+ no_merge */ *
  from (
    select /*+ parallel(ce, 8) no_eliminate_oby */
           rowid as rid
    from FIX.S_CLN_EVNT ce
    where exists (select 1 from DO2.REQ_27333_DX_ENCNTR i
                  where ce.PERSON_SK = i.PERSON_SK)
    order by rid
  )
  union all
  select null from dual where 1=0 -- fix for parallel; may no longer be required
)
select /*+ PARALLEL(i, 8) */ ce.*
from FIX.S_CLN_EVNT ce, inc i
where ce.rowid = i.rid
/
If all 14M required rows (out of 1.7bn) are located in a small fraction of the table's blocks, there is a good chance that sorted access by rowid will be faster.
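
One way to sanity-check that assumption first, as a rough sketch (dbms_rowid is standard, but the query shape here is only an illustration, and the check is itself a full scan):

-- count distinct blocks holding the required rows; the data object
-- number is included so that hash partitions don't collide
select count(distinct dbms_rowid.rowid_object(ce.rowid)       || '.' ||
                      dbms_rowid.rowid_relative_fno(ce.rowid) || '.' ||
                      dbms_rowid.rowid_block_number(ce.rowid)) as blocks_touched
from FIX.S_CLN_EVNT ce
where exists (select 1 from DO2.REQ_27333_DX_ENCNTR i
              where ce.PERSON_SK = i.PERSON_SK);

Comparing blocks_touched against the ~40M blocks in the table gives a feel for how clustered the target rows really are.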

On Sat, Jul 11, 2020 at 11:09 AM Lothar Flatz <l.flatz_at_bluewin.ch> wrote:

> On the previous estimate (the NL one) I would check the high and low values
> for the join key PERSON_SK.
> Is this a growing value? What is the meaning behind table
> REQ_27333_DX_ENCNTR?
> I wonder if the time difference between the stats gatherings makes a
> difference to the join estimate.
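>
> A sketch of that check (it assumes PERSON_SK is a NUMBER column; the raw
> high/low values need decoding):
>
> select column_name, num_distinct, last_analyzed,
>        utl_raw.cast_to_number(low_value)  as low_val,   -- assumes NUMBER
>        utl_raw.cast_to_number(high_value) as high_val
> from dba_tab_col_statistics
> where owner = 'FIX'
>   and table_name = 'S_CLN_EVNT'
>   and column_name = 'PERSON_SK';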
>
> Regards
>
> Lothar
>
> On 11.07.2020 at 09:24, Jonathan Lewis wrote:
>
>
> Ram,
>
> I see the plan did a broadcast - which was necessary for the best
> performance - but I was a little surprised to see that it didn't do a
> partition iterator for the parallelism on the big table, given that the
> degree was 8 and the number of partitions was 32 (viz. a multiple of 8).
> However, that probably wouldn't have made very much difference. The
> minimal improvement at the higher degree was probably because DOP 32
> pushed you into queueing at the disc; it's counter-intuitive, but you
> might find that 16 (e.g.) is better than 32.
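>
> A minimal sketch of that experiment (table names as used elsewhere in this
> thread; the DOP is the only thing being varied):
>
> create table REQ_27333_CLN_EVNT nologging parallel 16 as
> select /*+ parallel(ce, 16) */ ce.*
> from FIX.S_CLN_EVNT ce
> where exists (select 1 from DO2.REQ_27333_DX_ENCNTR i
>               where ce.PERSON_SK = i.PERSON_SK);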
>
> It looks as if someone, or something, may have gathered some stats since
> the nested loop run. The cardinality estimate for the hash join is now 15M
> when the estimate for the nested loop was 2.8M: if the stats hadn't changed,
> the cardinality estimate wouldn't have changed; and a factor of 5 on the NL
> cost might have been enough to push it into a hash join automatically.
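>
> A quick way to verify when the stats were last gathered (a sketch; the
> owners and table names are taken from the rest of the thread):
>
> select owner, table_name, partition_name, last_analyzed
> from dba_tab_statistics
> where (owner, table_name) in (('FIX', 'S_CLN_EVNT'),
>                               ('DO2', 'REQ_27333_DX_ENCNTR'))
> order by last_analyzed;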
>
> Regards
> Jonathan Lewis
>
>
>
> On Sat, Jul 11, 2020 at 2:14 AM Ram Raman <veeeraman_at_gmail.com> wrote:
>
>> Jonathan, I hinted to use a HASH join, and as you said it was a lot faster -
>> it returns in 15 mins overall. I tried increasing the parallelism up to 32,
>> but I only gained 2 mins with the higher degrees.
>> Here is the same document with the new statement towards the end:
>> https://drive.google.com/file/d/1hIUHbVfBxsvJz5zC3g5ZZGRg99MPz6p7/view?usp=sharing
>>
>>
>> Now I need to find out why it was not using the HASH join in the first place
>> without the hint, and also see if I can make it even faster.
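>>
>> One way to start on that, as a sketch (the sql_id below is a hypothetical
>> placeholder; ALLSTATS LAST needs statistics_level=ALL or a
>> /*+ gather_plan_statistics */ hint on the monitored run):
>>
>> select *
>> from table(dbms_xplan.display_cursor('0gabc123xyz4d', null, 'ALLSTATS LAST'));
>>
>> Comparing E-Rows with A-Rows in that output shows where the estimate went
>> wrong.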
>>
>> On Fri, Jul 10, 2020 at 4:57 PM Ram Raman <veeeraman_at_gmail.com> wrote:
>>
>>> Thanks Jonathan. There is no compression, but many columns don't have
>>> values in them. I am a bit puzzled by the naming convention, but that is a
>>> different topic.
>>>
>>> What you said makes good sense. It doesn't make sense to access them by
>>> nested loops. Here are the big table stats:
>>>
>>>
>>>       NUM_ROWS       BLOCKS LAST_ANAL AVG_ROW_LEN  CHAIN_CNT
>>> -------------- ------------ --------- ----------- ----------
>>>  1,738,075,641   40,228,648 24-JUN-20         371          0
>>>
>>> Only one block size for the whole DB:
>>>
>>> SQL> sho parameter db_block_size
>>>
>>> NAME                                 TYPE        VALUE
>>> ------------------------------------ ----------- ------------------------------
>>> db_block_size                        integer     8192
>>>
>>> Partitions stats for the big table:
>>>
>>> PARTITION SUBPARTIT PARTITION_COUNT STATUS
>>> --------- --------- --------------- --------
>>> HASH      NONE                   32 VALID
>>>
>>> I did some calculations myself. If the query has to retrieve 14M rows
>>> from the big table, given that the average row length is 371 bytes and
>>> hence there are about 22 rows per block, it has to visit about 634K
>>> blocks. It does not make sense to do a nested loop join for that. Let me
>>> try a HASH join hint.
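>>>
>>> The arithmetic, spelled out as a query (a sketch using the stats above):
>>>
>>> select round(8192 / 371)        as rows_per_block,   -- ~22 rows per 8K block
>>>        round(14e6 * 371 / 8192) as blocks_to_visit   -- ~634K blocks
>>> from dual;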
>>>
>>> Thanks
>>> Ram.
>>>
>>>
>>>
>

-- 
Best regards,
Sayan Malakshinov
Oracle performance tuning engineer
Oracle ACE Associate
http://orasql.org

--
http://www.freelists.org/webpage/oracle-l
Received on Sat Jul 11 2020 - 10:43:16 CEST
