Re: Slow create table statement

From: Andy Sayer <andysayer_at_gmail.com>
Date: Tue, 14 Jul 2020 19:55:32 +0100
Message-ID: <CACj1VR5kN83CKKBMR74+uxtnto7dkh078Q5sFjHyDGcSq6DqZw_at_mail.gmail.com>



No.
Your query can't take advantage of your partitioning here; the only difference would be that more partitions get read. Oracle can already parallelise a full table scan no matter how many partitions there are.

If you did something like compress all but one partition then it could help. But it’s unlikely your hash partitioned table has a workload such that it makes sense for some partitions to be compressed.
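
For illustration, a single-partition compression move would look something like this (the partition name here is made up - hash partitions usually get system-generated names):

-- hypothetical: compress one cold partition in place, keeping indexes usable
alter table FIX.S_CLN_EVNT move partition SYS_P101 compress update indexes;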

Thanks,
Andy

On Tue, 14 Jul 2020 at 18:34, Ram Raman <veeeraman_at_gmail.com> wrote:

> Just a question. Would spreading the 2B rows across 64 partitions make a
> difference than keeping in 32 partitions?
>
> On Mon, Jul 13, 2020 at 9:00 PM Ram Raman <veeeraman_at_gmail.com> wrote:
>
>> Thanks everyone who responded. Jonathan, if I remove the use_hash hint,
>> it falls back to doing nested loops on the big table, slowing things
>> down. I added the hint back and the performance was in the 15-20 min
>> range.
>>
>> Thanks Lothar and Sayan as well. I will check them out.
>>
>> On Sat, Jul 11, 2020 at 3:47 AM Sayan Malakshinov <xt.and.r_at_gmail.com>
>> wrote:
>>
>>> It's hard to advise without knowing the real physical distribution, but
>>> with old disks I'd also check one more approach:
>>>
>>> create table REQ_27333_CLN_EVNT NOLOGGING as
>>> with inc as (
>>>   select /*+ no_merge */ *
>>>   from (
>>>     select /*+ parallel(ce, 8) no_eliminate_oby */
>>>            rowid as rid
>>>     from FIX.S_CLN_EVNT ce
>>>     where exists (select 1 from DO2.REQ_27333_DX_ENCNTR i
>>>                   where ce.PERSON_SK = i.PERSON_SK)
>>>     order by rid
>>>   )
>>>   union all select null from dual where 1=0 -- fix for parallel; may not be required now
>>> )
>>> select /*+ PARALLEL(i, 8) */ ce.*
>>> from FIX.S_CLN_EVNT ce, inc i
>>> where ce.rowid = i.rid
>>> /
>>> If the required 14M rows (out of 1.7bn) are concentrated in a small
>>> fraction of the table's blocks, there is a good chance that sorted
>>> access by rowid would be faster.
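>>> One way to sanity-check that assumption before building the table (a
>>> sketch - it scans the table itself, so only worth running once):
>>>
>>> select count(distinct dbms_rowid.rowid_relative_fno(ce.rowid) || '.' ||
>>>                       dbms_rowid.rowid_block_number(ce.rowid)) as blocks_hit,
>>>        count(*) as rows_hit
>>> from FIX.S_CLN_EVNT ce
>>> where exists (select 1 from DO2.REQ_27333_DX_ENCNTR i
>>>               where ce.PERSON_SK = i.PERSON_SK);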
>>>
>>> On Sat, Jul 11, 2020 at 11:09 AM Lothar Flatz <l.flatz_at_bluewin.ch>
>>> wrote:
>>>
>>>> For the previous (NL) estimate I would check the high and low values
>>>> for the join key PERSON_SK.
>>>> Is this a growing value? What is the meaning behind the table
>>>> REQ_27333_DX_ENCNTR?
>>>> I wonder if the time difference between the stats gatherings makes a
>>>> difference to the join estimate.
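>>>> Something like this would show the extremes and the gather times
>>>> (assuming PERSON_SK is a NUMBER - the cast won't work otherwise):
>>>>
>>>> select table_name, column_name,
>>>>        utl_raw.cast_to_number(low_value)  as low_val,
>>>>        utl_raw.cast_to_number(high_value) as high_val,
>>>>        last_analyzed
>>>> from dba_tab_col_statistics
>>>> where column_name = 'PERSON_SK'
>>>>   and table_name in ('S_CLN_EVNT', 'REQ_27333_DX_ENCNTR');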
>>>>
>>>> Regards
>>>>
>>>> Lothar
>>>>
>>>> Am 11.07.2020 um 09:24 schrieb Jonathan Lewis:
>>>>
>>>>
>>>> Ram,
>>>>
>>>> I see the plan did a broadcast - which was necessary for the best
>>>> performance - but I was a little surprised that it didn't use a
>>>> partition iterator for the parallelism on the big table, given that the
>>>> degree was 8 and the number of partitions was 32 (viz: a multiple of 8).
>>>> However, that probably wouldn't have made very much difference. The
>>>> minimal improvement at the higher degree is probably because DOP 32
>>>> pushed you into queueing at the disc; it's counter-intuitive, but you
>>>> might find that 16 (e.g.) is better than 32.
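>>>> If you're licensed for the Diagnostics Pack, a rough way to see
>>>> whether the extra degree just turned into I/O waits is something like:
>>>>
>>>> select event, count(*)
>>>> from v$active_session_history
>>>> where sample_time > sysdate - interval '30' minute
>>>>   and session_type = 'FOREGROUND'
>>>> group by event
>>>> order by count(*) desc;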
>>>>
>>>> It looks as if someone, or something, may have gathered some stats
>>>> since the nested loop run. The cardinality estimate for the hash join
>>>> is now 15M where the estimate for the nested loop was 2.8M: if the
>>>> stats hadn't changed, the cardinality estimate wouldn't have changed;
>>>> and a factor of 5 on the NL cost might have been enough to push it into
>>>> a hash join automatically.
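>>>> That's easy enough to check, e.g.:
>>>>
>>>> select owner, table_name, partition_name, last_analyzed
>>>> from dba_tab_statistics
>>>> where table_name in ('S_CLN_EVNT', 'REQ_27333_DX_ENCNTR')
>>>> order by last_analyzed;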
>>>>
>>>> Regards
>>>> Jonathan Lewis
>>>>
>>>>
>>>>
>>>> On Sat, Jul 11, 2020 at 2:14 AM Ram Raman <veeeraman_at_gmail.com> wrote:
>>>>
>>>>> Jonathan, I hinted the query to use a hash join and, as you said, it
>>>>> was a lot faster - it returns in 15 mins overall. I tried increasing
>>>>> the parallelism up to 32, but I only gained 2 mins with the higher
>>>>> degrees.
>>>>> Here is the same document with the new statement towards the end:
>>>>> https://drive.google.com/file/d/1hIUHbVfBxsvJz5zC3g5ZZGRg99MPz6p7/view?usp=sharing
>>>>>
>>>>>
>>>>> Now I need to find out why it was not using the hash join in the
>>>>> first place, without the hint, and also see if I can make it even
>>>>> faster.
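>>>>> I plan to start by pulling the unhinted plan and its estimates,
>>>>> something like this (query shape borrowed from Sayan's version):
>>>>>
>>>>> explain plan for
>>>>> select ce.*
>>>>> from FIX.S_CLN_EVNT ce
>>>>> where exists (select 1 from DO2.REQ_27333_DX_ENCNTR i
>>>>>               where ce.PERSON_SK = i.PERSON_SK);
>>>>>
>>>>> select * from table(dbms_xplan.display);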
>>>>>
>>>>> On Fri, Jul 10, 2020 at 4:57 PM Ram Raman <veeeraman_at_gmail.com> wrote:
>>>>>
>>>>>> Thanks Jonathan. There is no compression, but many columns don't
>>>>>> have values in them. I am a bit puzzled by the naming convention, but
>>>>>> that is a different topic.
>>>>>>
>>>>>> What you said makes good sense. It doesn't make sense to access them
>>>>>> by nested loops. Here are the big table stats:
>>>>>>
>>>>>>
>>>>>>          NUM_ROWS       BLOCKS LAST_ANAL AVG_ROW_LEN  CHAIN_CNT
>>>>>> ----------------- ------------ --------- ----------- ----------
>>>>>>     1,738,075,641   40,228,648 24-JUN-20         371          0
>>>>>>
>>>>>> Only one block size for the whole DB:
>>>>>>
>>>>>> SQL> sho parameter db_block_size
>>>>>>
>>>>>> NAME                                 TYPE        VALUE
>>>>>> ------------------------------------ ----------- --------
>>>>>> db_block_size                        integer     8192
>>>>>>
>>>>>> Partitions stats for the big table:
>>>>>>
>>>>>> PARTITION SUBPARTIT PARTITION_COUNT STATUS
>>>>>> --------- --------- --------------- --------
>>>>>> HASH NONE 32 VALID
>>>>>>
>>>>>> I did some calculations myself. If the query has to retrieve 14M
>>>>>> rows from the big table, then given that the average row length is
>>>>>> 371, and hence about 22 rows per block, it has to visit roughly 634K
>>>>>> blocks. It does not make sense to do a nested loop join for that. Let
>>>>>> me try a hash join hint.
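>>>>>> For the record, the arithmetic as a query:
>>>>>>
>>>>>> -- ~22 rows per 8K block (8192/371), so ~634K blocks for 14M rows
>>>>>> select round(14000000 * 371 / 8192) as blocks_to_visit from dual;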
>>>>>>
>>>>>> Thanks
>>>>>> Ram.
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>> --
>>> Best regards,
>>> Sayan Malakshinov
>>> Oracle performance tuning engineer
>>> Oracle ACE Associate
>>> http://orasql.org
>>>
>>
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Jul 14 2020 - 20:55:32 CEST
