Re: Full table scan -- uniform extent allocation - extent allocation map issue?

From: Vishnu Potukanuma <vishnupotukanuma_at_gmail.com>
Date: Tue, 26 Nov 2019 22:00:22 +0530
Message-ID: <CAP-Rywxjaneht=AQRN5OwFXkVw4SSSGHgt2MBuDebdF45RWN4g_at_mail.gmail.com>



Hi Jonathan,

The trace file indicate for autoallocate - yes.. i must have pasted the old one when i was testing auto allocate (in auto allocate this issue diminishes as the extent sizes become large)

Oracle allocates extents to a segment at any allocation in the data file (uniform size) when the demand for space arises which is logical thing but as oracle reads extents in the order they are allocated during the full table scans rather than where they are stored, the IO becomes random rather than sequential given time when extent sizes are low and the workload mainly contains frequent segment allocation and deallocation and that if other sessions doing index range scans saturate IOPS at the storage layer especially HDDs as these are random in nature, reduces the overall throughput of the full table scan...

I mean when the full table scans are slow, usually we end up looking at various things, but could be a factor in very rare cases (OLTP) and could be common in warehouses where segments are frequently dropped or created, and this random extent allocation can grow right under our nose and we never detect it. There are various other scenarios, edge cases i am looking at, but i am really interested as to why a session blindly reads the extents in order they are allocated rather than in the order they are stored based on L1DBA.

Please give me sometime (have to get dinner before its too late), i will send the complete test case with which this thing can be replicated.

Thanks,
Vishnu

On Tue, Nov 26, 2019 at 8:58 PM Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk> wrote:

> insert into table10 select * From table1;
> drop table1;
>
>
> --- create a tablespace with 350mb in size and uniform extent allocation,
>
> The extent information you've supplied shows us that you've created the
> final table in a tablespace using autoallocate - the early extents are
> 64KB, the next 1MB, the last few you've shown are 8MB.
>
> If you want to supply a test case that has a chance of reproducing then
> you need to tell us WHAT size uniform size, preferably giving us a script
> that allows us change a couple of names (tablespace / file) and create the
> tablespace.
>
> -- create 10 tables with 30 mb in size.
>
> Needs one sample Create table script - did you create it with "initial
> 30M" (obviously not); did you create it with "create as select ...", did
> you do anything odd with pctfree; how did you populate it. Do you need the
> tables to allocate 30MB, or should the highwater mark be pushed to 30MB as
> well.
>
> -- insert into table10 select * From table1;
>
> That's probably going to work - because we'd probably assume you wanted
> all 10 tables to look alike.
>
> -- drop table1;
>
> That should work, but do you have your recyclebin enabled - that may
> affect where the next insert puts the data (especially if some doesn't
> follow your instruction that the tablespace should be 350MB).
>
> -- similarly perform the same randomly
>
> If my results don't match your results, might that be because your results
> are highlighting a flaw in your system, or because I didn't choose the same
> random order as you did ? Don't supply a "random" test if you want to
> demonstrate a reproducible anomaly. Tell us the exact order you used to
> insert and drop.
>
>
> -- interestingly the fix is to read the extents one after the other based
> on the L1 DBA rather than the order in which they are allocated to the
> segment
>
> Are you saying you think this is what Oracle does, or what you think
> Oracle should do ?
>
> Regards
> Jonathan Lewis
>
>
>
>
>
>
>
>
>
>
>
>
>
> ________________________________________
> From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> on
> behalf of Vishnu Potukanuma <vishnupotukanuma_at_gmail.com>
> Sent: 26 November 2019 14:12
> To: Oracle L
> Subject: Full table scan -- uniform extent allocation - extent allocation
> map issue?
>
> Hi,
>
> I have observed an interesting behaviour today, while working on other
> aspects and my hunch appears to be true ,,, I will start by saying that
> there are too many variables and scenarios where things can be optimal but
> i am talking about the worst case scenario where the segments are
> frequently created and dropped in a tablespace.
>
> This works by exploiting certain aspects of the database such as how
> oracle maintains extent map in the segment header and how the extent
> allocation is done or how the full table scan reads the segment. Inherently
> in this case there is nothing we can do as there are too many variables.
> The impact may be small in terms of the overall response times but this
> typically reduces the lifespan of the disks as the disk spindles ends up
> moving from higher to lower tracks and vice versa over and over again...
> instead of just in a simple arc from higher tracks to lower tracks. in
> worst case this could add delay when the segment is very large and if the
> extent sizes are small (this probability of this occuring is very less but
> there is still a chance).
>
> when it comes to oracle while performing full table scans, the process
> starts by reading the extent map and starts reading extents one by one
> until the last extent or high water mark is reached... but what if the
> extents allocated to the segment are stored in random areas just like
> clustering_factor? interestingly the fix is to read the extents one after
> the other based on the L1 DBA rather than the order in which they are
> allocated to the segment, in most cases, oracle does a pretty good job
> allocating extents close to each other to the segment, but things can get
> messy.
>
> this is the simplest way to explain and this goes gradually from here and
> we can make however we like it.
> create a tablespace with 350mb in size and uniform extent allocation,
> create 10 tables with 30 mb in size.
> insert into table10 select * From table1;
> drop table1;
> similarly perform the same randomly so that the final end result is only
> one table. here we are just simulating a case where the subsequent extents
> allocated to the segment are on different regions randomly in a datafile,
> thus their locations are on the disk as well (which could be or could not
> be based on several factors such as how fragmented the disks are etc, i
> wont go in that space as it adds too much complexity).. here we perform a
> full table scan, oracle reads the segment header to read the extent map and
> reads the extents one by one...
>
>
> Notice the block numbers in the traces.
> WAIT #139711646963136: nam='db file sequential read' ela= 612 file#=5
> block#=71810 blocks=1 obj#=75252 tim=248445129880
> WAIT #139711646963136: nam='db file scattered read' ela= 763 file#=5
> block#=71811 blocks=5 obj#=75252 tim=248445130757
> WAIT #139711646963136: nam='db file scattered read' ela= 771 file#=5
> block#=71816 blocks=8 obj#=75252 tim=248445131732
> ....
> ....
> WAIT #139711646963136: nam='db file scattered read' ela= 1941 file#=5
> block#=44736 blocks=64 obj#=75252 tim=248445270874
> WAIT #139711646963136: nam='db file scattered read' ela= 1788 file#=5
> block#=44802 blocks=62 obj#=75252 tim=248445273696
> WAIT #139711646963136: nam='db file scattered read' ela= 1902 file#=5
> block#=44864 blocks=64 obj#=75252 tim=248445276573
> WAIT #139711646963136: nam='db file scattered read' ela= 1756 file#=5
> block#=44930 blocks=62 obj#=75252 tim=248445279335
> ....
>
> WAIT #139711646963136: nam='db file scattered read' ela= 3078 file#=5
> block#=132 blocks=128 obj#=75252 tim=248445887737
> WAIT #139711646963136: nam='db file scattered read' ela= 3045 file#=5
> block#=260 blocks=124 obj#=75252 tim=248445893065
> WAIT #139711646963136: nam='db file scattered read' ela= 3399 file#=5
> block#=384 blocks=128 obj#=75252 tim=248445898641
> ....
> ...
> WAIT #139711646963136: nam='db file scattered read' ela= 12269 file#=5
> block#=97540 blocks=124 obj#=75252 tim=248446182677
> WAIT #139711646963136: nam='db file scattered read' ela= 17440 file#=5
> block#=97664 blocks=128 obj#=75252 tim=248446202405
> WAIT #139711646963136: nam='db file scattered read' ela= 3282 file#=5
> block#=97792 blocks=128 obj#=75252 tim=248446208027
> ...
> ...
> WAIT #139711646963136: nam='db file scattered read' ela= 3279 file#=5
> block#=61058 blocks=128 obj#=75252 tim=248446281696
> WAIT #139711646963136: nam='db file scattered read' ela= 3115 file#=5
> block#=61186 blocks=126 obj#=75252 tim=248446287141
> WAIT #139711646963136: nam='db file scattered read' ela= 2990 file#=5
> block#=61312 blocks=128 obj#=75252 tim=248446292359
> WAIT #139711646963136: nam='db file scattered read' ela= 3109 file#=5
> block#=61440 blocks=128 obj#=75252 tim=248446297711
> ....
> ...
>
> WAIT #139711646963136: nam='db file scattered read' ela= 3125 file#=5
> block#=7812 blocks=128 obj#=75252 tim=248446303109
> WAIT #139711646963136: nam='db file scattered read' ela= 3246 file#=5
> block#=7940 blocks=124 obj#=75252 tim=248446309728
> WAIT #139711646963136: nam='db file scattered read' ela= 3180 file#=5
> block#=8064 blocks=128 obj#=75252 tim=248446315169
>
> the L1 DBA extent map:
> Auxillary Map
> --------------------------------------------------------
> Extent 0 : L1 dba: 0x01411880 Data dba: 0x01411883
> Extent 1 : L1 dba: 0x01411880 Data dba: 0x01411888
> ....
> Extent 35 : L1 dba: 0x0140ae80 Data dba: 0x0140ae82
> Extent 36 : L1 dba: 0x0140af00 Data dba: 0x0140af02
> Extent 37 : L1 dba: 0x0140af80 Data dba: 0x0140af82
> ...
> Extent 38 : L1 dba: 0x0140b000 Data dba: 0x0140b002
> Extent 39 : L1 dba: 0x0140b080 Data dba: 0x0140b082
> Extent 40 : L1 dba: 0x0140b100 Data dba: 0x0140b102
> Extent 41 : L1 dba: 0x0140b180 Data dba: 0x0140b182
> Extent 42 : L1 dba: 0x0140b200 Data dba: 0x0140b202
> Extent 43 : L1 dba: 0x0140b280 Data dba: 0x0140b282
> ...
> ...
> Extent 92 : L1 dba: 0x01417c80 Data dba: 0x01417c84
> Extent 93 : L1 dba: 0x01418080 Data dba: 0x01418084
> ...
> Extent 94 : L1 dba: 0x0140ee80 Data dba: 0x0140ee82
> Extent 95 : L1 dba: 0x01401e80 Data dba: 0x01401e84
> Extent 96 : L1 dba: 0x01402280 Data dba: 0x01402284
> Extent 97 : L1 dba: 0x01402680 Data dba: 0x01402684
> Extent 98 : L1 dba: 0x01402a80 Data dba: 0x01402a84
>
> When it comes to index range scan things are as expected, as the rowids
> are sorted it reads all the blocks in one arc (i am not talking about
> prefetching, or batched (db file parallel read which exploits IOPS at a
> storage layer but a very simplistic case with all prefetching etc turned
> off) but for full table scans things appear to be different as it simply
> following the extent map and not in an order of their L1 DBA,
>
> maybe this can be explained by something, but I am still not be able to
> figure out any other valid reason as to why we should not perform a full
> table scan based on the storage order extent map with the exception of last
> extent. Please let me know if I have missed anything?
>
> Thanks,
> Vishnu
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Nov 26 2019 - 17:30:22 CET

Original text of this message