Re: Full table scan -- uniform extent allocation - extent allocation map issue?

From: Vishnu Potukanuma <vishnupotukanuma_at_gmail.com>
Date: Wed, 27 Nov 2019 02:10:37 +0530
Message-ID: <CAP-RywzvOV8Khxpd4s12gqeu94zviWj9KD978ugjLFDS_cqbLQ_at_mail.gmail.com>





Hi Clay,

what you have said is true... just curios about little things and looking at every single possibility....

Thanks,
Vishnu

On Tue, Nov 26, 2019 at 10:51 PM Clay Jackson (cjackson) < Clay.Jackson_at_quest.com> wrote:

> Some random thoughts –
>
>
>
> As other folks have already pointed out, there are several things that
> seem to make this a bit of an “edge case” - Uneven segment sizes and large
> volume of allocation and deallocation. At a recent User Group, one of the
> Oracle PMs offered the thought that one of Oracle’s greatest challenges
> these days was “making then best of really bad situations”. I suspect
> that’s true in this case.
>
>
>
> Consider – doing a full table scan is already a “bad situation”; except in
> OTHER edge cases, like the classic case of a very small, frequently
> accessed table that can be kept entirely in memory. As the original
> writer suggests, the “name of the game” is minimizing “work”. So, given
> that, I think Oracle has “done the right thing” by simply reading blocks in
> allocation order (close to a sequential scan). Doing anything else, like
> trying to read in “L1” order, is ALWAYs going to imply EXTRA work.
>
>
>
> So, IMHO, Oracle’s got it right here.
>
>
>
>
>
> *Clay Jackson*
>
> Database Solutions Sales Engineer
>
> clay.jackson_at_quest.com
>
> *office* 949-754-1203 *mobile* 425-802-9603
>
>
>
> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> *On
> Behalf Of *Vishnu Potukanuma
> *Sent:* Tuesday, November 26, 2019 8:30 AM
> *To:* jonathan_at_jlcomp.demon.co.uk
> *Cc:* Oracle L <oracle-l_at_freelists.org>
> *Subject:* Re: Full table scan -- uniform extent allocation - extent
> allocation map issue?
>
>
>
> *CAUTION:* This email originated from outside of the organization. Do not
> follow guidance, click links, or open attachments unless you recognize the
> sender and know the content is safe.
>
>
>
>
>
> Hi Jonathan,
>
>
>
> The trace file indicate for autoallocate - yes.. i must have pasted the
> old one when i was testing auto allocate (in auto allocate this issue
> diminishes as the extent sizes become large)
>
>
>
> Oracle allocates extents to a segment at any allocation in the data file
> (uniform size) when the demand for space arises which is logical thing but
> as oracle reads extents in the order they are allocated during the full
> table scans rather than where they are stored, the IO becomes random rather
> than sequential given time when extent sizes are low and the workload
> mainly contains frequent segment allocation and deallocation and that if
> other sessions doing index range scans saturate IOPS at the storage layer
> especially HDDs as these are random in nature, reduces the
> overall throughput of the full table scan...
>
>
>
> I mean when the full table scans are slow, usually we end up looking at
> various things, but could be a factor in very rare cases (OLTP) and could
> be common in warehouses where segments are frequently dropped or created,
> and this random extent allocation can grow right under our nose and we
> never detect it. There are various other scenarios, edge cases i am looking
> at, but i am really interested as to why a session blindly reads the
> extents in order they are allocated rather than in the order they are
> stored based on L1DBA.
>
>
>
> Please give me sometime (have to get dinner before its too late), i will
> send the complete test case with which this thing can be replicated.
>
>
>
> Thanks,
>
> Vishnu
>
>
>
>
>
> On Tue, Nov 26, 2019 at 8:58 PM Jonathan Lewis <
> jonathan_at_jlcomp.demon.co.uk> wrote:
>
> insert into table10 select * From table1;
> drop table1;
>
>
> --- create a tablespace with 350mb in size and uniform extent allocation,
>
> The extent information you've supplied shows us that you've created the
> final table in a tablespace using autoallocate - the early extents are
> 64KB, the next 1MB, the last few you've shown are 8MB.
>
> If you want to supply a test case that has a chance of reproducing then
> you need to tell us WHAT size uniform size, preferably giving us a script
> that allows us change a couple of names (tablespace / file) and create the
> tablespace.
>
> -- create 10 tables with 30 mb in size.
>
> Needs one sample Create table script - did you create it with "initial
> 30M" (obviously not); did you create it with "create as select ...", did
> you do anything odd with pctfree; how did you populate it. Do you need the
> tables to allocate 30MB, or should the highwater mark be pushed to 30MB as
> well.
>
> -- insert into table10 select * From table1;
>
> That's probably going to work - because we'd probably assume you wanted
> all 10 tables to look alike.
>
> -- drop table1;
>
> That should work, but do you have your recyclebin enabled - that may
> affect where the next insert puts the data (especially if some doesn't
> follow your instruction that the tablespace should be 350MB).
>
> -- similarly perform the same randomly
>
> If my results don't match your results, might that be because your results
> are highlighting a flaw in your system, or because I didn't choose the same
> random order as you did ? Don't supply a "random" test if you want to
> demonstrate a reproducible anomaly. Tell us the exact order you used to
> insert and drop.
>
>
> -- interestingly the fix is to read the extents one after the other based
> on the L1 DBA rather than the order in which they are allocated to the
> segment
>
> Are you saying you think this is what Oracle does, or what you think
> Oracle should do ?
>
> Regards
> Jonathan Lewis
>
>
>
>
>
>
>
>
>
>
>
>
>
> ________________________________________
> From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> on
> behalf of Vishnu Potukanuma <vishnupotukanuma_at_gmail.com>
> Sent: 26 November 2019 14:12
> To: Oracle L
> Subject: Full table scan -- uniform extent allocation - extent allocation
> map issue?
>
> Hi,
>
> I have observed an interesting behaviour today, while working on other
> aspects and my hunch appears to be true ,,, I will start by saying that
> there are too many variables and scenarios where things can be optimal but
> i am talking about the worst case scenario where the segments are
> frequently created and dropped in a tablespace.
>
> This works by exploiting certain aspects of the database such as how
> oracle maintains extent map in the segment header and how the extent
> allocation is done or how the full table scan reads the segment. Inherently
> in this case there is nothing we can do as there are too many variables.
> The impact may be small in terms of the overall response times but this
> typically reduces the lifespan of the disks as the disk spindles ends up
> moving from higher to lower tracks and vice versa over and over again...
> instead of just in a simple arc from higher tracks to lower tracks. in
> worst case this could add delay when the segment is very large and if the
> extent sizes are small (this probability of this occuring is very less but
> there is still a chance).
>
> when it comes to oracle while performing full table scans, the process
> starts by reading the extent map and starts reading extents one by one
> until the last extent or high water mark is reached... but what if the
> extents allocated to the segment are stored in random areas just like
> clustering_factor? interestingly the fix is to read the extents one after
> the other based on the L1 DBA rather than the order in which they are
> allocated to the segment, in most cases, oracle does a pretty good job
> allocating extents close to each other to the segment, but things can get
> messy.
>
> this is the simplest way to explain and this goes gradually from here and
> we can make however we like it.
> create a tablespace with 350mb in size and uniform extent allocation,
> create 10 tables with 30 mb in size.
> insert into table10 select * From table1;
> drop table1;
> similarly perform the same randomly so that the final end result is only
> one table. here we are just simulating a case where the subsequent extents
> allocated to the segment are on different regions randomly in a datafile,
> thus their locations are on the disk as well (which could be or could not
> be based on several factors such as how fragmented the disks are etc, i
> wont go in that space as it adds too much complexity).. here we perform a
> full table scan, oracle reads the segment header to read the extent map and
> reads the extents one by one...
>
>
> Notice the block numbers in the traces.
> WAIT #139711646963136: nam='db file sequential read' ela= 612 file#=5
> block#=71810 blocks=1 obj#=75252 tim=248445129880
> WAIT #139711646963136: nam='db file scattered read' ela= 763 file#=5
> block#=71811 blocks=5 obj#=75252 tim=248445130757
> WAIT #139711646963136: nam='db file scattered read' ela= 771 file#=5
> block#=71816 blocks=8 obj#=75252 tim=248445131732
> ....
> ....
> WAIT #139711646963136: nam='db file scattered read' ela= 1941 file#=5
> block#=44736 blocks=64 obj#=75252 tim=248445270874
> WAIT #139711646963136: nam='db file scattered read' ela= 1788 file#=5
> block#=44802 blocks=62 obj#=75252 tim=248445273696
> WAIT #139711646963136: nam='db file scattered read' ela= 1902 file#=5
> block#=44864 blocks=64 obj#=75252 tim=248445276573
> WAIT #139711646963136: nam='db file scattered read' ela= 1756 file#=5
> block#=44930 blocks=62 obj#=75252 tim=248445279335
> ....
>
> WAIT #139711646963136: nam='db file scattered read' ela= 3078 file#=5
> block#=132 blocks=128 obj#=75252 tim=248445887737
> WAIT #139711646963136: nam='db file scattered read' ela= 3045 file#=5
> block#=260 blocks=124 obj#=75252 tim=248445893065
> WAIT #139711646963136: nam='db file scattered read' ela= 3399 file#=5
> block#=384 blocks=128 obj#=75252 tim=248445898641
> ....
> ...
> WAIT #139711646963136: nam='db file scattered read' ela= 12269 file#=5
> block#=97540 blocks=124 obj#=75252 tim=248446182677
> WAIT #139711646963136: nam='db file scattered read' ela= 17440 file#=5
> block#=97664 blocks=128 obj#=75252 tim=248446202405
> WAIT #139711646963136: nam='db file scattered read' ela= 3282 file#=5
> block#=97792 blocks=128 obj#=75252 tim=248446208027
> ...
> ...
> WAIT #139711646963136: nam='db file scattered read' ela= 3279 file#=5
> block#=61058 blocks=128 obj#=75252 tim=248446281696
> WAIT #139711646963136: nam='db file scattered read' ela= 3115 file#=5
> block#=61186 blocks=126 obj#=75252 tim=248446287141
> WAIT #139711646963136: nam='db file scattered read' ela= 2990 file#=5
> block#=61312 blocks=128 obj#=75252 tim=248446292359
> WAIT #139711646963136: nam='db file scattered read' ela= 3109 file#=5
> block#=61440 blocks=128 obj#=75252 tim=248446297711
> ....
> ...
>
> WAIT #139711646963136: nam='db file scattered read' ela= 3125 file#=5
> block#=7812 blocks=128 obj#=75252 tim=248446303109
> WAIT #139711646963136: nam='db file scattered read' ela= 3246 file#=5
> block#=7940 blocks=124 obj#=75252 tim=248446309728
> WAIT #139711646963136: nam='db file scattered read' ela= 3180 file#=5
> block#=8064 blocks=128 obj#=75252 tim=248446315169
>
> the L1 DBA extent map:
> Auxillary Map
> --------------------------------------------------------
> Extent 0 : L1 dba: 0x01411880 Data dba: 0x01411883
> Extent 1 : L1 dba: 0x01411880 Data dba: 0x01411888
> ....
> Extent 35 : L1 dba: 0x0140ae80 Data dba: 0x0140ae82
> Extent 36 : L1 dba: 0x0140af00 Data dba: 0x0140af02
> Extent 37 : L1 dba: 0x0140af80 Data dba: 0x0140af82
> ...
> Extent 38 : L1 dba: 0x0140b000 Data dba: 0x0140b002
> Extent 39 : L1 dba: 0x0140b080 Data dba: 0x0140b082
> Extent 40 : L1 dba: 0x0140b100 Data dba: 0x0140b102
> Extent 41 : L1 dba: 0x0140b180 Data dba: 0x0140b182
> Extent 42 : L1 dba: 0x0140b200 Data dba: 0x0140b202
> Extent 43 : L1 dba: 0x0140b280 Data dba: 0x0140b282
> ...
> ...
> Extent 92 : L1 dba: 0x01417c80 Data dba: 0x01417c84
> Extent 93 : L1 dba: 0x01418080 Data dba: 0x01418084
> ...
> Extent 94 : L1 dba: 0x0140ee80 Data dba: 0x0140ee82
> Extent 95 : L1 dba: 0x01401e80 Data dba: 0x01401e84
> Extent 96 : L1 dba: 0x01402280 Data dba: 0x01402284
> Extent 97 : L1 dba: 0x01402680 Data dba: 0x01402684
> Extent 98 : L1 dba: 0x01402a80 Data dba: 0x01402a84
>
> When it comes to index range scan things are as expected, as the rowids
> are sorted it reads all the blocks in one arc (i am not talking about
> prefetching, or batched (db file parallel read which exploits IOPS at a
> storage layer but a very simplistic case with all prefetching etc turned
> off) but for full table scans things appear to be different as it simply
> following the extent map and not in an order of their L1 DBA,
>
> maybe this can be explained by something, but I am still not be able to
> figure out any other valid reason as to why we should not perform a full
> table scan based on the storage order extent map with the exception of last
> extent. Please let me know if I have missed anything?
>
> Thanks,
> Vishnu
> --
> http://www.freelists.org/webpage/oracle-l
> <https://nam05.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.freelists.org%2Fwebpage%2Foracle-l&data=02%7C01%7Cclay.jackson%40quest.com%7C5f05abd5c0894eb59b0108d7728e1318%7C91c369b51c9e439c989c1867ec606603%7C0%7C0%7C637103826849761206&sdata=gFpqKs1cKbH7KPEQPvX41RzD%2BtvxBcqpDXhH6wy%2BK3w%3D&reserved=0>
>
>



--
http://www.freelists.org/webpage/oracle-l


Received on Tue Nov 26 2019 - 21:40:37 CET

Original text of this message