RE: index with very high LIO

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Tue, 22 Nov 2011 21:43:11 -0500
Message-ID: <080801cca989$a4bfaae0$ee3f00a0$_at_rsiz.com>



Obviously I did not make myself clear. In the case where an FTS wins, a partial FTS should win even bigger.

Omitting the potentially large gaps in an FTS (which the ranges of blocks [not rows] required can be gotten from the index) gives you a less expensive FTS. So a partial FTS would win whenever the cost of determining the block ranges required is less than the cost saved by being a smaller FTS. A similar code path could potentially mitigate the empty front problem as well for non-index related FTS. (In case the "empty front problem" is in any way mysterious that is when completely empty blocks near the beginning of the extent map remain completely empty for long periods due to either being down the freelist or ASSM bitmap allocation stack or because new rows only come in via direct load high or there simply are no new rows coming in after a big delete for long enough to matter. Just as you can construct a block map from an index, an empty block map could be maintained by last row deleted below the high water mark records the row in the map and if you're the first row in an old block you take the row out. Then any old FTS could skip "large enough to yield a benefit" empty block ranges. This would be trickier but more useful than the easier fix of recording a low water mark [which also is apparently insufficiently appealing to implement.])

You're right that if the real problem on this system is that data is moved slowly you want to move less data. This would only win when the FTS is moving less data in the first place.

I don't get where you translated block ranges into rowid access from what I wrote, but just to be clear: I'm not suggesting that at all. I'm suggesting that in many cases significant numbers of block ranges don't need to be visited at all by the FTS.

Regards,

mwf

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Greg Rahn
Sent: Tuesday, November 22, 2011 4:33 PM To: Mark W. Farnham
Cc: oracle-l_at_freelists.org
Subject: Re: index with very high LIO

On Tue, Nov 22, 2011 at 7:15 AM, Mark W. Farnham <mwf_at_rsiz.com> wrote:
> It is a pity that there seems to be no way to use an index to grab
> block segments of a table to fts (and yes, then refilter the rows).
>

How does that help here? The system is limited by data movement between strorage and host so retrieving more data would not seem to add any benefit to me. Having just 35MB/s of bandwidth, sending any extra data seems counter productive -- even it it means getting the filter IOPS bound (which I doubt it is).

Just a thought. Might be more effort than value, but I think in the long
> haul everything just gets too big for a full FTS, even in parallel.
> Heh, the software effort would have been more defensible 20 years ago.
>

I'd say it's quite the opposite -- it is faster and more efficient to leverage a parallel sequential scan and filter out rows when it's a "many rows" problem than to get 1 block at a time for a "many rows" problem -- especially if your storage is rotating rust. Access via rowid is great for a "few rows" problem and "few" has an absolute ceiling IMO. This is the same reason that star transformation using bitmap indexes works great if the number of rowid gets is few, but once you cross into the 1 to 5 million range it degrades and a FTS with the appropriate hardware will beat it out. The caveat here is that FTS does not mean read ever single row blindly -- it means encompassing features like partitioning or storage index type functionality that leverages ordering/clustering.

The high level challenge here that I see is the client trying to fit the foot to the shoe, not the shoe to the foot which is resulting in some edge case engineering. As one of my university CS professors often said, "just because it works, doesn't mean it's right."

-- 
Regards,
Greg Rahn  |  blog <http://structureddata.org>  |
twitter<http://twitter.com/GregRahn>  |
 linkedin <http://www.linkedin.com/in/gregrahn>


--
http://www.freelists.org/webpage/oracle-l


--
http://www.freelists.org/webpage/oracle-l
Received on Tue Nov 22 2011 - 20:43:11 CST

Original text of this message