FW: concepts document part about separating indexes and tablespaces

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Mon, 19 May 2008 09:30:03 -0400
Message-ID: <037101c8b9b4$730ddf80$1100a8c0@rsiz.com>

<snipped> was too big.

From: Mark W. Farnham [mailto:mwf_at_rsiz.com] Sent: Monday, May 19, 2008 9:07 AM
To: 'patrick.roozen_at_gmail.com'; 'mary_mcneely_at_yahoo.com' Cc: 'oracle-l_at_freelists.org'
Subject: RE: concepts document part about separating indexes and tablespaces  

Some flies for your ointment:  

  1. Are all your applications equally important with regard to uptime? If you have one application that is more important to be up, you need to ask yourself whether you can rebuild the indexes for that application quicker than you can restore all the tables and indexes. If you don't know the answer to that question, then you have decided that thinking about improving beyond your mean time to recover media and roll forward is uneconomic for your site for that particular database. That may indeed be a valid point of view. I have yet to be informed of a site that can justify that stance and is also hurt in any way by separation of data from indexes. (Notice I wrote data: that includes all variety of tables (including index oriented TABLEs) and clusters, and user defined objects that are the source of data.
  2. I find it amusing that the single threaded analysis of an index range scan is conflated with systems being multi-user in a clear demonstration that an individual index range scan requiring table access can only be faster in a carefully constructed case such as a table's data being physically ordered in the table by the most frequently accessed index path with the index being too big to cache. Let's grant Richard's contention that it is unlikely you'll win on performance of an individual index range scan query requiring table access. Now consider dozens of users with queries completely satisfied by indexes not competing with i/o for a full table scan in another tablespace. Unless you're going to argue that a full table scan is never the right thing to do and that none of your queries are satisfied from an index alone, I believe that means you sometimes get a win from separation of data and indexes, because statisitcally you will have less seek. If your underlying disk farm is SAME, you'll minimize that win and protect yourself from the possibility of unbalanced i/o generating sufficient queueing to degrade performance. If you have multiple independent units of i/o to which you can assign separation of batch oriented operation (BORING), then your i/o may indeed be unbalanced, but you also have the opportunity to minimize seek and maximize the throughput of parallel operations processing (not to be confused with parallel degree sql statements). It also becomes your job to avoid overallocation of i/o demand to a particular independent unit of i/o. I've never found that very hard unless the underlying storage farm construction was HAPHAZARD.
  3. Dirty blocks are batched for write. Think about what happens in bursts of inserts.
  4. In many systems the total i/o to indexes is greater than the total i/o to tables. Please don't come back with that straw man; indeed you may need to use more i/o capability for indexes than for data. Your mileage may vary.
  5. I think the interesting question Richard raises is whether adding i/o capacity is better done by statistically adding the i/o bandwidth to all i/o versus adding it in some other way. On that I believe your mileage will vary. Other persons contend that i/o balance is identical to maximum throughput, so adding i/o bandwidth statistically in a statmux SAME fashion will always win. You can only execute that experiment if you have independent units of i/o configured.



PS: as for finding support for any stance in the Oracle documents, they do indeed call for SAME as a best practice, which effectively statistically spreads everything on the underlying disk farm.  


Received on Mon May 19 2008 - 08:30:03 CDT

Original text of this message