Re: Separating data, index objects

From: Holger Baer <holger.baer_at_science-computing.de>
Date: Wed, 06 Jul 2005 16:36:23 +0200
Message-ID: <dagq98$sta$1@news.BelWue.DE>

Paul wrote:
>
> Holger Baer <holger.baer_at_science-computing.de> wrote:
>
>
>
>>>Forgive me for jumping in here, but *_a priori_*, it seems obvious
>>>that it *_should_* serve such a purpose.
>
>
>
>>As soon as you start a sentence with it seems obvious, it *is* obvious
>>that it *isn't* obvious. In other words: what you just said is wrong.
>
>
>
> I perhaps should have used the word "counterintuitive". It seems
> counterintuitive to me that separating the index from the actual data
> on different disks should make no impact on performance.
>
>
> I'm trying to understand why it doesn't - i.e. I believe Daniel Morgan
> and I believe you that it doesn't affect performance.

Read Richards post in this thread - I can't make it any clearer than that.
>
>
>
>>>What is the mechanism by which Oracle prevents this from happening?
>>
>>How are tables and indeces used?
>
>
>>>Here, we are talking about tableX with indexX, but it does still make
>>>sense to separate different tables and indices to avoid contention -
>>>or is this not correct either?
>
>
>>No. You separate *segments* to spread I/O. If the segment in question
>>is an index or a table has nothing to do with it.
>
>
>
> OK, so now here's where my understanding is breaking down. I thought
> that the DBA created tablespaces sufficiently large to hold the data
> (+ a bit more, just to be sure!), and that management of Segments and
> Extents was done internally within the instance itself.

A (heap) table is a segment. An index is a segment. A cluster is a segment. A partitioned table is a table spread across multiple segments.

Separating segments by type (=table should be separate from its indices) instead of separating them by usage (this is a heavyly used segment with lots of physical IO, and this one too) is what I'm fighting against.

>
>
>
>>And if you're not interested in the last 5% of possible performance,
>>then SAME might very well serve this purpose.
>
>
>
> I looked up the SAME keyword in both Oracle Essentials and Oracle SQL
> and I can't find a reference to this or in Tom Kyte's book (Effective
> Oracle by Design).

SAME=Stripe And Mirror Everything

>
>
>
>>You know, the error most people make regarding this topic, the don't
>>think how an index is used.
>
> OK, fine - I am more than willing to learn about the internals and am
> willing to put in the effort to chase up URLs or do Googles or
> whatever it takes.
>

The Concepts Guide which can be found at tahiti.oracle.com should get you started.

>
>
>>An index is either
>>a) a mechanism to quickly retrieve a block from a table, or
>
>
>
> Aha! Now *_this_* I do understand! 8-)
>
>
>
>>b) it can be used as a skinny version of a table.
>
>
>
> IOT's - yes? Am I even warm?

Not even remotely ;-) If a query can be answered by an existing index, then Oracle will not read the corresponding table.

>
>>If we're talking about a), then any session using an index will do so
>>first and then retrieve the table block. So why should tables and indexes
>>contend inherently more than tables and tables or indexes or indexes if
>>we throw in multiuser?
>
>
>
> Say we have an app (I recently wrote a membership system for An Óige
> (the Irish Youth Hostel Organisation)), whose budget didn't stretch to
> Oracle, but let's take it as an example.
>
> It was very simple - there was a Member table where the vast bulk of
> the data was stored, and then various other tables which were foreign
> keys into the Member table (title (Mr, Mrs, ...), County (Dublin...))
>
> Now, this system receives approx 30 - 40 records a day, so we had no
> problems with performance, I'm happy to say.
>
> However, imagine a scenario where this app was constantly being
> accessed, updated and added to by thousands of concurrent users. The
> main Member table being the one by far the most used.
>
> Now, under a multi-concurrent user model, users are constantly
> referring to records in the Member table, by using the indices.
>
> Now, if the table and the index are on different disks, then user1 can
> read the index, system goes off to search and retrieve the data, but
> while that's happening user2 can search the index, because there's no
> contention on the disks, them being on different disks.

Again, read what Richard has to say.

>
> My understanding of why it might be useful to separate data and
> indices. Why have tablespaces at all, if not for that reason? I
> believe, though am not certain, that PostgreSQL introduced a
> tablespace feature in their latest version 8 and separating indexes
> and tables was touted as an advantage. AFAIK, PostgreSQL uses a
> record-versioning (record shadowing) MVCC model similar to Oracle's?

No. Tablespaces exist for mainly two reasons: a) To overcome size limitations of the filesystem (a file size limit of 2 GB

is not that long history)
b) to spread IO across available spindles

I don't know why the separation of tables and indexes in the Postgres world should be an advantage, but then I'm near to ignorant in that regard. But what is true for one Database is almost certainly not true for another.

>
>
> Please don't take what I have written above as being my definitive
> statement of the facts - it is my understanding, and I am more than
> willing to be corrected if I am wrong.
>

Hope I could make myself clear - and I'm not stating facts but my understanding, too.
>
>>If we're talking about b), then there is no table access at all and the
>>whole discussion is moot.
>
>
>
> IOT's, you get the data with the index, yes?

If an IOT has an overflow segment, then partly the data is in a separate segment....

Imagine a really fat table with heaps of columns and tons of rows with an index on just a few columns and incidentally those columns contain exactly the information you want to know, why should Oracle bother with the table at all? (And in fact, it doesn't).

Example:
SQL> create table demo as select * from all_objects;

Table created.

SQL> alter table demo modify (owner not null, object_name not null, object_type

not null);

Table altered.

SQL> create index idx_owner on demo (owner);

Index created.

SQL> create index idx_object_type on demo (object_type);

Index created.

SQL> create index idx_own_obj on demo (owner, object_type);

Index created.

SQL> exec dbms_stats.gather_schema_stats (user, cascade => true);

PL/SQL procedure successfully completed.

SQL> set autotrace on
SQL> select count (*), object_type from demo group by object_type;

[...]

Execution Plan

    0      SELECT STATEMENT Optimizer=CHOOSE (Cost=92 Card=21 Bytes=210
           )

    1    0   SORT (GROUP BY) (Cost=92 Card=21 Bytes=210)
    2    1     INDEX (FAST FULL SCAN) OF 'IDX_OBJECT_TYPE' (INDEX) (Cos
           t=17 Card=36486 Bytes=364860)

           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
           No table access in the execution plan.

HTH
Holger Received on Wed Jul 06 2005 - 09:36:23 CDT