
Re: IOT, memory and transaction time

From: <deja_at_2bytes.co.uk>
Date: 17 Apr 2007 13:33:50 -0700
Message-ID: <1176842030.099001.6260@b75g2000hsg.googlegroups.com>


On Apr 17, 4:50 pm, Robert Klemme <shortcut..._at_googlemail.com> wrote:
> On 17.04.2007 15:56, d..._at_2bytes.co.uk wrote:
>
>
>
> > On Apr 17, 11:34 am, d..._at_2bytes.co.uk wrote:
> >> On Apr 17, 10:39 am, Robert Klemme <shortcut..._at_googlemail.com> wrote:
>
> >>> On 17.04.2007 11:26, d..._at_2bytes.co.uk wrote:
> >>>> We have a table with 4 columns - col1 int, col2 int, col3 char(32),
> >>>> col4 char(32).
> >>>> Previously there were 2 indexes
> >>>> unique index1 on col3, col1
> >>>> index2 on col2, col1
> >>>> Reads always read all columns.
> >>>> We have recently changed the way this table works. The columns are all
> >>>> the same but because of the way we put data in we have changed the
> >>>> indexes.
> >>> Please provide complete DDL for old and new table layout. You can
> >>> easily generate it with DBMS_METADATA.GET_DDL if your version of Oracle
> >>> includes it.
> >>>> Now the table is an Index Ordered table with col2,col3 as PRIMARY KEY.
> >>> IOT = Index Organized Table.
> >>>> There is an additional index on col3,col2.
> >>> Why do you put another index on the same set of columns?
>
> Why?
>
>
>
> >>>> Reads have improved quite a lot but there is a problem with the
> >>>> writes. Previously, when adding 40,000 records at a time, write times
> >>>> were pretty consistent no matter how big the table (this transaction
> >>>> used to actually involve 200 updates of the same table followed by
> >>>> 40,000 inserts). With the new indexes, they get progressively worse
> >>>> with every write plateauing at around 10 minutes!! (this new
> >>>> transaction involves updating a dummy table with one record and one
> >>>> column 200 times (using a where clause that references the other
> >>>> table), followed by 40,000 inserts). The update part of the
> >>>> transaction basically checks that no-one else has updated since you
> >>>> read.
> >>>> Why have write times changed so dramatically? Why do they get
> >>>> progressively worse? If it was the update statement that references
> >>>> the other table with a where clause, then why have Read times improved
> >>>> (using pretty much the same clause)? So I figured it probably wasn't
> >>>> the Update statement. So, looking at the Insert, why has this become
> >>>> so different. The columns are the same, it is just that one is Index
> >>>> Ordered and one isn't. The 40000 inserts will all have a col2 value
> >>>> that is the same but is greater than any previous write of col2. The
> >>>> col3 values may not be in order when written back - so is it just a
> >>>> memory issue? If we allocate more memory to the Oracle instance will
> >>>> that solve it? Does the memory have to be allocated to a specific area
> >>>> or is the SGA sufficient?
> >>> My first guess would be that because of the changed physical structure
> >>> inserts are generally more complex and generate more IO.
> >>> Kind regards
> >>> robert
> >> Thanks Robert,
>
> >> Here is the DDL...
>
> >> New DDL:
> >> CREATE TABLE R_TABLE1 (
> >> deleted INTEGER NOT NULL,
> >> versionNo INTEGER NOT NULL,
> >> Id char(32) NOT NULL,
> >> Rel char(32) NOT NULL,
> >> CONSTRAINT R_TABLE1_1 PRIMARY KEY(versionNo,Id, Rel),
> >> FOREIGN KEY (deleted) REFERENCES T_TRANSACTIONS (TransNum),
> >> FOREIGN KEY (versionNo) REFERENCES T_TRANSACTIONS (TransNum),
> >> CHECK (versionNo >= deleted))
> >> ORGANIZATION INDEX;
>
> >> CREATE INDEX R_TABLE1_2 ON R_TABLE1 (Id, versionNo);
>
> >> Old DDL:
> >> CREATE TABLE R_TABLE1(
> >> deleted INTEGER NOT NULL,
> >> versionNo INTEGER NOT NULL,
> >> Id char(32) NOT NULL,
> >> Rel char(32) NOT NULL,
> >> FOREIGN KEY (deleted) REFERENCES T_TRANSACTIONS (TransNum),
> >> FOREIGN KEY (versionNo) REFERENCES T_TRANSACTIONS (TransNum),
> >> CHECK (versionNo > deleted));
>
> >> CREATE UNIQUE INDEX R_TABLE1_1 ON R_TABLE1 (Id, deleted, Rel);
> >> CREATE INDEX R_TABLE1_2 ON R_TABLE1 (versionNo, deleted);
>
> >> There are 2 basic reads on this table.
> >> 1) the latest version of a record
> >> 2) all the changes since the last read version
>
> >> In the old system this meant
> >> 1) select * from r_table1 where deleted = -1 (because every time a
> >> change was made, it would update the deleted column of the existing
> >> version = the current transNum and then insert a new version with
> >> deleted = -1)
> >> AND
> >> select * from r_table1 where deleted between 0 and current transNum
> >> (just in case someone made changes between you starting your read and
> >> ending it - consistent read of numerous tables)
>
> That sounds strange. Oracle provides read consistency and SQL Server
> does a similar thing. The whole point of transactions is that you can
> apply multiple changes consistently. There are however differences in how
> both products deal with concurrency; in Oracle readers don't get blocked
> while this may happen in SQL Server.
>
> >> 2) select * from r_table1 where versionNo > @lastRead and versionNo <=
> >> @currentTransNum
> >> AND
> >> select * from r_table1 where versionNo > currentTransNum and deleted >
> >> @lastRead and deleted <= currentTrans (again for consistent read)
>
> >> In the new system it just involves one read for each because no update
> >> takes place, a new version just supersedes the old version by virtue
> >> of a bigger versionNo:
>
> So you not only changed the DDL but also the logic of how data is inserted
> and dealt with. I was not aware of that.
>
> >> 1) select * from r_table1 t1 where versionNo = (select max(versionNo)
> >> from r_table1 t2 where t1.id = t2.id and t2.versionNo <=
> >> @currentTransNum) and deleted = -1
>
> Btw, is this in a stored procedure? Then a fixed plan might hit you.
>
> Other than that it seems to me that this query would benefit from a
> clustered index on (versionNo, id, deleted).
>
> >> 2) select * from r_table1 t1 where versionNo = (select max(versionNo)
> >> from r_table1 t2 where t1.id = t2.id and t2.versionNo > @lastRead and
> >> t2.versionNo <= @currentTransNum)
>
> same here
>
>
>
> >> In the old system the write transaction used to do:
> >> 1) Update R_TABLE1 Set deleted = versionNo, versionNo =
> >> @currentTransNum where id = @id and deleted = -1 and versionNo =
> >> @versionNo
> >> ONCE FOR EACH value of Id
> >> (If 1 record updated then no-one has updated since our read)
> >> 2) Insert into R_TABLE1(deleted, versionNo, Id, Rel) Values (-1,
> >> @currentTransNum, @id, @rel)
> >> for as many relationship tuples as required.
>
> >> In the new system the write transaction does this:
> >> 1) Updates DUMMY Set Col1 = 1 WHERE @versionNo = (SELECT
> >> MAX(versionNo) from R_TABLE1 where id = @id and versionNo <
> >> @currentTransNum)
> >> (If 1 record updated then no-one has updated since our read)
> >> 2) Same insert as old one above.
>
> I'd rather design the system in a way that a new version requires
> exactly one insert - not more. The meaning of "Col1" is totally unclear
> to me.
>
> >> As far as secondary index is concerned, the application goes out in an
> >> Oracle and a SQL Server flavour. In SQL Server this additional index
> >> seemed to improve performance being used in some of the subselects.
>
> If it helps only on SQL Server then create it only on SQL Server.
> Oracle != SQL Server - as you might have guessed. :-)
>
>
>
> >> The problem is, that the new version of our schema does not perform
> >> like this under SQL Server, only under Oracle. Obviously in SQL Server
> >> we use a CLUSTERED INDEX but as far as I am aware that is just the
> >> same as an IOT.
>
> >> Thanks for any advice
> >> Phil
>
> > Can anyone help with this?
>
> > The Oracle results are not consistent with what we are seeing on SQL
> > Server....
>
> > Basically the results are like this:
>
> > Old SQL Server vs New SQL Server : new SQL Server better on reads,
> > better on writes. All statistics are consistent.
> > Old Oracle vs New Oracle : new Oracle reads are better, new Oracle
> > writes start off better but quickly deteriorate to very poor results.
>
> Oracle does not automatically update statistics. You may have to give
> it some data and update stats again. Watch the plans. If you are on
> 10g that is fairly easy. Otherwise in SQL Plus "set autotrace on".
>
> > However both new Oracle and new SQL Server have had their schemas
> > changed in exactly the same way.
>
> From a certain point of view this is impossible as SQL Server and
> Oracle are quite different. I'd first try to get the logic and schema
> of the table right and then tune it per DB.
>
> > And in the tests Old Oracle and new Oracle were run on exactly the
> > same Oracle configuration while old SQL Server and new SQL Server were
> > run on exactly the same configuration.
>
> > The evidence leads me to think that new schema requires more memory/
> > space and the SQL Server configuration was set up in both cases with
> > enough whereas the Oracle configuration was set up with enough for the
> > old version but not enough for the new.......
>
> > Is there another conclusion?
>
> I don't have a conclusion to offer. I still feel that the problem is
> not yet understood fully.
>
> Regards
>
> robert

I don't understand what's happened to my earlier reply on this, so I'll try again (I tried to answer near your original comments but will try answering the whole thing here).

> That sounds strange. Oracle provides read consistency and SQL Server
> does a similar thing. The whole point of transactions is that you can
> apply multiple changes consistently. There are however differences in how
> both products deal with concurrency; in Oracle readers don't get blocked
> while this may happen in SQL Server.

Well, no RDBMS will provide a truly consistent read across multiple statements without making the reads serialized, and that obviously reduces concurrency. Basically, if Writer1 modifies Table1 and Table2 in the same transaction but hasn't committed, then Reader1 comes along and reads Table1, then Writer1 commits, then Reader1 reads Table2, Reader1 gets an inconsistent view of things. The Table1 mods and Table2 mods are all part of the same transaction, but Reader1 only sees part of them.
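
To spell out the interleaving I mean (the table names are just placeholders):

1. Writer1: UPDATE Table1 ...;            -- not yet committed
2. Reader1: SELECT * FROM Table1;         -- sees the data as of before Writer1's change
3. Writer1: UPDATE Table2 ...; COMMIT;
4. Reader1: SELECT * FROM Table2;         -- sees the data as of after the commit

Each select on its own is consistent, but taken together they straddle Writer1's commit, so Reader1 sees only half of the transaction.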

The two-phase read in the old system deals with this but, because of the updates to "old versions", it suffers blocking in SQL Server. The new system will not only eliminate the blocking under SQL Server but also enable a single read.

> So you not only changed the DDL but also the logic of how data is inserted
> and dealt with. I was not aware of that.

Sorry about that. Yes, the logic has changed to avoid the updates and enable a single read. It also has another advantage when reading changes since the last read: the old system would return all changes, whereas the new system returns only the latest change per record.

> Btw, is this in a stored procedure? Then a fixed plan might hit you.
>
> Other than that it seems to me that this query would benefit from a
> clustered index on (versionNo, id, deleted).
>

It's not a stored procedure, although I actually thought it would improve things if we changed to that, given the cached query plan.

When you say "clustered index", are you not using the SQL Server terminology (i.e. an Oracle IOT)? I don't know much about Oracle clustered indexes - what's the DDL? What is the difference between one of those and a normal index?

As it happens, I don't think there will be any records where deleted > -1. This column is used more for the object tables than the relationship tables, and becomes > -1 when an object is truly deleted, not just when a later version is created. I can't think of a situation where this is true for a relationship although, if it is, there will be maybe 1 record per object, which will be minimal compared to the many thousands of records where deleted = -1.

> I'd rather design the system in a way that a new version requires
> exactly one insert - not more. The meaning of "Col1" is totally unclear
> to me.

Having a new version require exactly one insert may indeed be better, but of course this is a legacy app that has had a lot of time and money invested in it. I can't go changing it to that degree now as it would mean too many client code changes. Whenever a relationship changes, a whole new relationship version supersedes it. This means that if a list of 20 relationships has 1 change, the whole list of 20 relationships is superseded by a new list including mostly the same ones as before. Hey - I didn't design this bit!
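
Just to illustrate with made-up transaction numbers: if an object @id has 20 relationship rows that were written at versionNo 100 and one of them changes at transaction 101, the write ends up doing something like

INSERT INTO R_TABLE1 (deleted, versionNo, Id, Rel) VALUES (-1, 101, @id, @rel01);
INSERT INTO R_TABLE1 (deleted, versionNo, Id, Rel) VALUES (-1, 101, @id, @rel02);
...
INSERT INTO R_TABLE1 (deleted, versionNo, Id, Rel) VALUES (-1, 101, @id, @rel20);

so 19 of the 20 new rows repeat relationships that already exist at versionNo 100, just with the new version number.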

DUMMY is just a basic table with an integer column. It is designed to be like DUAL, I guess: a 1-record, 1-column table whose only use is to test the rowcount of the update. In the old system we would update the superseded version, but we don't do that any more.
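
In case it helps, DUMMY is essentially nothing more than this (give or take names and constraints), and the optimistic check is the update I described above:

CREATE TABLE DUMMY (col1 INTEGER NOT NULL);
INSERT INTO DUMMY VALUES (0);

UPDATE DUMMY SET col1 = 1
WHERE @versionNo = (SELECT MAX(versionNo) FROM R_TABLE1
                    WHERE id = @id AND versionNo < @currentTransNum);

-- rowcount = 1 : no-one has written a newer version since our read
-- rowcount = 0 : someone has updated since our read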

> If it helps only on SQL Server then create it only on SQL Server.
> Oracle != SQL Server - as you might have guessed. :-)

Absolutely. I have already planned a test where this index is removed.
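
(i.e. the test is simply to run the new schema without the secondary index -

DROP INDEX R_TABLE1_2;

- and re-run the write timings.)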

> Oracle does not automatically update statistics. You may have to give
> it some data and update stats again. Watch the plans. If you are on
> 10g that is fairly easy. Otherwise in SQL Plus "set autotrace on".

Maybe this is a problem, though I would have expected it to impact old Oracle too. Does the "set autotrace on" command set up automatically updating statistics, or is that something separate?
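
My reading of the docs is that autotrace just shows the plan and execution statistics for each statement in SQL*Plus, and that gathering optimizer stats would be something along these lines (table name from our DDL; the options are a guess on my part):

SET AUTOTRACE ON
-- run the problem query here to see the plan it now uses

EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'R_TABLE1', cascade => TRUE);

Is that roughly what you mean?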

> From a certain point of view this is impossible as SQL Server and
> Oracle are quite different. I'd first try to get the logic and schema
> of the table right and then tune it per DB.

No can do. Too much time and money has been invested in the client code. If designing from scratch I would have a Data Access Layer that dealt with the business objects differently depending on the data store, but it is more tightly coupled than that. I can change indexes etc. but cannot change the column and table definitions (for "cannot" read "will not be allowed" :o))

Any hints and pointers gratefully received

Phil

Received on Tue Apr 17 2007 - 15:33:50 CDT
