Oracle-L: Re: Recipe for application design to run on RAC

From: Tim Gorman <Tim_at_SageLogix.com>
Date: Wed, 04 Dec 2002 20:33:43 -0800
Message-ID: <F001.00512C5E.20021204203343@fatcity.com>

comments inline...

Original Message ----- To: "Multiple recipients of list ORACLE-L" <ORACLE-L_at_fatcity.com> Sent: Wednesday, December 04, 2002 8:53 AM

> Whoa! Tim, thanks a lot for sharing this. Quite an
> insight.
>
> So SELECTs are not a concern.
>

Well, not directly. They do not directly cause buffers to move around. But they can cause a PCM lock to be downgraded from "exclusive" to "shared", thus forcing the instance which had the lock in exclusive-mode to request that it be returned to "exclusive". Thus, while the block doesn't leave the Buffer Cache while it's lock is downgraded, it still induces some fiddling back and forth between the instances...

>
> INSERTs are a "come and see DBA" thing (physical design issue).
>

Yes. Prior to 9i, the mechanisms to use are FREELIST GROUPS. Very much eliminates inter-instance contention during INSERTs in OPS...

Though I haven't had a chance to play with it yet, the bitmap-oriented 9i replacement for freelists and freelist groups, called "automated segment-space mgmt" or ASSM, is apparently still only half-baked, purportedly producing all kinds of unexpected results in space wastage and other things. So, in 9iRAC, you might still want to consider using FREELIST GROUPS over ASSM. Again, just my uninformed opinion based on hearsay...

>
> DELETEs are relatively infrequent and many get
> translated into UPDATE (logical as opposed to physical delete).
>

Well, both DELETEs and UPDATEs have the same characteristics from a "cache-coherency" perspective, so it's six-of-one, half-dozen-the-other...

>
> Application "partitioning" as you clearly explained in
> your email... Would it be closer to a logical or
> physical design?
>

I've always tried to use the word "segregation" as opposed to "partitioning", though I slip up occasionally. The word "partitioning" makes people think about the Partitioning option, which is definitely not intended. There is no relationship between Oracle's Partitioning option and the type of "application segregation" I'm trying to describe.

There are two ways to avoid the latency resulting from OPS "pinging" or RAC "cache-fusion": by happenstance or by planning.

Well, actually there are three ways: use OPS/RAC on OpenVMS and neither OPS "pinging" nor RAC "cache-fusion" will result in latency. But let's assume that is not an option for you and consider just the other two ways...

By "happenstance", I mean just hoping that relatively random activity from multiple instances against the same datafiles avoids two (or more) instances wanting the same block for insert, update, or delete. This is pretty rare, but I'm sure it can happen. After all, even a blind dog finds a bone occasionally...

By "planning", essentially you want your application to somehow enforce that sessions on a database instance only UPDATE or DELETE rows that were INSERTed by that instance. That way, the block buffers are never "pinged" or "cache-fusion shipped" to another database instance. There may be some fiddling of the parallel cache-management (PCM) locks if other instances want to read those blocks, but that is less of a concern. So, however your application logic or business practices can ensure that blocks are UPDATEd or DELETEd by the instance from which they were INSERTed, that is what is necessary. Perhaps you can dedicate certain database sessions to specific groups of data (i.e. application module or groups of customers). That doesn't necessarily work all the time; take Oracle Apps as an example, where all application modules inevitably meet in the Application Object Library (AOL) and Foundation (FND) schemas. The surest way I've seen to "segregate" parts of an application is by making use of "data routing" capabilities in the middle-tier application-server or transaction-processing monitor layer. If the middle-tier is capable of data-routing, then you can identify each user transaction by the data values and route the transaction to a session connected to one database instance or the other. This is the surest way to accomplish perfect "segregation" of different database blocks to different instances, when the end-users can't do it. This is usually the case with interactive, OLTP environments.

Of course, another way to route transactions is by forcing such rules of "application segregation" during INSERT, UPDATE, and DELETE by careful data-routing during batch processing. This is the way that it can be implemented for data warehouses...

>
> Seems like something that data modeler/architect
> should be aware of. So in a sense all modeler needs to
> worry about is UPDATEs as far as future physical
> implementation for RAC is concerned?
>

Both should be aware, but the decision to include middleware capable of data-routing when designing an OLTP application is usually up to the architect more than the data modeler, I think. Setting up batch processes to perform the data-routing functionality for DW applications is usually up to application developers, I think. The data modeler might have some say into the use of FREELIST GROUPS or ASSM, but I suspect the DBA has more to say on such matters...

This is the crux of the problem with packaged software on OPS/RAC. The architect has long ago made his/her decisions, as have the application developers. The die is cast. All that is left is the DBA onsite modifying FREELIST GROUPS or whatnot, trying to optimize what little they can...

>
> The reason I get stuck on "phys vs logical" here is
> because client I am with has a clear separation
> between the two.
> It's not only different people that deal with it, but
> in fact different vendors.
>
>
> <soapbox>
> Some background (I probably should've included it in
> my original post):
> Mission critical system to replace around 30 small
> in-house developed apps and do some business
> re-engineering as we go :(
> Data modeling is done by one vendor, development by
> another (don't ask), DB support and maintenance are
> left for internal DBA staff

you poor bastard

>
> One of the conditions that CTO office "mentioned" to
> the modeling company is to keep HA requirements (not
> really defined yet - but that's another story) in
> mind.
> Ok, they turn around bring the data model and declare
> that they not only "kept it in mind", but in fact
> their model is "RAC aware".
>

Data models cannot be "RAC aware"; only the table and index attributes such as FREELIST GROUPS and ASSM, which only address INSERT activity. For addressing UPDATE and DELETE activity, only the application code can be "RAC aware"...

One possibility is that they intend to use RAC in a pure "failover" situation (i.e. one "active" primary instance and one "standby" secondary instance, a.k.a. "active/passive"). In that situation, all of these complications that I'm describing become moot. There is no cache-coherency issues or PCM lock management issues if only one database instance is active and all other instances are completely passive. That "active/passive" configuration coupled with the "transparent application failover" features of SQL*Net could count (in the marketing world) as "RAC-aware", I suppose. But, as you may have noticed, none of this stuff has anything to do with data modeling, per se... :-)

>
> Well I can't describe how happy damanagement is - such
> a successful choice they made (to pick this particular
> company)!
> But the curious side of me wonders how this data model
> is different from the one designed for a single node
> DB?
> </soapbox>
>

I don't think it is possible, but I'd love to learn...

>
>
> And another thing. Tim, you explained clearly how
> application should be assessed for non-RAC to RAC
> migration.
> In my case however application exists mostly on paper
> (not taking into account these 30 micky mouse apps -
> the new system suppose to cover much more than that).
>
> 1) I guess simulation would be one way to estimate
> SQL statements of the app and make a decision on
> whether it can scale on RAC well or not.
> (And BTW simulation might be worth the trouble
> irrespective of whether we use RAC or we don't)
> But frankly so far I've seen prototyping or simulation
> for the sake of sizing, capacity planning,
> understanding DML and query profiles, critical tx,
> memory footprint required etc, etc... only in James
> Moorle book :(
> That is not prototyping for the sake of "getting
> client involved at the early development stages" -
> that's been done with all new apps here. And not
> benchmarking of already existing application, but the
> one that hasn't been developed yet.
>
>
> 2) More importantly simulation is obviously the most
> expensive of all alternatives. Analytical modeling of
> some sort would be much more welcomed by damanagement.
>
> What would be your take on this? How new essentially
> designed from scratch app can be assessed if it can
> scale well on RAC?
>

Only by simulation, which is expensive in its own right. Mercury Interactive has some terrific tools, most costing in the six-figure range. Your company must be slavishly devoted to success to spend that kind of money and time (another word for "money") on load-testing (to be distinguished from functionality-testing)...

Another question becomes: what if load-simulation obtains a negative result? In other words, you determine that the application(s) cannot scale. Can you push back on your vendors? You really can't have that kind of "throw-it-over-the-wall" situation, where one vendor designs in a vacuum, another codes in a vacuum, and then you guys are left to see if the whole mish-mash works in RAC? The designers have to help design the test documentation, as do the application developers, and they have to be available to correct problems found during testing.

How about this? It sounds like you might be trying to use RAC more for high-availability purposes rather than scalability. After all, you mentioned 30 "smaller but mission-critical applications". Of course, performance scalability is not trivial, but the impetus here might possibly be high-availability or something like that?

If so, I would suggest using what I call "mutual failover" or N-way "active/passive". Have all 30 or so applications within one big happy RAC database (or divvy them up between several clusters, if necessary), each in their own separate schema/accounts/users. Have 15 of the applications connect primarily to one of the database instances and have the other 15 connect primarily to the other instance. Since each application is using different schemas and thus different database objects (even different tablespaces?), there will be minimal or zero cache-coherency issues. Should one node fail, then all 30 applications reside within the same database instance, presumably operating in reduced capacity.

If you have a three-node cluster, then you can divvy the apps up 10-10-10. If you have a four-node cluster, then you divvy them up 8-7-8-7 or some such.

In this way, most of the concerns are eliminated.

This has scalability benefits, too. You are distributing the load fairly evenly amongst the nodes. If a load disparity can be identified and you want to shift users to another node, then modify the TNS connections and eventually everyone will re-connect on the other node.

Of course, such database consolidation has lots of downsides (i.e. app ABC needs v8.1.7.4, app XYZ needs 9.2.0.2, app IJK needs 8.1.6.2, etc)...

>
> Thanks again for your help!
>
>
> --- Tim Gorman <Tim_at_SageLogix.com> wrote: > To be
> more precise, the real problems in
> > application-partitioning for
> > OPS/RAC are UPDATE, SELECT ... FOR UPDATE, and
> > DELETE statements due to
> > their WHERE clauses...
> >
> > A SELECT statement does not force exclusive access
> > to a database block and
> > so does not directly cause contention for a block in
> > OPS/RAC. An INSERT
> > statement also does not cause contention on tables
> > because the use of
> > FREELIST GROUPS can keep blocks utilized by
> > different instances separate
> > from one another. If the INSERT is inserting a
> > monotonically-ascending data
> > value into an associated index, then contention on
> > the highest-value leaf
> > block can be reduced by using REVERSE indexes. If
> > the INSERT is not
> > inserting monotonically-ascending data values into
> > any indexes, then
> > contention during parallel cache management between
> > indexes should be
> > minimal.
> >
> > So SELECTs are inherently benign and INSERT
> > operations can be controlled
> > with mechanisms to prevent inter-instance contention
> > for blocks.
> >
> > It is the UPDATE, SELECT ... FOR UPDATE, and DELETE
> > statements which truly
> > require consideration in making an OPS/RAC-based
> > application scaleable. If
> > these statements, which operate on database blocks
> > according to their
> > respective WHERE clauses generated by application
> > logic, do not have some
> > form of "awareness" of assignment of certain data
> > values to specific
> > database instances, then one can expect problems in
> > scaling. Neither OPS
> > nor RAC has any mechanism to minimize contention
> > between instances for block
> > buffers (as with INSERT statements), so it is up to
> > the application itself
> > (which controls the generation of the WHERE clause)
> > to "segregate" the
> > application somehow. Whether it is by major
> > application module (i.e. "sales
> > and marketing" versus "order entry and inventory"
> > versus "general ledger",
> > etc) as Boris had illustrated, or by some other
> > mechanism (i.e. all
> > customers whose names start with A-M on one node,
> > all whose names start with
> > N-Z on the other node, etc), the application must be
> > able to partition.
> >
> > ---
> >
> > In Oracle7 OPS and Oracle8 OPS and Oracle8i OPS, the
> > mechanism (i.e.
> > "pinging" performed through the I/O subsystem) was
> > quite slow on most
> > platforms, resulting in huge latencies. The major
> > exception to this rule
> > was DEC/Compaq/HP OpenVMS, where the performance of
> > the "pinging" mechanism
> > is so fast as to be quite unnoticeable. Not
> > surprising if one considers the
> > history of VMS and OpenVMS...
> >
> > Beginning with pieces of the cache-coherency
> > mechanisms in Oracle8i OPS and
> > fully implemented in Oracle9i RAC, the
> > "cache-fusion" mechanism still
> > performs the same locking and data-transfer of
> > database block buffers
> > between instances, only faster. How much faster is
> > dependent on the OS and
> > configuration. But the additional latency is still
> > there. Obviously, if
> > the inter-connect mechanism between nodes is not
> > fast or misconfigured, then
> > "cache-fusion" cannot be fast either...
> >
> > ---
> >
> > When Oracle states that "applications can be
> > migrated to RAC without
> > modification", they are saying so in the faith that
> > the reduced latencies in
> > the cache-fusion mechanism and other improvements in
> > the
> > sharing/modification of global enqueues will result
> > in almost-zero latency,
> > or at least latency that is within the tolerance of
> > the end-users. As the
> > old saying goes, "your mileage may vary" or YMMV.
> > As OpenVMS and its near
> > zero-latency "pinging" mechanism shows, the choice
> > and configuration of
> > platform really matters also!
> >
> > ---
> >
> > In order to assess if an application is likely to
> > scale effectively when
> > migrating from non-RAC to RAC, I would pay close
> > attention the nature,
> > frequency, and volume of UPDATE, SELECT ... FOR
> > UPDATE, and DELETE
> > statements generated by the application. While
> > still in its non-RAC
> > implementation, I would recommend collecting and
> > examining such SQL
> > statements generated by application and
> > understanding what program modules
> > are generating each statement and why. I would then
> > prioritize these
> > statements by their volume and the business
> > criticality of the generating
> > program module. Last, according to this
> > prioritization, I would examine how
> > the WHERE clauses and the data values used in them
> > can be controlled by
> > application logic. For example, if a
> > business-critical online form is
> > generating lots of UPDATE and SELECT ... FOR UPDATE
> > statements, is it
> > possible to determine whether those statements are
> > generated against rows
> > previously INSERTed by the same session? Or, does
> > that online form perform
> > UPDATE and SELECT ... FOR UPDATE operations against
> > any data in the database
> > at all?
> >
> > Some applications are really quite "partitionable"
> > under the covers, and
> > only a small amount of such analysis can assure you
> > that RAC is feasible.
> > Other applications are so dreadfully complex that
> > only by load-testing with
> > real-world data values can the scalability be
> > determined...
> >
> > Oracle has correctly identified all of the major
> > bottlenecks in
> > inter-instance contention and has improved each of
> > these areas in RAC since
> > OPS. The question is whether the improvements are
> > sufficient for your
> > requirements...
> >
> > ---
> >
> > Relevant wisdom from "The Meaning of Life" by Monty
> > Python (1983):
> >
> > Mother (going through labor pains, alert and
> > panicky): What's that for?
> >
> > Obstetrician #1: That's the machine that goes
> > "ping". << PING! >> You
> > see? That means your
> > baby is still alive!
> >
> > Obstetrician #2: And it's the most expensive
> > machine in the whole
> > hospital!
> >
> > Obstetrician #1: Yes, it cost over three
> > quarters of a million pounds!
> >
> > Obstetrician #2 (slowly and condescendingly):
> > AREN'T YOU LUCKY?
> >
> >
> > ----- Original Message -----
> > To: "Multiple recipients of list ORACLE-L"
> > <ORACLE-L_at_fatcity.com>
> > Sent: Wednesday, November 27, 2002 8:49 AM
> >
> >
> > > Thanks for taking time to reply, Cary. Much
> > > appreciated.
> > >
> > > Did I understand it correctly that in
> > active/active
> > > setup it would be beneficial to give each node
> > "it's
> > > own virtual empire" so to speak.
> > > Like one node to service say marketing and sales,
> > > while the other to deal with say inventory and
> > > automation and minimize interdependencies between
> > the
> > > two?
> > > I was thinking more along the lines of equally
> >
> === message truncated ===
>
> ______________________________________________________________________
> Post your free ad now! http://personals.yahoo.ca
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.com
> --
> Author: Boris Dali
> INET: boris_dali_at_yahoo.ca
>
> Fat City Network Services -- 858-538-5051 http://www.fatcity.com
> San Diego, California -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from). You may
> also send the HELP command for other information (like subscribing).
>

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Tim Gorman
  INET: Tim_at_SageLogix.com

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

Received on Wed Dec 04 2002 - 22:33:43 CST