Re: What are the differences between Real Application Clusters, Guard I and Guard II?

From: Howard J. Rogers <howardjr2000_at_yahoo.com.au>
Date: Tue, 9 Sep 2003 06:25:50 +1000
Message-ID: <3f5ce63e$0$28121$afc38c87@news.optusnet.com.au>

"Richard" <qaz1521_at_hotmail.com> wrote in message news:bjigc9$hjg$1$8300dec7_at_news.demon.co.uk...
> > Don't let Larry catch you saying that!!
>
> I'd better not! I once saw a video of him demonstrating OPS. Larry pulls a
> lever in the data centre and a lightning bolt strikes one of his servers
> (shades of Michelangelo's 'Creation of Adam' here - what's going on in
> Larry's head?). Not to worry though. Larry's running OPS so everything's
> OK. Of course, in practice, Larry, his cluster and anything else in the
> vicinity would be black and crispy after an event like that but when did
> Oracle ever let the laws of physics interfere with marketing?

I think you're being very perceptive here (about Larry, I mean). Zzzzzzzip (my lips are now sealed).

>
> Cache Fusion is the buzzword that our Original
> > Poster probably wants to read up on, as compared with 'block pinging' or
> > 'forced disk writes'.
>
> I had a very brief look at an Oracle white paper about this. As far as I
> can see, this means that each instance has its own buffer cache and the
> caches are synchronised by passing data through the cluster interconnect
> rather than via the database files on disk as in OPS. Is this about right?

I know I'm being picky. But the buffer caches aren't "synchronised". They're merely "co-ordinated". Synchronised, to me, means 'made to be at the same state, or the same point in time'. But that's not true. Yes, each instance on each node has its own buffer cache... but its contents are entirely independent of those on the other nodes. Likewise, there is a 'yellow pages' of which instance currently holds copies of which database blocks (it's called the Global Resource Directory), and each instance has a part of that directory, so that when you put all instances' yellow pages together, you've got the complete volume. But again, there's no duplication, and no 'synchronisation' (whoever thought of the word 'fusion' for this wants to attend a physics lecture some time!).

So what happens when I want to update Bob's salary in EMP? I consult my local piece of the Global Resource Directory to see whether I've got the 'volume' which covers access to that block. If I discover that I am not the 'mastering instance' for it, I send a message to each of the other instances in turn, asking them whether they are the mastering instance. At some point, I will get lucky and find the mastering instance. So I send it another message saying 'can I have File 6 block 22 for an update?'. A new record is made in the Resource Directory on the other instance, and a message is sent back to me saying 'yes, but since no-one's got a copy of it in their buffer cache, you'll have to read it from disk'. So I do so. The EMP block is now in my buffer cache.

Then you, connected to yet another node, say you want the same block for an update of Sue's telephone number. So you check your local piece of the Resource Directory, and discover you are not the mastering instance. You therefore send messages out to all the other instances in turn asking if they are the mastering instance. You strike lucky when you consult the same instance I did. You request the block. The mastering instance sends you a message saying 'Hang on, I'll just get Howard's instance to send you the block, because it's more up to date in his instance than it is on disk'. My instance gets sent a message saying 'pass that block to Richard's instance'. My instance complies. The block is sent across the interconnect. Your instance confirms its arrival, mine the dispatch. The Resource Directory entry for the block is updated to record the fact that it is now in your instance.

I then say 'commit' for the Bob transaction. That means I need the block back again so I can remove the row level locks and so forth. So, message to the mastering instance: "I need it back". Message sent to your instance: "Give it back!". Block transmitted back across the interconnect, confirmation of dispatch sent by your instance, confirmation of receipt sent by mine.

And so it goes on. Cache fusion means nothing much more than messages going back and forth across the interconnect asking 'who is in charge of this resource?' and 'may I have it?', and occasionally the data block itself getting sent via the same pathway. But there's no 'synchronisation' in any of that... just a directory master coordinating serialised access to the block.
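
For anyone who prefers to see that message flow written down, here is a rough sketch in Python. It is a toy model only and makes no claim to be how Oracle implements any of this. The class name, the instance names and especially the master_of() rule are my own assumptions for illustration: the sketch works out the mastering instance with a made-up arithmetic rule rather than by asking each instance in turn.

```python
class Cluster:
    """Toy model of the message flow; not Oracle's actual implementation."""

    def __init__(self, instance_names):
        self.names = list(instance_names)
        # Each instance holds only its own slice of the directory:
        # directory[master][block] = instance currently holding the block.
        self.directory = {name: {} for name in self.names}
        # Each instance has its own, independent buffer cache.
        self.caches = {name: set() for name in self.names}
        self.log = []

    def master_of(self, block):
        # Hypothetical mastering rule, invented for this sketch.
        file_no, block_no = block
        return self.names[(file_no + block_no) % len(self.names)]

    def request(self, requester, block):
        master = self.master_of(block)
        if master != requester:
            self.log.append(f"{requester} -> {master}: may I have {block}?")
        holder = self.directory[master].get(block)
        if holder is None:
            # Nobody has it cached, so the requester reads it from disk.
            self.log.append(f"{requester}: reads {block} from disk")
        elif holder != requester:
            # The master tells the current holder to ship the block.
            self.log.append(f"{master} -> {holder}: pass {block} to {requester}")
            self.caches[holder].discard(block)
            self.log.append(f"{holder} -> {requester}: {block} sent over interconnect")
        self.caches[requester].add(block)
        # The master's directory entry now points at the requester.
        self.directory[master][block] = requester


cluster = Cluster(["howard", "third", "richard"])
block = (6, 22)                     # File 6, block 22
cluster.request("howard", block)    # first touch: read from disk
cluster.request("richard", block)   # shipped from howard's cache
cluster.request("howard", block)    # shipped back again for the commit
print("\n".join(cluster.log))
```

Note that the only shared state is the directory entry held by the mastering instance. The caches never hold the same block at the same time in this model, which is the 'co-ordinated, not synchronised' point.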

> > >You can think
> > > of it as a number of physically separate computers each containing a
> > > synchronised copy of the database.
> >
> > No, you can't think of it that way at all. In a RAC, there is only ONE
> > database, but with multiple co-ordinated instances. There's no duplication
> > of the database, and there's no duplication of the instances, either. Each
> > instance is entirely independent of the others, but access to database
> > resources is co-ordinated between them by the cluster management and CGS
> > software components.
>
> You are absolutely right. I tried to oversimplify my response by referring
> to the database instead of to the instance. What happens with the 'shared
> nothing' clusters that Oracle refers to?

I think this is where I might get out of my depth. As I understand it, Microsoft's 'Federated Database' counts as an example of a 'shared nothing' cluster database. And it's really rather pathetic, and doesn't qualify as a cluster in my book at all. Take two independent databases, and create a bunch of views on the tables in one that include a union on the equivalent table in the other. Re-write your DML accordingly, too, so if you want to update row 1000 you do it in database A, and if you want to update row 30000 you know to do it in database B. Have a node failure, and you've lost half your rows.
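
To make the 'federation' idea concrete, here is a small Python sketch of it. It is not Microsoft's implementation, just the shape of the thing: each node owns a disjoint range of rows, DML has to be routed to the owning node, and a union view glues the pieces together. The class name and the split at row 20000 are assumptions of mine; only the row 1000 / row 30000 examples come from the paragraph above.

```python
class FederatedEmp:
    """Toy 'shared nothing' federation: each node owns a disjoint key range."""

    def __init__(self, ranges):
        # ranges maps a node name to the row ids it owns (hypothetical split).
        self.ranges = dict(ranges)
        # Each live node stores only its own rows: node -> {row_id: row}.
        self.nodes = {name: {} for name in self.ranges}

    def node_for(self, row_id):
        # The application has to know which database owns which row.
        for name, owned in self.ranges.items():
            if row_id in owned:
                return name
        raise KeyError(f"no node owns row {row_id}")

    def update(self, row_id, row):
        name = self.node_for(row_id)
        if name not in self.nodes:
            raise RuntimeError(f"node {name} is down; row {row_id} unavailable")
        self.nodes[name][row_id] = row

    def union_view(self):
        # The 'view' is just the union of whatever the live nodes hold.
        merged = {}
        for rows in self.nodes.values():
            merged.update(rows)
        return merged

    def fail(self, name):
        # A node failure takes that node's rows with it.
        del self.nodes[name]


emp = FederatedEmp({"A": range(1, 20001), "B": range(20001, 40001)})
emp.update(1000, "Bob")     # routed to database A
emp.update(30000, "Sue")    # routed to database B
emp.fail("B")
print(emp.union_view())     # Sue's row is no longer visible
```

Nothing in this model lets node A serve node B's rows, which is exactly why losing a node loses that slice of the data.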

At the machine level, Windows clustering is indeed shared nothing, in that (as I said last post) whilst one machine can write to the shared disk, the other can't see it at all. The idea is that all activity on the stuff on the "shared" disk is done by one node only. The other is just sitting there, waiting to take over. And the 'cleverness' bit of Microsoft clustering is that there is a software layer provided that spots the node failure, and causes the other machine to take ownership of the shared disk automatically, so that non-availability of the data or documents stored on that shared disk is very slight. It's a valid approach, in other words, but it's not co-equal, joint custody of the data/disk, and I don't think that should be called true clustering.

Microsoft, by the way, point out that this doesn't mean the second node is just sitting there doing nothing. I could partition the shared disk in such a way that Node A owns half of it, and Node B owns the other half. Node A can't see anything happening on half 2, and Node B can't see anything happening on half 1. You then say 'half 1 is where we stick SQL Server' and 'half 2 is where we stick Exchange Server'. Therefore, both nodes are active, but working on totally separate things, with the other node ready and able to take up that role if a node failure occurs.
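
The ownership-and-takeover arrangement can be sketched in a few lines of Python. Again, this is only a toy under my own assumptions (the class name, the resource labels and the 'first surviving node takes over' rule are invented for illustration), not a description of how the Microsoft cluster software actually works.

```python
class FailoverCluster:
    """Toy model: every disk partition is owned by exactly one node at a time."""

    def __init__(self, ownership):
        self.owner = dict(ownership)          # resource -> owning node
        self.alive = set(self.owner.values())

    def can_access(self, node, resource):
        # Only the current owner can see a resource; the other node cannot.
        return node in self.alive and self.owner[resource] == node

    def fail(self, node):
        self.alive.discard(node)
        if not self.alive:
            raise RuntimeError("no surviving node to take over")
        # Hypothetical rule: the first surviving node takes ownership.
        survivor = sorted(self.alive)[0]
        for resource, owner in self.owner.items():
            if owner == node:
                self.owner[resource] = survivor


cluster = FailoverCluster({
    "half 1 (SQL Server)": "A",
    "half 2 (Exchange Server)": "B",
})
cluster.fail("A")
print(cluster.owner)    # both halves are now owned by node B
```

Both nodes are busy before the failure, but at no point do two nodes have access to the same half, which is the difference from a RAC.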

> If the nodes in the cluster have
> no shared disks, surely each node must maintain a local copy of all of the
> database files. I suppose if these files are synchronised by the clustering
> system then, from a user perspective, you really only see one set of files
> so it looks like a shared disk cluster but without the single point of
> failure.

As I say, in a shared nothing environment, you have to "partition" your database, and arrange for something as simple as a union view to glue the bits back together. That's not a cluster, but a federation. And you could do exactly the same in Oracle with replication, database links and so on. A lot cheaper than RAC, that's for sure.

>
> > I have heard rumours of a RAC in Beijing where the nodes are separated by
> > 18kms. Quite why anyone would do that, I have no idea. And it could just be
> > rumour.
> >
>
> I heard a similar story about American Express splitting an OPS cluster
> between Brooklyn and Manhattan. This must be to protect against
> catastrophe. Perhaps Beijing is prone to natural disasters or prolonged
> power cuts.

Agreed, but my point was that a RAC depends on high speed and low latency links, and that usually means the kind of thing that works well within the confines of a server room. As soon as you go 18 km apart, you are either paying an absolute fortune for the sort of hardware that can be high speed and low latency across that sort of distance, or you would have been better off considering something which provides protection from disasters at a much lower cost... namely, replication, standby database, etc. They must have money to burn, was my real thought.

>
> > > Another problem is that some clusters consist of a number of nodes connected
> > > to a communal disk array or hub.
> >
> > Er, all clusters do that. That's one of the definitions of a cluster ;-)
>
> Are you sure? I don't know enough about clustering hardware to speak with
> any authority but I thought some clusters used the 'shared nothing' approach
> that Oracle refers to in its documentation. This would get around the single
> point of failure problem caused by shared disk arrays or a hub as well as
> the proximity vulnerability caused by having to locate all your nodes close
> together. I suppose there must be disadvantages with the shared nothing
> approach though (interconnect cost, latency etc) as it doesn't seem to be
> very popular.

See above. It's a valid approach, sure enough. But it's separate machines working on separate things, with the other machines ready to take over the job in the event of a failure. It's not all machines working on the same problem at the same time. I personally don't think that's true clustering. More like 'co-operating'!

>
> > Which makes the comment I had from one student all the more worrying: "we
> > don't need to do backups because we've got RAC". Yikes!!
>
> Sounds like you need to offer the student's boss some consultancy (for a
> handsome fee of course).
>
> > Real Application Clusters Guard (which is what I assume the Original Poster
> > meant by "Guard II") is a sort-of Fail Safe for non-Windows products (or
> > even for Windows products). It's a 2-node solution that is genuinely a RAC.
> > An Instance runs on both nodes, accessing the one database, but Users can
> > only connect to one of the instances (the "primary" instance). When the node
> > that instance is running on fails, various monitors that are installed as
> > part of Clusters Guard detect the failure, and promote the secondary node to
> > being the new primary. It was already running an instance (this is a genuine
> > RAC solution, after all, unlike Fail Safe), and so failover should be
> > quicker than for Fail Safe or other host-based solutions. Basically, you
> > install Clusters Guard when you want the redundancy features of RAC, but
> > don't need or want the scale-up and speed-up that RAC can often supply, and
> > hence one active node at a time is suitable for your use.
>
> Presumably it also has the cost advantages of the Fail Safe product.

No, because Clusters Guard is a true RAC product (there's more than one instance running at a time). Therefore the RAC licence is needed in all its dollar majesty. And, though I didn't mention this before, although the 'spare node' is unavailable for use by users (it's a tns thing that stops it accepting connections from them over the network) there's nothing to stop the DBA walking into the server room and logging onto the spare instance directly. Therefore you have a true 2-instance setup, and must pay accordingly. But I wouldn't know how much, since I try hard not to notice the price of things.

>
> I'm getting too old to keep up with all the new features in Oracle,
> especially now Oracle 10 is on the horizon. I feel a change of career
> coming on. Do you think Larry might be looking for a highly paid lifestyle
> consultant?

You really, really don't want to go there! Chin up: in 10G, everything is self-automated, self-tuned, and the DBA can get on with more important work, such as web page development.

;-)
Regards
HJR
>
> Many thanks for your very informative response.
>
> Richard
>
>
Received on Mon Sep 08 2003 - 15:25:50 CDT