Re: [Q] How does Oracle 9i-RAC handle lock to support HA?

From: Pete Sharman <peter.sharman_at_oracle.com>
Date: Fri, 11 Jan 2002 08:56:18 -0800
Message-ID: <7NE%7.3$CQ1.200@inet-nntp1.oracle.com>
Comments inline.
--
HTH.  Additions and corrections welcome.

Pete
Author of "Oracle8i: Architecture and Administration Exam Cram"

"Controlling developers is like herding cats."
Kevin Loney, Oracle DBA Handbook

"Oh no, it's not.  It's much harder than that!"
Bruce Pihlamae, long-term Oracle DBA

"Hun Soon Lee" <hunsoon_at_etri.re.kr> wrote in message
news:a1mg5h$tma$1_at_news.kreonet.re.kr...


> First of all, thanks for your answer.
> I'm sorry for my poor English.
> What I really want to know is how to recover the lock information managed by
> the failed node - such as who has the most recent version of a block, or who
> holds the exclusive/shared mode lock on a table/row among several instances -
> when instance failure is detected. I.e., how are the contents of the GCS and
> GES enqueues recovered?

It's done automatically for you.  As soon as we know that an instance is no
longer accessible, cache cleanup is started by the surviving nodes.  One of
the major improvements in 9i RAC compared to OPS is in this area.  In OPS,
all nodes release their locks and then take them out again, leading to a
brownout time while DLM recovery takes place.  In RAC only the node that has
died needs to be cleaned up.
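
To make the difference in recovery scope concrete, here is a toy Python sketch. It is not Oracle code: the lock table layout and both function names are made up for illustration, and it only counts how many lock entries each style of recovery has to disturb.

```python
# Toy model only: a lock table maps (resource, instance) -> lock mode.
# Names and structure are illustrative assumptions, not Oracle internals.

def ops_style_recovery(locks, failed):
    """OPS-style: every lock is released, then survivors re-acquire theirs."""
    touched = len(locks)  # all entries are disturbed during DLM recovery
    rebuilt = {key: mode for key, mode in locks.items() if key[1] != failed}
    return rebuilt, touched


def rac_style_recovery(locks, failed):
    """RAC-style: only entries belonging to the failed instance are cleaned up."""
    touched = sum(1 for (_, inst) in locks if inst == failed)
    cleaned = {key: mode for key, mode in locks.items() if key[1] != failed}
    return cleaned, touched


locks = {
    ("block_10", 1): "X",
    ("block_11", 2): "S",
    ("block_12", 2): "X",
    ("block_13", 3): "S",
}

ops_result, ops_touched = ops_style_recovery(locks, failed=3)
rac_result, rac_touched = rac_style_recovery(locks, failed=3)
print(ops_touched, rac_touched)
```

Both functions end up with the same surviving lock table. The only difference in this sketch is how many entries are disturbed along the way, which is the brownout described above.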


> > Now, it's not really that simple, because the GCS CAN'T be on one node
> > only - single point of failure which is what we want to get away from for
> > HA. The GCS is actually duplicated on each node that is running an Oracle
> > instance.

>
> Does "The GCS is actually duplicated on each node that is running an Oracle
> instance" mean the following?
>
> - the GCS and GES enqueue information about resources (data) mastered by one
> instance is duplicated on more than two nodes.

Yes, if there are more than two nodes running the RAC environment.


>
> - when instance failure is detected, the GCS and GES enqueues are
> reconstructed using the duplicated information

Yes.
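
As a rough picture of what "duplicated" buys you, here is a toy Python sketch. Everything in it (the `Node` and `ToyCluster` classes, the method names, the block names) is hypothetical and follows only the simplified description in this thread. It ignores recovery of the block contents themselves and looks only at the lock bookkeeping.

```python
# Toy model: every node keeps its own copy of the lock directory.
# All names here are hypothetical and only mirror the simplified description
# given in this thread; this is not how Oracle implements the GCS.

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.directory = {}   # block -> (holder node id, status)


class ToyCluster:
    def __init__(self, node_ids):
        self.nodes = {n: Node(n) for n in node_ids}

    def record(self, block, holder, status):
        # Every live node's copy gets the same update.
        for node in self.nodes.values():
            node.directory[block] = (holder, status)

    def lookup(self, asking_node, block):
        # A node answers from its own copy; no single central service is needed.
        return self.nodes[asking_node].directory.get(block)

    def fail(self, dead):
        # Survivors still have the full picture; they only drop the dead
        # node's entries.
        del self.nodes[dead]
        for node in self.nodes.values():
            node.directory = {
                blk: entry for blk, entry in node.directory.items()
                if entry[0] != dead
            }


cluster = ToyCluster([1, 2, 3])
cluster.record("blk_a", holder=1, status="XL0")
cluster.record("blk_b", holder=3, status="XL0")
cluster.fail(3)
print(cluster.lookup(1, "blk_a"), cluster.lookup(2, "blk_b"))
```

After node 3 fails, nodes 1 and 2 can still answer for `blk_a` from their own copies, and the entry held by the dead node is gone.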


>
> Thanks for your reading.
>
> Hun Soon LEE
>
> "Pete Sharman" <peter.sharman_at_oracle.com> wrote in message
> news:NHi%7.3$MX6.24_at_inet-nntp1.oracle.com...
> > I'm not 100% clear on what you're asking for here, but here's a summary I
> > sent to someone recently:
> >
> > In the RAC environment, the DLM you refer to is called the global cache
> > service. It keeps track of the instances that are locking blocks. Let's take
> > a simple example of a 3 node cluster with instances 1 and 2 (located on
> > nodes 1 and 2) interested in a block, and the global cache service located
> > on node 3. Instance 1 wants to take out an exclusive lock on a block which
> > has an SCN of 1000. First thing it does is ask the GCS who has that block.
> > If no-one is using it, the GCS tells instance 1 that, and instance 1 then
> > reads the block from disk and communicates the fact that it has taken out an
> > exclusive lock to the GCS. GCS records the fact that instance 1 has the
> > block in exclusive mode, local mode (i.e. in one cache only) with no past
> > images (more on this anon) - status is XL0 for instance 1 (X = exclusive, L
> > = local, 0 = no past image)

> >

> > Instance 1 now makes a change and commits it (SCN is now 1001). Again it
> > communicates this to the GCS, which updates the SCN information it's
> > tracking. Status is still XL0.

> >

> > Instance 2 now decides it wants to update another row or even the same row
> > in the same block (Instance 1 has committed so another update to the same
> > row is fine). Instance 2 asks the GCS who has the block. GCS tells instance
> > 2 that instance 1 has it, and tells instance 1 to send it to instance 2
> > ACROSS THE INTERCONNECT (i.e. no pinging to disk which is where the
> > performance hit came in OPS). Instance 1 either gets it from the data buffer
> > cache if it's still there or from its rollback (or undo) segments, sends the
> > block across the interconnect and then tells the GCS it has it. GCS changes
> > the status for instance 1 to NG1 (N = null - no lock, G = global because the
> > block is now in two caches, 1 = past image - this is no longer the most
> > recent copy of the block). Instance 2 receives the block, tells the GCS it
> > has it, so the GCS now changes the status for instance 2 to XG0 (exclusive
> > lock, global mode, 0 means this is the current image). Instance 2 can now
> > change the block and do whatever it wants.
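
The status changes in the walkthrough above (XL0, then NG1 and XG0) can be followed in a toy Python sketch. The `ToyGCS` class and its methods are hypothetical names invented for this example. It models only the bookkeeping from the single-GCS example, not the real protocol.

```python
# Toy model of the GCS bookkeeping in the walkthrough above.
# Class and method names are hypothetical; this is not how Oracle implements it.

class ToyGCS:
    def __init__(self):
        self.status = {}   # block -> {instance: status string such as "XL0"}
        self.scn = {}      # block -> latest SCN reported to the GCS

    def request_exclusive(self, block, instance, disk_scn=None):
        holders = self.status.setdefault(block, {})
        current = [i for i, s in holders.items() if s.startswith("X")]
        if not current:
            # Nobody has the block: the requester reads it from disk.
            holders[instance] = "XL0"
            self.scn[block] = disk_scn
            return "disk"
        owner = current[0]
        if owner == instance:
            return "local"
        # The owner ships the block across the interconnect and keeps a
        # past image; the requester now holds the current image.
        holders[owner] = "NG1"
        holders[instance] = "XG0"
        return f"interconnect from instance {owner}"

    def commit(self, block, instance, new_scn):
        # The holder reports the new SCN; its status does not change.
        assert self.status[block][instance].startswith("X")
        self.scn[block] = new_scn


gcs = ToyGCS()
src1 = gcs.request_exclusive("blk", 1, disk_scn=1000)
gcs.commit("blk", 1, 1001)
src2 = gcs.request_exclusive("blk", 2)
print(src1, src2, gcs.status["blk"], gcs.scn["blk"])
```

Instance 1 gets the block from disk and holds it as XL0. After instance 2's request the block travels over the interconnect, instance 1 drops to NG1, and instance 2 holds XG0, with the tracked SCN still at 1001.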

> >

> > Now, it's not really that simple, because the GCS CAN'T be on one node
> > only - single point of failure which is what we want to get away from for
> > HA. The GCS is actually duplicated on each node that is running an Oracle
> > instance. That way, each node can tell very quickly where the latest copy of
> > a block is, and can request it from the instance holding that latest copy.
> > Of course explaining it with the GCS on each node gets way complex with
> > arrows pointing every which way when you do it on a whiteboard, so that's
> > why I stick to the simple example when explaining it.


> > "Hun Soon Lee" <hunsoon_at_etri.re.kr> wrote in message

> > news:a1k267$hdu$1_at_news.kreonet.re.kr...

> > > Hello there.
> > > I want to know how Oracle9i-RAC handles locks to support HA.
> > > According to the "Oracle9i-RAC concepts",
> > > it uses a DLM (Distributed Lock Manager) to manage locks.
> > > I want to know the details of how Oracle 9i-RAC handles locks.
> > > Does it back up the lock manager to support HA?
> > > Does it use lock caching to increase performance?
> > > etc...
> > > Thanks for your reading.
> > > Hun Soon LEE
Received on Fri Jan 11 2002 - 10:56:18 CST