RE: oracle clustering

From: Chuck Hamilton <chuck_hamilton_at_yahoo.com>
Date: Thu, 30 Nov 2000 11:16:40 -0800 (PST)
Message-Id: <10696.123345@fatcity.com>


9i is still vaporware, though. I wouldn't count on it containing phase 2 cache fusion until I actually see it. Phase 2 cache fusion was originally promised in 8i but never made it. What 8i got was phase 1. In phase 1, only the RBS (rollback segment) blocks needed for read consistency are pinged over the interconnect. The actual data block is still pinged via the disk, so there are at least two physical I/Os for every block pinged. Phase 2 will ship the fully reconstructed read-consistent block over the interconnect, which should substantially improve performance.

10-15 minutes seems like an awfully long time for your non-OPS instance to fail over. Depending on the size of my redo logs, I usually see it happen on SGI or IBM clustering software in about 3 minutes. The smaller the redo logs, the faster it occurs. But smaller redo logs also mean more frequent checkpointing, which is something I normally try to avoid. Given a choice between fast failover with a checkpoint every couple of minutes and slower failover with a checkpoint every hour, I'll take the latter any day. After all, how often do you fail over? Compare that with the performance degradation of frequent checkpoints.

"Adams, Matthew (GEA, 088130)" <MATT.ADAMS_at_APPL.GE.COM> wrote:

What Chuck says is true to a point. However, the failover and recovery times for an OPS cluster and a non-OPS cluster can be significantly different.

We use HP ServiceGuard clustering and OPS.

A failover using OPS takes about 30 seconds at most (for us) and requires NO manual intervention to accomplish the recovery.

A failover using ServiceGuard clustering takes about 10-15 minutes and may (under rare circumstances) require manual intervention to accomplish the recovery.

Also, pinging became less of a problem with dynamic PCM locking (8.0) and even less of an issue with Cache Fusion (8.1.6). Supposedly, it goes away entirely with 9i.
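To put the failover times quoted in this thread in perspective, here is a minimal Python sketch of the arithmetic. The per-failover durations are the ones mentioned above; the figure of 4 failovers per year is purely an assumption for illustration, and the function name is made up.

```python
# Rough annual downtime caused by failovers alone.
# Durations come from the thread; the failover frequency is an assumption.
FAILOVER_SECONDS = {
    "OPS on HP (Matt)": 30,
    "single instance on SGI/IBM cluster (Chuck)": 3 * 60,
    "single instance on ServiceGuard, worst case (Matt)": 15 * 60,
}


def annual_downtime_minutes(seconds_per_failover, failovers_per_year):
    """Return total minutes per year spent failing over."""
    return seconds_per_failover * failovers_per_year / 60


if __name__ == "__main__":
    failovers_per_year = 4  # hypothetical figure, not from the thread
    for name, seconds in FAILOVER_SECONDS.items():
        minutes = annual_downtime_minutes(seconds, failovers_per_year)
        print(f"{name}: {minutes:.1f} min/year")
```

The numbers only cover the failover window itself. They say nothing about the ongoing cost of frequent checkpoints or of pinging, which is the other side of the trade-off being argued here.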

R. Matt Adams - GE Appliances - matt.adams_at_appl.ge.com
It will make sense when you stop thinking logically and start thinking Oracle-ly - Jim Droppa

-----Original Message-----

Sent: Wednesday, November 29, 2000 12:21 PM
To: Multiple recipients of list ORACLE-L

Then the answer is no. Oracle clustering is not the same as OPS. OPS is not necessary for high availability either, though it does make the database more highly available than single-instance Oracle on a cluster.

I once thought OPS gave you 100% availability because multiple instances were running on separate machines sharing the same physical database. If one machine or instance failed, the others would continue to run uninterrupted. The 2nd (and 3rd, 4th, ... Nth) instances do continue to run, but there is an interruption in service when any one instance fails. There is a period of time known as a "brown out" where *no* work can be done on the database. During that time the DLM (distributed lock manager), which controls concurrency between the instances, must reconfigure itself to run across only the surviving instances. Then an instance recovery must be performed on behalf of the failed instance. Only after this is done can work continue on the database. When the failed instance comes back online, the DLM must be reconfigured again to include the new instance. OPS is also only available on a few platforms (IBM, HP, SUN, and Compaq).

Single-instance Oracle clustering, OTOH, is implemented entirely through the platform's cluster management software. The instance runs on only one node at a time. Should the instance or node fail, the cluster manager software shuts the instance down, unmounts the disks, remounts them on another node, and starts the Oracle instance on that node.
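The sequence just described can be sketched as an ordered plan. This is only an illustration in Python: the node names and the `plan_failover` function are hypothetical, and no real cluster manager API is being called.

```python
# Ordered steps a cluster manager takes for single-instance failover,
# as described in the paragraph above. All names here are hypothetical.
FAILOVER_STEPS = [
    ("failed", "shut down the instance"),
    ("failed", "unmount the disks"),
    ("standby", "remount the disks"),
    ("standby", "start the Oracle instance"),
]


def plan_failover(failed_node, standby_node):
    """Return the failover steps as 'node: action' strings, in order."""
    node_for_role = {"failed": failed_node, "standby": standby_node}
    return [f"{node_for_role[role]}: {action}" for role, action in FAILOVER_STEPS]


if __name__ == "__main__":
    for step in plan_failover("nodeA", "nodeB"):
        print(step)
```

Every step is serial, which is why the total failover time is the sum of the disk moves, the instance startup, and the instance recovery that startup triggers.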

Single-instance Oracle running on a cluster manager is still highly available and much easier to administer than OPS. For example, your application must be redesigned around OPS. In order to reduce pinging of blocks between nodes (which requires physical disk I/O much of the time), you need to segregate your apps and users among the nodes in such a way as to eliminate, or at least reduce, the chance that multiple nodes will need access to the same database blocks. This is not always an easy task. If you don't do this, you could see your performance go down the toilet even though you're adding more nodes and CPUs. Trying to identify contention between instances is not an easy task either.
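As a toy model of what "segregating apps among the nodes" is trying to achieve, the following Python sketch flags blocks touched by more than one node. The access list, block labels, and the `shared_blocks` function are all invented for illustration; this is not how Oracle itself reports inter-instance contention.

```python
from collections import defaultdict


def shared_blocks(accesses):
    """Given (node, block_id) pairs, return {block_id: set of nodes}
    for every block that more than one node touched."""
    nodes_by_block = defaultdict(set)
    for node, block_id in accesses:
        nodes_by_block[block_id].add(node)
    return {
        block_id: nodes
        for block_id, nodes in nodes_by_block.items()
        if len(nodes) > 1
    }


if __name__ == "__main__":
    # Hypothetical workload: both nodes touch the same orders block.
    accesses = [
        ("node1", "orders:17"),
        ("node2", "orders:17"),
        ("node1", "hr:3"),
        ("node2", "gl:42"),
    ]
    for block_id, nodes in shared_blocks(accesses).items():
        print(block_id, sorted(nodes))
```

A well-partitioned workload would return an empty result here. Every block that does show up is a candidate for pinging between instances.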

Having been to the OPS class and actually worked with it, I'm not convinced it buys me enough availability to warrant its use instead of single-instance Oracle on a cluster. Either way, when an instance fails you have to perform instance recovery. The only thing OPS saves you is the time it takes to move the disk mounts and start an instance. And the cost for that small amount of time is an application redesign. Typically only data warehouses can be dropped right onto OPS with little or no change, because of their read-only nature.

HTH

Chuck Hamilton

Herman wrote:

What I mean is clustering to obtain high availability. I heard this can be achieved through OPS. Can anyone comment on this?

Thanks & regards,
Herman

----- Original Message -----

To: Multiple recipients of list ORACLE-L

Sent: Tuesday, November 28, 2000 1:25 AM

I'm not sure what you're referring to by "oracle clustering". Is this a product you've heard of? I'm not familiar with it. Inside the database, clustering is the term for nesting multiple tables within the same segment for faster joins. Could that be what you're thinking about? Outside the database, clustering is a means of attaining high availability and is required for OPS. Single instance Oracle can also run on a cluster apart from OPS but is not quite as highly available.

Herman wrote:

Hello all,

Can somebody please help explain Oracle clustering? Is Oracle clustering the same as Oracle Parallel Server? Is it considered software clustering?

What are the differences, advantages, and disadvantages between software clustering and hardware clustering anyway? Can we combine the two when we implement OPS?

Thanks and regards,
Herman

--

Please see the official ORACLE-L FAQ: http://www.orafaq.com
--

Author: Herman
INET: Sherman_at_bcsis.com

Fat City Network Services -- (858) 538-5051 FAX: (858) 538-5051 San Diego, California -- Public Internet access / Mailing Lists

To REMOVE yourself from this mailing list, send an E-Mail message to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).


Author: Adams, Matthew (GEA, 088130)
INET: MATT.ADAMS_at_APPL.GE.COM
Received on Thu Nov 30 2000 - 13:16:40 CST