RE: SAC NORAD .... how to break it?

From: Steve Orr <sorr_at_arzoo.com>
Date: Mon, 16 Oct 2000 13:54:07 -0700
Message-Id: <10651.119382@fatcity.com>

Apart from the obvious, it's going to vary, so you'll need to create your own checklist of things to test for your particular environment. You'll also want to test that you have good monitoring in place. I once thought I was protected with mirrored redo logs, only to find out one drive had failed a month before and the sysadmin wasn't monitoring the mirror. You should probably start by taking your sysadmin out to lunch.

Steve

-----Original Message-----
From: root_at_fatcity.com [mailto:root_at_fatcity.com] On Behalf Of Seley, Linda
Sent: Monday, October 16, 2000 1:50 PM
To: Multiple recipients of list ORACLE-L
Subject: RE: SAC NORAD .... how to break it?

Steve/Anyone -

We're going to be doing HA in the near future. I'm sure I can come up with obvious tests, like pull a disk, turn off a machine.... But do you have a 'break it' checklist? (sorry if this has been asked before) Especially how it affects Oracle performance. I want this thing to run smoothly.

Unless of course we have a 'real' test of Norad in which case I'm close enough that I don't care!

Linda

-----Original Message-----
Sent: Monday, October 16, 2000 11:36 AM
To: Multiple recipients of list ORACLE-L systems

Actually, NORAD was designed to survive a direct hit of the kind that was possible at the time it was built. However, with more accurate delivery systems now, it is conceivable that a missile could navigate part way through the entrance tunnel and make the facility inoperable. Then there are multiple direct hits...

But of course, none of this has been tested and, sadly, this is often the case with HA 24X7 systems. You need sufficient pre-production quiet time to test your HA solution. I call it the "pseudo sledge hammer" testing period. Have you ever taken a drive out of your RAID and replaced it to see how long resilvering takes and what happens to I/O performance? How much time does it take to test the entire HA implementation, and how much time will you be given? The trouble is that you get all this expensive equipment in the data center and install Oracle, then damagement is anxious to get the entire application up and running ASAP and asks you to take shortcuts or just trust that everything will work. But really you haven't finished the job until you've reasonably tested everything end to end.
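One way to put numbers on the "pull a drive and watch I/O" test is a small latency probe run before, during, and after the resilver. What follows is only a rough sketch, not a tested tool: the function names, the 8 KB block size, the sample count, and the scratch-file location are all assumptions made for illustration, and a temp directory stands in for a file system on the array under test.

```python
import os
import statistics
import tempfile
import time


def sample_write_latency(path, samples=20, block_size=8192):
    """Return per-write latencies (seconds) for synced writes to `path`."""
    block = b"\0" * block_size  # block size is an arbitrary choice here
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, block)
            os.fsync(fd)  # ask the OS to flush to the device, not just cache
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies


def summarize(latencies):
    """Reduce a list of latencies to a few figures worth comparing."""
    ordered = sorted(latencies)
    return {
        "samples": len(ordered),
        "median_ms": statistics.median(ordered) * 1000,
        "max_ms": ordered[-1] * 1000,
    }


if __name__ == "__main__":
    # Hypothetical usage: in a real test the probe file would live on the
    # mirrored or RAID volume being exercised, not in a temp directory.
    with tempfile.TemporaryDirectory() as scratch:
        probe = os.path.join(scratch, "latency_probe.dat")
        print(summarize(sample_write_latency(probe)))
```

Running it once on a healthy array, once while the replaced drive is resilvering, and once afterwards gives three summaries to compare. It measures only small synchronous writes, so it is a crude stand-in for what a real database workload would see.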

IMHO,
Steve Orr

-----Original Message-----
Sent: Monday, October 16, 2000 7:06 AM
To: Multiple recipients of list ORACLE-L systems

That's why they say that SAC/NORAD (Strategic Air Command HQ, North American Aerospace Defense Command), buried deep in a mountain in Colorado, is a "single point of failure" for US national defense:

All it takes is a direct hit by one nuclear bomb to bring down the whole facility! :-)

In the words of the Marathon Man's tormentor:

"Is it safe?"

-----Original Message-----
Sent: Friday, October 13, 2000 7:45 PM
To: Multiple recipients of list ORACLE-L

Sorry Ross. Yes I am familiar with enterprise class storage systems.

It still isn't HA.

It only takes one bumbling SA ( or DBA ) to bring the system down, one neanderthalic techie in the computer room to push the 'OFF' switch.

Simultaneous failure of both controllers for an array, or of enough disks to bring the array down, is not unheard of.

Jared
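
The quoted message below argues that the question can only really be settled by calculating MTBFs and MTTRs. As a rough illustration of that arithmetic, here is a minimal sketch using the standard steady-state formula, availability = MTBF / (MTBF + MTTR). It assumes component failures are independent, which is precisely what operator error or a shared power switch violates. The function names and all the figures in the example are hypothetical, not measurements from any real system.

```python
import math


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of one component: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def in_series(*parts):
    """All parts must be up (each one is a single point of failure)."""
    return math.prod(parts)


def in_parallel(*parts):
    """Redundant parts: the group is down only if every part is down.

    Assumes independent failures, which real systems often violate.
    """
    return 1.0 - math.prod(1.0 - a for a in parts)


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the shape of the calculation.
    node = availability(mtbf_hours=5_000, mttr_hours=4)
    array = availability(mtbf_hours=50_000, mttr_hours=8)

    # Two redundant nodes in front of one shared disk array.
    shared_storage = in_series(in_parallel(node, node), array)
    print(f"one node:                  {node:.6f}")
    print(f"two nodes + shared array:  {shared_storage:.6f}")
```

With a shared array in series, the overall figure can never exceed the array's own availability, however many nodes are added. That is the "single point of failure" argument expressed in numbers.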

On Fri, 13 Oct 2000, Mohan, Ross wrote:

> I have to say this "disk is a single point of failure"
> is jangling to the cognitive logic subsystem.
>
> Why?
>
> Well, the disk farms i have seen have redundant controllers,
> with redundant channels, TRIPLE power supplies, at least a
> single mirror with dual porting. There's your "single" disk
> point of failure for you.
>
> Now, try this: Take your two "redundant" nodes....put them
> in a really really big rack and then inside ONE big box. <G>
>
> Are the two nodes ( which now have at least redundant CPUs,
> power supplies, etc. ) a "single point of failure"?
>
> Come on, guys, if you've worked with this stuff a bunch you know:
>
> (a) properly configured diskfarms have a great MTBF, better
> than the other hardware, and
> (b) to REALLY answer Mary's class of questions, you need to
> calculate MTBFs and MTTRs.
>
> The rest is armchair clustering!
>
> hope this pertains,
>
> Ross Mohan
>
> p.s. HA is the latest marketspeak for "failover" or "redundant" or
> whatever... please try to browse a copy of "In Search of Clusters"
> by Gregory Pfister from IBM. It's a cult classic, a helluva fun read,
> and one of the best thought-out technical books i have ever seen,
> period.
>
>
> -----Original Message-----
> Sent: Thursday, October 12, 2000 2:00 PM
> To: Multiple recipients of list ORACLE-L
>
>
>
> Mary,
>
> OPS is not an HA solution. While you may still have
> an instance running if a node goes down, the storage
> medium is still a single point of failure.
>
> Jared
>
> On Thu, 12 Oct 2000, Ruiz, Mary A (CAP, CDI) wrote:
>
> > I need a little advice. We have a fairly new (< 1 year) 8.1.5
> > instance to support my company's internet business. We recently
> > changed our network solutions provider and now my management wants
> > to achieve a higher level of redundancy than it currently does with
> > mirrored disks. The solution being proposed by my Sysadmin is an
> > Oracle Parallel Server solution. Some background is in order here -
> > we have always shut our databases down at night for backups. I am
> > not highly skilled in backup and recovery although I tried some of
> > the hot backup techniques from this list and was able to recover
> > successfully to another server. I noticed that the course offered by
> > Oracle in OPS has backup and recovery as well as performance tuning
> > as pre-requisites, which indicates to me that OPS could be extremely
> > challenging. Also, I have read mainly unfavorable comments about OPS
> > from this list, but most of those comments were based on the Oracle
> > 7 implementations (High administrative costs, difficult to
> > implement, etc.).
> >
> > Have things improved with Oracle 8i ? Is OPS worth pursuing? Or
> > should I convince my management that extra $$ spent in, say, a hot
> > standby database is well worth it? Is there any other solution that
> > would not involve a second set of disks, rather a second database on
> > the same set of disks ??
> >
> > Thanks in advance,
> > Mary Ruiz / Atlanta
> >
> > --
> > Please see the official ORACLE-L FAQ: http://www.orafaq.com

--
Author: Steve Orr
  INET: sorr_at_arzoo.com

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
--
Author: Seley, Linda
  INET: LSeley_at_IQNavigator.com

Received on Mon Oct 16 2000 - 15:54:07 CDT