Oracle FAQ Your Portal to the Oracle Knowledge Grid


Re: Quick Tale: Lost production database because of keyboard and Veritas Cluster Server

From: Viper <viper_at_2ghz.net>
Date: Wed, 27 Feb 2002 23:38:44 -0500
Message-ID: <u7rcbilsmt3pbf@corp.supernews.com>


Well, I guess the first question is: what type of database are you using? If it is Oracle, you should be able to recover that tablespace if it is not too badly damaged. If the DBF is screwed, did you have archive logging enabled?

"milkfilk" <milkfilk_at_yahoo.com> wrote in message news:90d82e70.0202271308.3056cee3_at_posting.google.com...
> I'm posting to multiple groups so that I might save someone's neck.
>
> Me:
> I'm not a DBA and I'm no Veritas expert.
>
>
> Background:
> We have two 420r's that are clustered with Veritas Cluster Server
> (VCS) and are running Veritas Oracle agents for High Availability
> (HA). The servers are hooked up to two A5200 drive arrays that are
> mirrored. The disks on each array are striped (RAID 0+1).
>
> The cluster is configured for failover so if one server blows up, the
> other server mounts the disk and starts the oracle processes and
> starts up the db instances.
>
>
> What happened:
> I pulled the keyboard plug on our Sun server while rewiring our KVM
> switch. Yes, I know.
>
> What this does (according to Usenet posts), unfortunately, is send a
> Stop-A signal (in the form of an electrical short, I suppose). This
> shouldn't be a problem, because the server is simply 'paused' and in
> most instances you can simply type go and there shouldn't be any large
> consequences. Of course, you can't expect to hit Stop-A all the time
> and get away with it.
>
> Our cluster is configured for failover, like I said, and the other
> server mounted the arrays and started up the oracle process. The
> instance hiccupped but it was running.
>
> Come Monday, my DBA told me that the DB wasn't coming back up and we
> spent 17 hours finding out that the SCN was off and our system
> tablespace was corrupted.
>
> We have an old backup but it's not great and we were in the process of
> getting our backup procedure tested / working.
>
> We looked at options with an underground tool, "Data Unloader" (DUL),
> which we were told would cost us $10,000 up front and would take a
> consultant 1 or 2 days to get out here. Our tables and columns would
> be unnamed (column001, column002 etc).
>
> Let me tell you that I'm shocked that an enterprise system can do
> this. A keyboard unplug started all of this. I'm looking at
> disabling the keyboard, and it is my job as a UNIX SysAdmin to know
> this stuff, but the Veritas Cluster should have worked!
>
> If one server simply blows up, the other server should pick up the
> database and certainly not corrupt this "SCN" ...
>
> I know I'm going to get flamed because I broke the golden rule (always
> have a current backup), but it's more of a factor of my job &
> available time [ref: chickens with no heads]. And also, this
> avalanche was all started by myself screwing with a production system.
>
>
> But amazingly, I still have a job and I have a few comments /
> questions:
>
> 1. We have our backups square now and we are looking at Oracle
> Parallel Server. Anyone using Oracle Parallel Server with software
> clustering? I was reading about an extension to Veritas that allows
> you to mount a volume more than once (a limitation of Unix-ish systems
> - so I believe)
>
> 2. We are going to use archive log mode and cold backups.
> 3. Anyone having problems with Veritas Cluster Server?
> 4. Anyone have comments?
Received on Wed Feb 27 2002 - 22:38:44 CST

