Oracle FAQ Your Portal to the Oracle Knowledge Grid


Re: Quick Tale: Lost production database because of keyboard and Veritas Cluster Server

From: koert54 <koert54_at_nospam.com>
Date: Wed, 27 Feb 2002 21:31:46 GMT
Message-ID: <6fcf8.207524$rt4.21285@afrodite.telenet-ops.be>


> We looked at options with an underground tool, "Data Unloader" (DUL),
> which we were told would cost us $10,000 up front and would take a
> consultant 1 or 2 days to get out here. Our tables and columns would
> be unnamed (column001, column002 etc).
>

If you still have a copy of the SYSTEM tablespace (old, new or corrupted, it doesn't matter), they should be able to name all tables and columns.

Sorry to hear what happened!

Just out of interest:
- What were the initial ORA errors, and what actions did the DBA(s) initially take?
- If you're running a cluster then I'm thinking 24/24, 7/7, so you should have hot backups? Where do they fit in this whole story, and why didn't the DBA restore from a hot backup and recover?

"milkfilk" <milkfilk_at_yahoo.com> wrote in message news:90d82e70.0202271308.3056cee3_at_posting.google.com...
> I'm posting to multiple groups so that I might save someone's neck.
>
> Me:
> I'm not a DBA and I'm no Veritas expert.
>
>
> Background:
> We have two 420Rs that are clustered with Veritas Cluster Server
> (VCS) and are running Veritas Oracle agents for High Availability
> (HA). The servers are hooked up to two A5200 drive arrays that are
> mirrored. The disks on each array are striped (RAID 0+1).
>
> The cluster is configured for failover so if one server blows up, the
> other server mounts the disk and starts the oracle processes and
> starts up the db instances.
>
>
> What happened:
> I pulled the keyboard plug on our Sun server while rewiring our KVM
> switch. Yes, I know.
>
> What this does (according to usenet posts) unfortunately, is send a
> Stop-A signal (in the form of an electrical short, I suppose). This
> shouldn't be a problem, because the server is simply 'paused' and in
> most instances you can simply type go and there shouldn't be any large
> consequences. Of course, you can't expect to hit Stop-A all the time
> and get away with it.
>
> Our cluster is configured for failover, like I said, and the other
> server mounted the arrays and started up the oracle process. The
> instance hiccupped but it was running.
>
> Come Monday, my DBA told me that the DB wasn't coming back up and we
> spent 17 hours finding out that the SCN was off and our SYSTEM
> tablespace was corrupted.
>
> We have an old backup but it's not great and we were in the process of
> getting our backup procedure tested / working.
>
> We looked at options with an underground tool, "Data Unloader" (DUL),
> which we were told would cost us $10,000 up front and would take a
> consultant 1 or 2 days to get out here. Our tables and columns would
> be unnamed (column001, column002 etc).
>
> Let me tell you that I'm shocked that an enterprise system can do
> this. A keyboard unplug started all of this. I'm looking at
> disabling the keyboard and this is my job as a UNIX SysAdmin to know
> this stuff, but the Veritas Cluster should have worked!
>
> If one server simply blows up, the other server should pick up the
> database and certainly not corrupt this "SCN" ...
>
> I know I'm going to get flamed because I broke the golden rule (always
> have a current backup), but it's more of a factor of my job &
> available time [ref: chickens with no heads]. And also, this
> avalanche was all started by myself screwing with a production system.
>
>
> But amazingly, I still have a job and I have a few comments /
> questions:
>
> 1. We have our backups square now and we are looking at Oracle
> Parallel Server. Anyone using Oracle Parallel Server with software
> clustering? I was reading about an extension to Veritas that allows
> you to mount a volume more than once (a limitation of Unix-ish systems
> - so I believe)
>
> 2. We are going to use archive log mode and cold backups.
> 3. Anyone having problems with Veritas Cluster Server?
> 4. Anyone have comments?
Received on Wed Feb 27 2002 - 15:31:46 CST
