RE: Bank Databases

From: Michael Dinh <mdinh_at_XIFIN.Com>
Date: Mon, 25 Jun 2012 08:27:38 -0700
Message-ID: <D29F9902E534D5478F2E83FD6A44B30649BF83D04E_at_mail02.mba.xifin.com>



I have been having some issues with Oracle lately. That said, Oracle is most likely the symptom and not the cause.

Michael Dinh
Disparity Breaks Automation (DBA)

Confidence comes not from always being right but from not fearing to be wrong - Peter T Mcintyre
Great minds discuss ideas; average minds discuss events; small minds discuss people - Eleanor Roosevelt
When any rule or formula becomes a substitute for thought rather than an aid to thinking, it is dangerous and should be discarded - Thomas William Phelps

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Matthew Zito
Sent: Monday, June 25, 2012 4:53 AM
To: Øyvind Isene
Cc: howard.latham_at_gmail.com; niall.litchfield_at_gmail.com; oracle-l
Subject: Re: Bank Databases

Doh - resending as got dinged for overquoting:

Timely enough, the Register is reporting that CA's job scheduler software may be responsible:

http://www.theregister.co.uk/2012/06/25/rbs_natwest_what_went_wrong/

Could certainly mean that Oracle was still involved (or Sybase, or some other database), but the inability to schedule jobs was the root issue.

Matt

>>>
>>> I'm particularly interested as we test our failover every 3 months, and
>>> last time we did so there was a power outage on the standby, which was
>>> running temporarily as primary, which we hadn't anticipated. The startup
>>> script tried to bring up what was currently a primary db as a standby. I'm
>>> trying to automate this and, yuk, without DG broker (which has its own set
>>> of problems) I'm a bit stymied!
>>> I'm not suggesting NatWest hadn't tested their failover, but I imagine it's
>>> difficult due to volumes.
>>> On 25 June 2012 12:08, Matthew Zito <matt_at_crackpotideas.com> wrote:
>>> > Yes, though I doubt it's anything as simple as an "Oracle issue".
>>> > From my experience watching large organizations deal with complex
>>> > crises like this, typically it's a series of cascading failures - so
>>> > perhaps an Oracle database was involved, but many separate pieces had
>>> > to fail in order to get to this point.
>>> >
>>> > For example, I once saw a major global company's firmwide email system
>>> > go down for over a day due to a cascading series of:
>>> > - storage array failure
>>> > - misconfigured hardware
>>> > - engineer typo
>>> > - misunderstood recovery architecture
>>> >
>>> > I'm trying to keep it vague intentionally, but if any one of those
>>> > things hadn't happened, they would have had an hour of downtime on
>>> > their email instead of 30 hours of downtime. I suspect the NatWest
>>> > issue is similar, *though* I do expect that we'll get more info in the
>>> > coming days/weeks, so maybe we can get some more details then.
>>> >
>>> > Matt
>>> >
>>> > On Mon, Jun 25, 2012 at 7:01 AM, Howard Latham <howard.latham_at_gmail.com> wrote:
>>> > >
>>> > > So NatWest being unable to process transactions for 5 days due to a
>>> > > change in backup software and failover could well be an Oracle issue.
>>> > >
>>> > > --
>>> > > Howard A. Latham

--
http://www.freelists.org/webpage/oracle-l


Received on Mon Jun 25 2012 - 10:27:38 CDT
