RE: 1 minutes: best downtime story

From: MacGregor, Ian A. <ian_at_slac.stanford.edu>
Date: Fri, 15 Mar 2013 09:00:23 -0700
Message-ID: <FD1D618E4F164D4C8BA5513D4268174A01A19E536B3E_at_EXCHCLUSTER1-02.win.slac.stanford.edu>



Another type of downtime: at a savings and loan, it was go-live day for a new system which had performed quite well in testing. The system had several components, such as Loans on Line, Savings on Line, etc. Shortly after the system went live a few problems were noticed, but the real problem was how errors were reported on the system. The decision had been made to use the initial letters of each component plus a number. A teller checking an account balance might get the message SOL 32. Soon the messages were common enough that users of the system were asking each other whether they were SOL. Then they began to tell customers, "I'm sorry, you're SOL." This soon reached the powers-that-be in the savings and loan, and the system was ordered shut down. The cause of the problem itself was found and corrected quickly, but the system had to stay down until revised error-reporting code was released.

Ian A. MacGregor
SLAC National Accelerator Laboratory
Computing Division
To offer the best IT service at the lab and be the IT provider of choice.

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Mar 15 2013 - 17:00:23 CET
