Re: Protecting production from "us"

From: Alvaro Jose Fernandez <alvaro.fernandez_at_sivsa.com>
Date: Fri, 4 Dec 2015 08:11:50 +0000
Message-ID: <CF6EF79EAD543340B39CD7DE45BCBBE4144B1519_at_CANIDO.sivsa.int>



I remember a related situation from many years ago. I was typing in a shell session on a big HP-UX system when a guy (a former external consultant, not in the same line of business as me, and quite a smart person, btw) entered the room, walked up to *my* keyboard and, to my dismay, typed "rm -rf *" while smiling and saying "hoho! I too know UNIX!". The whole thing took a mere 5 or 6 seconds. I quickly hit Control-C to try to abort, but too late. At that moment my current working directory was... the datafiles folder of the main app's database.

When the guy understood what he had done he panicked, of course. We recovered the database from backups, and no one was fired. Fortunately, at that time the database/application deployment was only in its initial stages, so no real harm was done, but truly... accidents happen!

Alvaro



From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on behalf of Alfredo Abate [alfredo.abate_at_gmail.com]
Sent: Thursday, 3 December 2015 21:49
To: Jeremy Schneider
CC: HerringD_at_dnb.com; Oracle-L
Subject: Re: Protecting production from "us"

I'm disappointed by management's response: the backlash is that any further mistake on production will now result in immediate termination. I don't see how any person (in any field) could work knowing that one more mistake like this gets them terminated, especially given the track record it sounds like you and the rest of your team have had (years between incidents like this). If someone were making these types of mistakes frequently, that would be another story altogether.

I suppose if the system at hand costs the company tons of money for each outage (say a trading system or high-volume eCommerce) then things might be a little different (maybe this is the case here).

At the end of the day these machines, systems, etc. are all operated by almighty, error-prone humans.

Alfredo

On Thu, Dec 3, 2015 at 2:27 PM, Alfredo Abate <alfredo.abate_at_gmail.com> wrote:

I like Jeremy's server-side control better for the terminal background colors. I'll have to look into that one.

Thanks for that tip.

Alfredo

On Thu, Dec 3, 2015 at 1:05 PM, Jeremy Schneider <jeremy.schneider_at_ardentperf.com> wrote:

On Thu, Dec 3, 2015 at 11:45 AM, Herring, David <HerringD_at_dnb.com> wrote:
> * Should we look into some kind of additional controls where
> commands like "srvctl stop..." cannot be run under our own accounts using
> "sudo -u oracle" but instead need a different account on production? For
> example, normally our unfortunate DBA would use his "scapebob" Linux account
> but perhaps to perform a production shutdown he'd need to connect as
> "scapebob-rw", a new, special account just for dangerous production
> activities.

I think that I'd be hesitant to introduce too much variation between production and test environments when it comes to processes. It's a major advantage if you can test your processes in the test tier, then run those same processes verbatim (key-for-key) in production afterwards.

> * The problem in our situation was over confusion with multiple
> windows. Do people set a Linux TMOUT to something short like 10 or 15
> minutes, to hopefully avoid accidentally leaving production putty sessions
> open?

I feel like a short timeout is likely to cause more frustration in the trenches than it's worth, for anyone who spends any significant amount of time troubleshooting production systems. Often you have multiple windows open and switch between them... an aggressive timeout makes that much more difficult.
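
For reference, the TMOUT mechanism asked about is a shell variable, so trying it out is cheap. A minimal sketch, assuming bash login shells and a profile script on the production host; the file name and the 900-second value (15 minutes, the upper end of the range in the question) are assumptions for illustration:

```shell
# Hypothetical file: /etc/profile.d/session-timeout.sh on a production host.
# 900 seconds (15 minutes) is an assumed value, not a recommendation.
TMOUT=900
readonly TMOUT   # prevents users from changing or unsetting it in their session
export TMOUT
```

An interactive bash session that sits idle at the prompt for that long exits on its own, which is exactly the behaviour that gets in the way when several troubleshooting windows are open at once.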

> * Beyond changing the linux prompt and text colors (we set $PS1 with
> escape sequences and various key, env-specific values) do you do anything
> else for protection of production?

Personally, I think background color is your best bet. The only difference from Alfredo's suggestion is that I'd prefer having it controlled server-side rather than relying on each engineer to set up all their terminal connections correctly. Not to mention that you could get the *wrong* bg color if it's client-side and somebody ssh's between tiers.
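
A minimal sketch of what server-side control could look like, assuming bash login shells and a profile script deployed on every host. The file name, the host-name patterns (prod*, test*, dev*) and the function names are illustrative assumptions, not anything from this thread. It colors the background of the prompt with standard ANSI SGR codes rather than the whole terminal window, since that needs no client-side setup:

```shell
# Hypothetical file: /etc/profile.d/tier-color.sh, deployed on every server.
# The host-name patterns are assumptions; adjust them to the local naming scheme.

# Map a host name to a tier label.
tier_for_host() {
    case "$1" in
        prod*)      echo "prod" ;;
        test*|dev*) echo "test" ;;
        *)          echo "unknown" ;;
    esac
}

# Map a tier label to an ANSI SGR background color code.
bg_code_for_tier() {
    case "$1" in
        prod) echo 41 ;;   # red background
        test) echo 42 ;;   # green background
        *)    echo 43 ;;   # yellow background for unclassified hosts
    esac
}

tier=$(tier_for_host "$(uname -n)")
bg=$(bg_code_for_tier "$tier")

# \[ and \] tell bash that the enclosed bytes are non-printing,
# so line editing still computes the prompt width correctly.
PS1="\[\033[${bg};37m\] ${tier} \u@\h \[\033[0m\] \w \\$ "
```

Because the script runs on the server at login, the color follows the host you are actually on, even after ssh-ing from one tier to another.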

--
http://about.me/jeremy_schneider
--
http://www.freelists.org/webpage/oracle-l




Received on Fri Dec 04 2015 - 09:11:50 CET
