Re: Cron management...

From: MARK BRINSMEAD <mark.brinsmead_at_gmail.com>
Date: Sun, 12 Apr 2015 22:11:41 -0400
Message-ID: <CAAaXtLBCYrL5vtXz7=8j0_whH-LP81dPR1qU8S1i++_1XFTKMQ_at_mail.gmail.com>



I am in agreement with Seth on both points.

The sysadmins here are simply being cautious -- as well they should be. I, too, would be concerned about a network service that runs as "root" and can -- by design -- run any command as any user at any time, based on instructions received from a remote server, and I would also want to be convinced of its safety before deploying it. These are the sorts of things of which international headline stories of massive security breaches are made. Which is not to say, of course, that there is anything *wrong* with the products you have mentioned -- just that the sysadmins in question are doing their jobs in asking for assurance that there isn't.

As for crontabs...

Managing crontabs across 30 servers can be a little unwieldy, but it is certainly possible. Here are a few things I have seen in the past:

  • Monitoring jobs that report -- and require acknowledgement -- when a crontab has been modified.
  • Monitoring jobs that report when backups have failed to run as scheduled. (Note: this is NOT the same as reporting "when a backup has run and failed".)
  • Source code control for administrative scripts (which can include your crontab, by the way). Something as simple as RCS will do. If you are paranoid, "check out" each of your scripts from the repository once per day. Let people make unauthorized changes -- who cares? The next checkout will overwrite them.
  • Keep a log of changes to the crontab. Something as simple as piping "crontab -l" to a file, and then checking that into an RCS archive will keep a very concise record of what was changed and when, although not necessarily by whom.
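
The monitoring ideas in the list above can be sketched in a few lines of Python. This is only an illustration, not a finished tool: the function names are made up for the example, the crontab text is assumed to come from saved "crontab -l" output, and the timestamp of the last successful backup is assumed to be recorded somewhere by your own backup scripts.

```python
import difflib
import hashlib
from datetime import datetime, timedelta
from typing import List


def fingerprint(crontab_text: str) -> str:
    """Return a SHA-256 digest of a crontab listing (e.g. saved 'crontab -l' output)."""
    return hashlib.sha256(crontab_text.encode("utf-8")).hexdigest()


def crontab_diff(old_text: str, new_text: str) -> List[str]:
    """Return unified-diff lines between two snapshots; empty list if identical."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",
        )
    )


def backup_overdue(last_success: datetime, now: datetime, max_age: timedelta) -> bool:
    """True when no successful backup has been recorded within max_age.

    This catches the "backup never ran" case (for instance a commented-out
    crontab entry), which a check on the job's exit status would never see.
    """
    return now - last_success > max_age
```

A monitoring job could compare today's fingerprint with yesterday's, mail the diff when they differ, and raise an alert whenever backup_overdue() returns True. Note that the overdue check depends only on the absence of a recent success, so it still fires if the job was never started at all.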

A centralized solution might be right for you. Maybe even for most people. But I would not say it is mandatory. By the way, what do you propose to do when somebody "accidentally deletes" all the scripts/configuration for the centralized job-scheduling facility? You'll probably need a plan for that.

As for the NetBackup maintenance in the middle of the day...

... well, let's hope that doesn't happen too often. Of course, for a backup utility "middle of the day" is about as off-peak as you are going to get, so this is a reality that you're probably going to have to accept.

Usually, the only backups I run during "the middle of the day" would be archivelog backups. On some systems I might run those every few hours, or even in extreme cases, once per hour. But I also try to structure and size my archivelog destinations such that I can go for up to 48 hours without running an archivelog backup (that is, 48 hours without deleting archivelogs) without worrying that the database is going to freeze. Nobody is going to tolerate me allowing the database to stop working for 48 hours, but I have seen backup infrastructures go offline for that long -- without anybody losing their jobs.
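
The sizing rule above comes down to simple arithmetic. Here is a rough sketch in Python, assuming you have measured a peak archivelog generation rate for the database; the 20% headroom factor and the function names are assumptions made for the example, not figures from this thread.

```python
def required_archive_space_gb(
    peak_gb_per_hour: float,
    hours_without_backup: float = 48.0,
    headroom: float = 1.2,  # assumed safety margin, adjust to taste
) -> float:
    """Space needed to survive a backup outage without deleting archivelogs."""
    return peak_gb_per_hour * hours_without_backup * headroom


def hours_until_full(free_gb: float, gb_per_hour: float) -> float:
    """Hours of archivelog generation the remaining free space can absorb."""
    if gb_per_hour <= 0:
        return float("inf")  # nothing is being generated, so it never fills
    return free_gb / gb_per_hour
```

For example, a database that generates 5 GB of archivelogs per hour at peak would need roughly 240 GB of destination space to ride out 48 hours, plus whatever headroom you are comfortable with. The second function is the one to run when the backup team announces maintenance: it tells you how long each server can wait.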

In any case, if you size your storage properly and run your archivelog backups with appropriate frequency, it should not be a "problem" if maintenance in the backup infrastructure (very) occasionally causes an archivelog backup to fail. The next one will run in an hour or two, the archivelogs will be backed up and deleted, and it will be like nothing ever happened. Assuming, of course, that your *(rman) backup scripts* are the only jobs that ever delete (or manipulate) archivelogs. I'm kind of religious about that last point -- having seen far too many failures arising from doing it any other way.

---

I hope this helps!



On Sun, Apr 12, 2015 at 8:01 PM, Seth Miller <sethmiller.sm_at_gmail.com>
wrote:


> Chris,
>
> Now that Mladen is done belittling your Linux admin for simply being
> cautious, the rest of us can offer you constructive advice.
>
> If you like your shell scripts and are comfortable with cron, you might be
> able to just enhance it enough to eliminate the single point of failure and
> dramatically reduce your risks by centralizing your backups.
>
> Modify your rman scripts to use an Oracle wallet to authenticate to the
> databases remotely through an rman client. That way, you can take a backup
> without having to be on the server and won't expose the password of a
> privileged account. I would also suggest creating a separate sysdba account
> just for the use of logging in to do the backups.
>
> You can reduce the single point of failure by using Oracle Clusterware to
> set up a failover resource that enables the crons to run on another node in
> the cluster if a node were to fail. This is relatively simple although you
> do need at least a basic Oracle Clusterware install to use it.
>
> Seth Miller
>
>
>
> On Sun, Apr 12, 2015 at 4:38 PM, Mladen Gogala <
> dmarc-noreply_at_freelists.org> wrote:
>
>> On 04/12/2015 02:47 PM, Chris Grabowy wrote:
>>
>>> Howdy.
>>>
>>> We currently have about 30 Redhat Linux servers running Oracle 11.2
>>>
>>> Recently for a short time the crontab entry for a production backup was
>>> commented out.
>>>
>>> Just last week one of the DBAs had "accidently" deleted all the backup
>>> scripts. The scripts directory is NFS mounted so it impacted every server.
>>>
>>> The Netbackup folks like to do maintenance during the day. Any Oracle
>>> backups that may have been running abort. These days we get notice from
>>> the Netbackup folks but it's kinda tricky to check 30 servers and determine
>>> if anything is running. Or kick off 30+ archive log backup scripts across
>>> all the servers to clean up the archive log directories before the
>>> Netbackup maintenance.
>>>
>>> Managing crontabs, jobs and scripts across 30 servers just doesn't seem
>>> to be working.
>>>
>>> Our company uses a job scheduling app called Tidal. The manager of that
>>> app demo'd the product to me and it seems like it can address many of our
>>> headaches. In theory a single simple interface to manage all the jobs
>>> scheduled across all the database servers.
>>>
>>> However one of the issues identified by the Linux admin is that the
>>> Tidal agent needs root access so he is reluctant to install the Tidal agent
>>> anywhere but a couple of designated Tidal servers.
>>>
>>> I am wondering if other sites have stopped using crontab? If so then
>>> what did you replace it with?
>>>
>>> Anyway, I am open to any thoughts, suggestions, etc.
>>>
>>> Thanks,
>>> Chris Grabowy
>>>
>>>
>>> --
>>> http://www.freelists.org/webpage/oracle-l
>>>
>>>
>>>
>> Chris, I am not sure why are you using crontab with NetBackup? NetBackup
>> has its own scheduler and can schedule the scripts centrally, through the
>> NB GUI. All you need is the right script in /usr/openv/netbackup/bin and
>> all will be well.
>>
>> As for Tidal, I have no experience whatsoever with the product, but I do
>> have an experience with the competing product called Control-M from BMC.
>> Unfortunately, all 3rd party scheduling products, including NetBackup which
>> also has a centralized scheduler, must have a service which runs as user
>> "root". The reason for that is that they have to be able to switch user and
>> run something as user "oracle", without being prompted for password every
>> time they need to run a job.
>>
>> These products are usually installed as Linux service, in /etc/init.d
>> directory and are started during the Linux start-up. Please, inform your
>> system administrator that NetBackup requires root access as well and ask
>> him to remove it from all the systems for security reasons. Why stop there?
>> Oracle also requires root access in the installation phase, one must run
>> orainstRoot.sh and $ORACLE_HOME/root.sh as user root, so your system
>> administrator should remove that, too. God forbid you have ASM, that
>> requires root access, too. To further secure your systems, after removing
>> all 3rd party products, including Oracle, he or she should execute "service
>> network stop" as user "root" on every system.
>> That would completely secure your Linux system and make it impossible to
>> anyone without the physical access to the server to use them. Of course, no
>> security is complete without physical security, so you should consider the
>> industry standard security measures like barbed wire, mine fields,
>> electrified fence, guard dogs, machine gun nests and Chuck Norris.
>>
>> Long story short, you are dealing with an unreasonable system
>> administrator. It's not your problem, it's a problem for your boss.
>> Management decides what runs on the company systems, not the system
>> administrator. There is one piece of trivia which people frequently forget:
>> the only completely secure systems are systems that are not being used and
>> contain no useful information. Refusing to install an established 3rd party
>> product based on the assessment that it "needs root access" is ludicrous,
>> to put it mildly.
>>
>> --
>> Mladen Gogala
>> Oracle DBA
>> http://mgogala.freehostia.com
>>
>> --
>> http://www.freelists.org/webpage/oracle-l
>>
>>
>>
>
-- http://www.freelists.org/webpage/oracle-l
Received on Mon Apr 13 2015 - 04:11:41 CEST
