RE: OT: bash vs. ksh subprocess counts

From: Herring Dave - dherri <Dave.Herring_at_acxiom.com>
Date: Tue, 24 Apr 2012 14:02:33 +0000
Message-ID: <BD475CE0B3EE894DA0CAB36CE2F7DEB4455DAAC2_at_LITIGMBCRP02.Corp.Acxiom.net>



Thx Jared - I appreciate the followup and sharing different methods! I meant to initially share that the original solution is flawed and my intent is not to look for better methods of doing this but more for understanding details about the differences in ksh and bash with this particular situation. As others have suggested, I removed the count ("-c") from the if-test just to see better what's going on.

In bash the commands:

set -x
ps -ef|grep -w $0
if [ "`ps -ef|grep -w $0`" > "3" ]; then

   exit 1
fi

... display:

+ ps -ef
+ grep -w ./bash.sh

oracle 24865 28645 0 09:33 pts/0 00:00:00 /bin/bash ./bash.sh
oracle 24869 24865 0 09:33 pts/0 00:00:00 grep -w ./bash.sh

++ ps -ef
++ grep -w ./bash.sh

+ '[' 'oracle 24865 28645 0 09:33 pts/0 00:00:00 /bin/bash ./bash.sh
oracle 24870 24865 0 09:33 pts/0 00:00:00 /bin/bash ./bash.sh
oracle 24872 24870 0 09:33 pts/0 00:00:00 grep -w ./bash.sh' ']'
+ exit 1

In ksh the same commands display:

+ ps -ef
+ grep -w ./ksh.sh

oracle 24954 28645 0 09:33 pts/0 00:00:00 /bin/ksh ./ksh.sh
oracle 24956 24954 0 09:33 pts/0 00:00:00 grep -w ./ksh.sh
+ ps -ef
+ grep -w ./ksh.sh
+ [ oracle 24954 28645 0 09:33 pts/0 00:00:00 /bin/ksh ./ksh.sh
oracle 24958 24954 0 09:33 pts/0 00:00:00 grep -w ./ksh.sh ]
+ > 3
+ exit 1

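So it appears that bash forks an extra process for a command stream within an if-test while ksh doesn't: the bash output inside the test shows a second "/bin/bash ./bash.sh" (PID 24870, a child of the script's PID 24865), and the ksh output has no such line. When I replace the command stream with a literal, the 2 shells match up.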
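
One way to check the extra process directly, without reading ps output, is to compare the script's PID with the PID seen inside the backticks/$(...). This is only a sketch and it assumes bash 4.0 or newer, because it relies on the BASHPID variable, which older releases (such as the bash shipped with RHEL 4) do not have. The variable names main_pid and subst_pid are made up for the illustration.

```shell
#!/bin/bash
# Sketch only: assumes bash 4.0+ (BASHPID does not exist in older bash).

# $$ always reports the PID of the main script, even inside a subshell.
main_pid=$$

# BASHPID reports the PID of the process that actually expands it, so
# inside $(...) it is the forked copy of the script.
subst_pid=$(echo "$BASHPID")

echo "script PID:               $main_pid"
echo "command substitution PID: $subst_pid"

if [ "$main_pid" != "$subst_pid" ]; then
   echo "the command substitution ran in a forked copy of the script"
fi
```

That forked copy carries the same command line as the script, which would explain why it matches "grep -w $0" and adds one to the count in bash.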

DAVID HERRING
DBA
Acxiom Corporation
EML dave.herring_at_acxiom.com
TEL 630.944.4762
MBL 630.430.5988
1501 Opus Pl, Downers Grove, IL 60515, USA WWW.ACXIOM.COM<http://www.acxiom.com/>


From: Jared Still [mailto:jkstill_at_gmail.com]
Sent: Friday, April 20, 2012 11:55 AM
To: Herring Dave - dherri
Cc: oracle-l_at_freelists.org
Subject: Re: OT: bash vs. ksh subprocess counts

On Sun, Apr 15, 2012 at 10:45 AM, Herring Dave - dherri <Dave.Herring_at_acxiom.com> wrote:

I realize this is way off topic (probably should be titled "WOT:...") for Oracle but #1, I don't currently belong to any good linux/redhat forums and #2, the issue found was from an Oracle maint. script :-)

WOT = Wide Open Throttle

Sometimes applies to Oracle I guess.

Anyway, I'm on a RHEL 4.6 server (2.6.9-67.0.22.ELlargesmp) and noticed a given DBA maint script wasn't running. It turns out that the only difference between when it last ran and now is someone changed the shell to be "bash" -> #!/bin/bash vs. #!/bin/ksh. The script has a little if-test to start with, used as a way to determine if a previous iteration of the job is still running:

if [ `ps -ef|grep -cw $0` -gt 3 ]; then
  echo "$0 is already running"
  exit 2
fi

There are better methods for doing this.

Following is a function that creates a PID file and uses it for locking. Stale locks are handled.

One possible issue with this: it is not useful as is on a cluster, as other nodes could run the script.

Anyway, it may be of some use.

Jared Still
Certifiable Oracle DBA and Part Time Perl Evangelist
Oracle Blog: http://jkstill.blogspot.com
Home Page: http://jaredstill.com
==== script =====

9:49-poirot:ts20:jkstill-22 > expand -t3 locktest.sh :

LOCKFILE=/tmp/testlock.lock

function script_lock {

   typeset MY_LOCKFILE
   MY_LOCKFILE=$1

   # remove stale lockfile
   [ -r "$MY_LOCKFILE" ] && {

      PID=$(cat "$MY_LOCKFILE")
      ACTIVE=$(ps --no-headers -p "$PID")
      if [ -z "$ACTIVE" ]; then
         rm -f "$MY_LOCKFILE"
      fi

   }

   # set lock

   if (set -o noclobber; echo "$$" > "$MY_LOCKFILE") 2> /dev/null; then

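      # MY_LOCKFILE is local to this function, so expand it now; a
      # single-quoted trap would find it unset when it fires later.
      trap "rm -f '$MY_LOCKFILE'; exit \$?" INT TERM EXIT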
      return 0
   else
      echo "Failed to acquire $MY_LOCKFILE. Held by $(cat "$MY_LOCKFILE")"
      exit 1

   fi
}

function script_unlock {

   rm -f "$LOCKFILE"
   trap - INT TERM EXIT
}

script_lock $LOCKFILE

echo press '<ENTER>...'
read dummy

script_unlock
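
The lock in the script above depends on noclobber making the redirect fail when the file already exists. The following is a minimal sketch of just that behaviour, not part of the original script; DEMO_LOCK is a hypothetical scratch path chosen for the illustration.

```shell
#!/bin/bash
# Sketch only: DEMO_LOCK is a made-up scratch file for this demonstration.
DEMO_LOCK="${TMPDIR:-/tmp}/noclobber_demo.$$"
rm -f "$DEMO_LOCK"

# First attempt: the file does not exist yet, so the redirect creates it.
if (set -o noclobber; echo "$$" > "$DEMO_LOCK") 2> /dev/null; then
   first=acquired
else
   first=refused
fi

# Second attempt: the file exists, so noclobber makes the redirect fail
# and the subshell returns a non-zero status.
if (set -o noclobber; echo "$$" > "$DEMO_LOCK") 2> /dev/null; then
   second=acquired
else
   second=refused
fi

echo "first attempt: $first, second attempt: $second"

# Clean up the scratch file.
rm -f "$DEMO_LOCK"
```

Running the test inside a subshell keeps noclobber from staying set for the rest of the script, which appears to be why the original wraps it in parentheses.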

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Apr 24 2012 - 09:02:33 CDT
