leponceau.org

Programming And Stuff, You Know The Thing…

A Practical BASH Guide

Posted at — Feb 24, 2006

Warning

When running scripts with “sh” instead of “bash”, some extensions will be disabled!

Useful Additional Shell Commands

I maintain a repository of BASH extensions at https://github.com/jjYBdx4IL/snippets/tree/master/bash. Activate them by adding . /path/to/checkout/source_inc.sh to your shell startup file, e.g. ~/.bashrc. An overview of the available commands is available at any time through BASH’s autocompletion feature: all extension commands start with an underscore (_), and the first part defines the group, e.g. _bash_* for BASH-related commands, _mvn_* for Maven-related commands, etc. Often there is a _*_summary command that displays a list of useful shortcuts.

Editing Shell Scripts Using VI/VIM

I always use the following ~/.vimrc. Among other things, it enables the F2 hotkey to toggle paste mode.

## ~/.vimrc
    set modelines=20
    set modeline
    syntax on
    set backspace=2
    set shiftwidth=4 tabstop=4 expandtab ai smartindent fileformat=unix fileencoding=utf-8
    " Uncomment the following to have Vim jump to the last position when
    " reopening a file
    if has("autocmd")
        au BufReadPost * if line("'\"") > 0 && line("'\"") <= line("$")
            \| exe "normal! g'\"" | endif
    endif
    nnoremap <F2> :set invpaste paste?<CR>
    set pastetoggle=<F2>
    set showmode
    set number
    " colo slate
    colo elflord

Skeleton For New Shell Scripts

#!/bin/bash
# vim:set sw=4 ts=4 et ai smartindent fileformat=unix fileencoding=utf-8 syntax=sh:
set -Eex ; set -o pipefail
export LANG=C LC_ALL=C
scriptdir=$(readlink -f "$(dirname "$0")")
cd "$scriptdir"

Basic Operators

$ EMPTY=""
$ FULL="0"
$ echo ${EMPTY:-A}
A
$ echo ${EMPTY:+A}

$ echo ${FULL:-A}
0
$ echo ${FULL:+A}
A

# useful to define defaults:
TIMEOUT=${param:-60}
# or to add separation chars:
PATH="/usr/local/bin${PATH:+:}$PATH"
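
The following sketch walks through the same two operators once more, this time including the unset case. The variable names (param, SEARCH) are made up for illustration only.

```shell
# Hypothetical variable names, for illustration only.
unset param
TIMEOUT=${param:-60}        # param is unset, so the default is used
echo "$TIMEOUT"             # prints 60

param=""
TIMEOUT=${param:-60}        # ":-" also treats an empty value as missing
echo "$TIMEOUT"             # prints 60

param=5
TIMEOUT=${param:-60}        # param has a value, so it wins
echo "$TIMEOUT"             # prints 5

# ":+" yields the alternative text only when the variable is non-empty,
# so the colon is added only if there is something to separate.
SEARCH=""
SEARCH="/opt/bin${SEARCH:+:}$SEARCH"
echo "$SEARCH"              # prints /opt/bin
SEARCH="/usr/bin${SEARCH:+:}$SEARCH"
echo "$SEARCH"              # prints /usr/bin:/opt/bin
```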

Predefined Variables

Arrays

a[5]=0
echo ${a[5]}
echo ${a[*]} # prints all contents

# note that there are two types of the latter expansion:
echo "${a[*]}" # expands to one argument to echo
echo "${a[@]}" # expands to as many arguments as the array has entries
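
To make the difference between the two expansions visible, one can count the arguments that actually arrive. A minimal sketch, assuming a helper named count_args that is not part of the original text:

```shell
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() { echo "$#"; }

a=("first entry" "second" "third")

star=$(count_args "${a[*]}")   # joined into a single word
at=$(count_args "${a[@]}")     # one word per entry, inner spaces preserved
bare=$(count_args ${a[*]})     # unquoted: split again on whitespace

echo "star=$star at=$at bare=$bare"   # prints star=1 at=3 bare=4
```

In most scripts the quoted "${a[@]}" form is the one you want, because it keeps entries with spaces intact.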

Some Tricks

Arrays / (some sort of) Hashes

get_vmidx() {
    local vmname="$1"
    if [[ $vmname =~ ^[0-9]+$ ]]; then
        echo $vmname
        return 0
    fi
    
    local nvms=${#vmdefs[*]}
    local i
    local name
    for (( i=0; i<nvms; i++ )); do
        eval "${vmdefs[$i]}"
        if [[ "$name" == "$vmname" ]]; then
            echo $i
            return 0
        fi
    done
    echo "VM not defined: $vmname" >&2
    exit 1
}

install_vm() {
    local idx=$(get_vmidx "$1")
    eval "${vmdefs[$idx]}"
    ...
}

definevm() {
    local name
    local ova
    local ip
    local mb
    local vrdeport
    local cpus
    local idx=${#vmdefs[*]}
    vmdefs[$idx]="$*"
    echo $idx
    return 0
}
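
The functions above emulate a hash with an indexed array and eval. As a different technique, not the one used above, BASH 4.0 and newer offer real associative arrays. A minimal sketch, with made-up VM names and addresses and a hypothetical lookup_ip helper:

```shell
# Requires bash 4.0 or newer. Names and values are made up.
declare -A vm_ip
vm_ip[web]="10.0.0.10"
vm_ip[db]="10.0.0.11"

lookup_ip() {
    local name="$1"
    # "${array[key]+set}" expands to "set" only if the key exists
    if [[ -z "${vm_ip[$name]+set}" ]]; then
        echo "VM not defined: $name" >&2
        return 1
    fi
    echo "${vm_ip[$name]}"
}

lookup_ip web                                   # prints 10.0.0.10
lookup_ip nope 2>/dev/null || echo "lookup failed"
```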

Type Checking / Checking Function Existence

if [[ $(type -t my_configure) == "function" ]]; then
    my_configure
else
...

Accessing Function Arguments

# iteratively:
for arg in "$@"; do
  echo $arg
done

# or:
while [[ $# -gt 0 ]]; do
  echo $1 # or echo ${1}
  shift
done

# selectively:
echo ${3}

# selectively through a variable:
let "a=3"
echo ${!a} # equals $3

Returning Values from Functions

#!/bin/bash

testfunc() {
        echo "$1"
}

res=$(testfunc teststring)
echo "res=$res"
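
Besides printing a result, a function also has an exit status, and both can be combined. The following sketch uses two hypothetical helpers, is_number and double, to show one way of doing that:

```shell
# is_number and double are hypothetical helpers.
is_number() {
    [[ $1 =~ ^[0-9]+$ ]]    # the status of the last command is returned
}

double() {
    is_number "$1" || return 1
    echo $(( $1 * 2 ))
}

# the assignment passes on the exit status of the command substitution
if res=$(double 21); then
    echo "res=$res"         # prints res=42
fi
double abc || echo "not a number"
```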

BASH Command History Or How to Make Your Life Easier on the Command Line

To access a (preferably very long) command in your BASH command history buffer, just enter the first few characters and use the PgUp and PgDn keys. Or press “CTRL-r” to search it.

Additionally, you may want to add “export HISTCONTROL=ignoredups” to your .bashrc and/or .bash_profile files.

Enlarging the command history buffer by setting the variables HISTSIZE and HISTFILESIZE may also be quite helpful.

Example ~/.bashrc:

export LANG="de_DE.utf8"
export LC_ALL="de_DE.utf8"
export INPUTRC=~/.inputrc

alias d="ls --color"
alias ls="ls --color=auto"
alias ll="ls --color -l"
alias la="ls --color -lah"
alias lr="ls --color -lrth"

alias vi=vim

shopt -s histappend
export HISTCONTROL=erasedups
export HISTIGNORE="ll:ls:la:cd*:[bf]g*:exit"

export EDITOR=vim

alias svngrep="find . -not -path '*/.svn/*' -and -not -name '*~' -type f -print0 | xargs -0 grep -i --regexp"

Example ~/.inputrc to enable history browsing using the PgUp and PgDown keys:

"\e[5~": history-search-backward
"\e[6~": history-search-forward

Escaping Characters

 # simple examples
 VAR=""     # VAR is empty
 VAR='"'    # VAR is "
 VAR="\""   # VAR is "
 VAR="'"    # VAR is '
 VAR="""    # error! expression not yet finished...
 VAR='\"'   # VAR is \" (escape sequences don't work inside '...', only inside double quotes)

 # not that simple examples
 FILENAME="file with spaces in its filename.txt"
 LINECOUNT=`cat $FILENAME | wc -l`        # doesn't work, 'cat' searches for six different files
 LINECOUNT=`cat "$FILENAME" | wc -l`      # works
 LINECOUNT="`cat "$FILENAME" | wc -l`"    # works
 LINECOUNT="`cat \"$FILENAME\" | wc -l`"  # works
 LINECOUNT='`cat "$FILENAME" | wc -l`'    # doesn't work, `` don't get evaluated
 LINECOUNT="`cat '$FILENAME' | wc -l`"    # doesn't work, variable not substituted by its value
 LINECOUNT='`cat '$FILENAME' | wc -l`'    # doesn't work, `` don't get evaluated (and $FILENAME ends up outside the quotes)

Recording Program Output in Variables

 # stores date string given by the 'date' prog in $VAR
 VAR=`date` # won't catch error messages!

Redirection of Program Output

Note: a process usually has two output streams attached to the console: standard output and standard error. In the C++ programming language, one writes to standard output with 'std::cout << "some regular message";' and to standard error with 'std::cerr << "some error message";'. In BASH, standard output is file descriptor ‘1’ and standard error is file descriptor ‘2’.

 # suppression of error messages
 ls /some_not_existent_dir 2> /dev/null
 # recording of error messages
 cmd 2> errorlog.txt
 # recording of standard output and error output
 cmd > log.txt 2> errorlog.txt
 # same
 cmd 1> log.txt 2> errorlog.txt
 # catch error messages in a variable by redirecting error output to standard output
 VAR=`ls /not-exist 2>&1`
 # record standard output and error output into the same file
 # (order matters: redirect stdout to the file first, then point stderr at stdout)
 cmd > log.txt 2>&1
 # append to file
 cmd >> log.txt 2>&1
 # look at a 'configure' script included in most open-source source code packages
 # to see extreme usage of BASH command syntax...
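
Redirections are processed from left to right, so their order matters when stdout and stderr are to end up in the same file. The sketch below demonstrates this with a hypothetical function named noisy that writes one line to each stream:

```shell
# noisy is a hypothetical command that writes one line to each stream.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

log=$(mktemp)

# stdout is pointed at the file first, then stderr is made a copy of it
noisy > "$log" 2>&1
both=$(( $(wc -l < "$log") ))

# here stderr is copied from the still-unchanged stdout (the pipe)
# before stdout is moved to the file
noisy 2>&1 > "$log" | cat > /dev/null
only_stdout=$(( $(wc -l < "$log") ))

echo "both=$both only_stdout=$only_stdout"   # prints both=2 only_stdout=1
rm -f "$log"
```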

Analyzing Pipes

strace -f -e execve,close,dup2,open,fork bash -c 'exec >&3'

or better:

#!/bin/bash

set -v

# initial fd states 
ls -l /proc/$$/fd/ >&2

# secure STDOUT destination to fd 4
exec 4>&1
# and then redirect STDOUT to /dev/null
exec 1>/dev/null

# hidden from STDOUT
echo asd

# post-/dev/null-redirection fd states
ls -l /proc/$$/fd/ >&2

# copy fd 4 destination back to fd 1
exec 1>&4
# and close fd 4
exec 4>&-

# displayed via STDOUT
echo 123

# restored fd states
ls -l /proc/$$/fd/ >&2

Advanced Pipes

#!/bin/bash

RESULT=""

die() {
    echo "error $1"
    echo "died in $BASH_SOURCE at line $BASH_LINENO"
    exit 1
}

get_int() {
    local title="$1" ret
    # open fd 3 as a copy of stdout, so dialog can still draw on the terminal
    exec 3>&1
    RESULT=`dialog --inputbox "$title" 0 0 2>&1 1>&3`
    ret=$?
    # close fd 3
    exec 3>&-
    return $ret
}

get_int "Enter SE Linux SVN release tag:" || die
echo "Input string is $RESULT"

Storing Text Files in Scripts

One may do that by using ‘echo’:

 echo "first line" > file
 echo "second line" >> file
 ...

Obviously, that way is quite ugly and error-prone. A much better solution is:

 cat > file << EOF
 first line
 second line
 ...
 EOF

Here, ‘EOF’ is the end-of-file keyword. Writing to ‘file’ ends when a line is encountered that contains just that keyword. Do not add additional spaces to that line! The line must match the keyword exactly. A great feature is that shell variables get evaluated inside the text. Write ‘\$’ instead of ‘$’ to prevent evaluation of a single variable, or quote the keyword (<< 'EOF') to disable evaluation for the whole text.
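
A small sketch of how variable evaluation inside such a text block can be controlled. The variable name and the temporary file are only there for the demonstration:

```shell
name="world"
out=$(mktemp)

# unquoted keyword: variables are expanded, \$ keeps a literal dollar sign
cat > "$out" << EOF
hello $name
price: \$5
EOF
expanded=$(cat "$out")

# quoted keyword: the text is taken literally
cat > "$out" << 'EOF'
hello $name
EOF
literal=$(cat "$out")

printf '%s\n---\n%s\n' "$expanded" "$literal"
rm -f "$out"
```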

Checking File Types

if [ -f some_script.sh ]; then ...; fi

 # create symbolic link link.txt pointing to ../somedir/regular_file.txt
 # order of arguments: remember as 'copy as symbolic link'
 # btw: you may 'copy' whole directory trees using symlinks by doing 'cp -Rs src tgt'.
 ln -s ../somedir/regular_file.txt link.txt 
 readlink link.txt    # prints ../somedir/regular_file.txt
 readlink -f link.txt # prints /home/me/somedir/regular_file.txt (full absolute path)

Handling Path Information

 SCRIPTNAME=`basename "$0"`         # without directory information
 RELATIVE_SCRIPT_DIR=`dirname "$0"` # directory information only
 pwd                              # prints full name of current dir
 pushd some_other_dir             # change to some_other_dir, but push current on dir stack
 dirs                             # prints dir stack
 popd                             # pops dir from dir stack and changes to it

Chaining Command Execution

Often one wants to make sure a command definitely succeeds and abort otherwise. Here is how to do that nicely:

 # quick solution for a small command chain
 cmd1 && cmd2 && cmd3
Here, execution proceeds as one would expect from a regular programming language: the first command is run; if it succeeds, the second one is run, and so on.
 # nicer
 cmd1 && cmd2 && cmd3 && echo "OK" || echo "ERROR"
Practical example:
 # compile the Linux kernel
 cd linux-2.6.12.4
 make menuconfig
 make && make modules_install && make install && shutdown -h now
 # go to bed, don't wait and enter all commands in time...

In BASH, ‘&&’ and ‘||’ have equal precedence and are evaluated from left to right, so ‘ERROR’ is printed to the console if any of the three (actually four, but we don’t expect ‘echo’ to fail) commands fails. For longer command chains, one may do it like this:

 cmd1 || exit 1
 cmd2 || exit 2
 cmd3 || exit 3
 ...

That way, the error return code in ‘$?’ will indicate the command that failed.

Mastering Time

# remember current time (changes file's last-modified time to current time)
touch some_filename
# get file's (last-modified) time
stat -c %Y some_filename
# get current time
date +%s
# is a file older than one hour?
filetime=$(stat -c %Y filename)
currtime=$(date +%s)
let "age = ( currtime - filetime ) / 3600"
(( age >= 1 )) && {
  # do something not more often than each hour
  ...
  touch filename
}
# or (much) simpler:
[[ $file1 -ot $file2 ]] && ... # older than
[[ $file1 -nt $file2 ]] && ... # newer than
# get a human-readable date string from UNIX time since the epoch
date -d @946684800 +"%F %T %z"
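
The age check can also be wrapped into a small function that works in seconds and so avoids the integer division. This is a sketch only: older_than_seconds is a hypothetical helper, and it relies on GNU stat, date and touch.

```shell
# older_than_seconds FILE SECONDS: succeeds if FILE was last modified more
# than SECONDS ago. Hypothetical helper; relies on GNU stat and date.
older_than_seconds() {
    local filetime currtime
    filetime=$(stat -c %Y "$1") || return 2
    currtime=$(date +%s)
    (( currtime - filetime > $2 ))
}

stamp=$(mktemp)
touch -d '2 hours ago' "$stamp"          # GNU touch: backdate the file
older_than_seconds "$stamp" 3600 && echo "older than one hour"
touch "$stamp"                           # reset to now
older_than_seconds "$stamp" 3600 || echo "fresh"
rm -f "$stamp"
```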

File System Space, free and used

Hint: use “stat” to gather not only information about files, but also about file systems! It is much better than using “df” because you can control stat’s output and therefore reliably parse it (or you don’t even need to parse it).
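
As a rough sketch of that hint, assuming GNU stat: with -f it reports on the file system instead of the file, and the format sequences %a (free blocks available to unprivileged users) and %S (fundamental block size) are enough to compute the free space. The function name free_bytes is made up for this example.

```shell
# Free space in bytes available to unprivileged users on the file system
# holding the given path (GNU stat: %a = free blocks, %S = block size).
free_bytes() {
    local info blocks bsize
    info=$(stat -f -c '%a %S' "$1") || return 1
    read -r blocks bsize <<< "$info"
    echo $(( blocks * bsize ))
}

free_bytes /tmp
```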

Variables and simple pattern matching

 set             # list all defined variables and their values
 echo $VARNAME   # print value of VARNAME
 echo ${VARNAME} # same effect

 # assign value, note that there's no '$'
 VARNAME="a:list:of:words"
 echo ${VARNAME#*:}  # gives 'list:of:words'
 echo ${VARNAME##*:} # gives 'words' (greedy version)
 echo ${VARNAME%:*}  # gives 'a:list:of'
 echo ${VARNAME%%:*} # gives 'a'

 # replace substrings
 echo ${VARNAME/list/otherlist} # replace first occurrence of "list" by "otherlist"
 echo ${VARNAME//list/otherlist} # replaces all occurrences

 # empty $LIST
 LIST=
 LIST2=
 for NEW in one two three; do
   LIST=${LIST:+${LIST}_}${NEW}
   LIST2=${LIST2:-${LIST2}_}${NEW}
 done
 echo $LIST  # gives 'one_two_three'
 echo $LIST2 # gives '_onetwothree'

 # substring extraction
 STR=${STR:0:2} # gets first two characters

 # get current time encoded as YYYYMMDDhhmmss
 TIMETAG=`date +%Y%m%d%H%M%S`
 # save time
 echo $TIMETAG > last-script-execution-time.txt

Loops

 COUNT=0
 while [ $COUNT != 10 ]; do
   ((COUNT++))
 done
 echo $COUNT # gives '10'

 A="one two three"
 for C in $A; do echo $C; done
 # code block of the loop is executed 3 times!

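That way we cannot loop over values containing spaces! That is a problem when we want to iterate over filenames containing spaces. For that case, read the names line by line instead:

ls | while IFS= read -r F; do echo "$F"; done

Here, the loop body is executed once per input line. Note that the loop runs in a subshell because it is part of a pipe, so variables changed inside it are lost afterwards.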
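
For filenames specifically, a glob pattern in a for loop is a common alternative that needs no pipe at all: each match becomes exactly one word, spaces included, and the loop runs in the current shell. A minimal sketch using a throwaway directory:

```shell
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with two spaces.txt"

count=0
for f in "$dir"/*; do
    [ -e "$f" ] || continue      # the pattern stays literal if nothing matches
    count=$((count + 1))
    echo "${f##*/}"              # print the name without the directory part
done
echo "count=$count"              # prints count=2
rm -rf "$dir"
```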

case-Statement

my_unpack() {
    case "$1" in
        *.tar.gz|*.tgz)
            tar -xzf "$1"
            ;;
        *.tar.bz2|*.tbz|*.tbz2)
            tar -xjf "$1"
            ;;
    esac
}

Regular Expressions

Regular expressions are available on the command line through utilities like sed, awk, grep, and perl:

 echo asdas | sed 's/^as/b/'  # gives bdas
 echo asdas | sed 's/^das/b/' # gives asdas
 echo asdas | sed 's/das/b/'  # gives asb
 echo asdas | sed 's/as/b/'   # gives bdas
 echo asdas | sed 's/as/b/g'  # gives bdb

In ‘s/as/b/g’, ‘s’ denotes substitution, ‘as’ is the search pattern, ‘b’ is the replacement for each match, and ‘g’ is an option: g = global, i.e. replace every occurrence; i = case-insensitive. Note that sed is greedy, i.e. it tries to match as much as possible:

 echo 123454321 | sed 's/^.*3//' # gives '21', and not '454321'
 echo 123454321 | sed 's/3.*$//' # gives '12', and not '123454'

The search pattern is a so-called ‘regular expression’. Here are a few comments on how to build such search patterns:

 # use grep to strip comments and blank lines from a file:
 grep -v '\(^#\|^ *$\)' filename

Note that we only match spaces here, so lines containing other types of whitespace (tabs) won’t be recognized as blank lines. For more flexible pattern matching, use perl:

 # use perl to strip comments and blank lines:
 cat filename | perl -ne '/(^#|^\s*$)/ or print'
'\s' denotes all whitespace, incl. tabs.
 # a more general way to remove comments:
 cat filename | perl -ne '/^\s*#/ or do { s/#.*$// ; print }'

Of course, that will fail as soon as a ‘#’ occurs outside the comments. Perl’s regular expression pattern matching is quite powerful:

 cat /proc/cpuinfo | perl -ne 's/^cpu MHz\s*:\s*([^\s]+)\s*$/$cpuspeed=$1/e and print $cpuspeed'
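
BASH itself can also match regular expressions with the =~ operator inside [[ ... ]]; captured groups end up in the BASH_REMATCH array. The sketch below extracts the same kind of value without an external tool. The sample line is made up rather than read from /proc/cpuinfo.

```shell
# made-up sample line in the style of /proc/cpuinfo
line="cpu MHz         : 2394.230"

# keeping the pattern in a variable avoids quoting problems
re='^cpu MHz[[:space:]]*:[[:space:]]*([0-9.]+)$'
if [[ $line =~ $re ]]; then
    cpuspeed=${BASH_REMATCH[1]}   # first parenthesized group
    echo "$cpuspeed"              # prints 2394.230
fi
```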

Locking, Signals etc.

 # remove lockfile when stopped
 trap "rm -f ${LOCKFILE}" 0 1 2 3 15

The util-linux package contains a tool called “flock”. It allows you to serialize the execution of BASH scripts. For example, one could start a daemon written in BASH with the command line

flock --timeout=0 /path/to/lockfile su - daemon-user -c daemon.sh

The daemon script would then check whether another instance of it is already running. So far, so nice. But there is a huge caveat (maybe not that much for that specific example): flock’s locking is implemented in such a way that the lock is retained until “flock” *and* the daemonized process quit! For example, if you rely on daemon.sh doing something – no matter whether it will succeed in starting a daemon instance – flock will prevent you from doing that more than once (as long as your daemon is holding the first lock in the background!). Personally, I use a little hack: I test if the lockfile is still locked after returning from flock (using flock again), and if it is, I just delete it.

simple locking examples

http://stackoverflow.com/questions/8866175/locking-in-bash-again-how-to-prevent-lock-propagation/

# don't delete the lock file! ever!
set -eE
exec 9>>lockfile
# blocking, children will hold the lock, too
flock 9
# non-blocking, children will hold the lock, too
flock -n 9 
# blocking, children won't hold the lock
flock -o 9
# non-blocking, children won't hold the lock
flock -n -o 9 
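
To see that a lock taken on a file descriptor really excludes others, one can try to lock the same file a second time through an independent descriptor. This is a sketch under the assumption that flock from util-linux is installed. The lock file is a throwaway temporary file, which is the only reason it gets removed at the end.

```shell
# Throwaway lock file for the demonstration; a real script would use a
# fixed path and leave the file in place.
lockfile=$(mktemp)

result=$(
    {
        # take the lock on fd 9 or give up immediately
        flock -n 9 || { echo "already running"; exit 1; }
        # while fd 9 holds the lock, a second, independent open of the
        # same file cannot be locked
        if ( flock -n 8 ) 8>>"$lockfile"; then
            echo "second lock granted"
        else
            echo "second lock refused"
        fi
    } 9>>"$lockfile"
)
echo "$result"          # prints: second lock refused
rm -f "$lockfile"
```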

Examples

Run only once a day

In order to prevent running portions of a script more than once a day (e.g. ‘emerge --sync’), one could do:

 TIMETAG=`date +%Y%m%d`
 TIMETAGFILE=~/.emerge_sync_timetag
 # check whether emerge --sync has already been run successfully today
 OLDTAG=`cat "$TIMETAGFILE" 2>/dev/null`
 if [ "$OLDTAG" != "$TIMETAG" ]; then
   emerge --sync && echo "$TIMETAG" > "$TIMETAGFILE"
 fi
 # download updated/new packages first
 emerge --update --deep --fetchonly world || emerge --update --fetchonly world
 # then build and install them
 emerge --update --deep world || emerge --update world
 # update ebuild db
 eupdatedb
 # update configuration files
 etc-update
 echo "All done. Press ENTER."
 read # run in an xterm, so we don't want to let it go away too fast

Advanced File Processing

The following does not spawn an extra process! Changes to variables done inside the loops will remain.

while read line
do
  for word in $line
  do
    # do something nasty...
    # BEWARE: the ssh program terminates standard input
    # and would thus interfere with the input of the outer
    # loop if we would not attach its input to /dev/null!
    ssh user@host "cmd" </dev/null || die "error!"
  done
done < "$INPUTFILE"

Error Handling

#!/bin/bash

die() {
  echo "Died in line $BASH_LINENO in script $BASH_SOURCE."
}

if ... die

Usually, one may infer from the exit code where some bash script has failed. However, one has to keep the exit codes unique and that is sometimes a rather tedious task. So, if you do not need specific exit codes for calling programs (which most often only check if your script was successful or not), just using a die function like the one depicted above is not only easy to use, but also much nicer and contains more information.

Note: that does not work with BASH version 2.05, only with version 3.x and later. The older BASH versions only support the LINENO variable, but that one only shows the current line number, not the line number of the calling code (it would return 7 instead of 4 in the previous example code).
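
A slightly fuller variant of such a die function might look like the sketch below. It is an assumption about how one could extend the idea, not the original author's code: it writes to stderr, takes a message, and actually exits.

```shell
die() {
    # BASH_LINENO[0] is the line die was called from, BASH_SOURCE[1] the
    # file containing that call (falls back to $0 if it is not set).
    echo "Died in line ${BASH_LINENO[0]} in script ${BASH_SOURCE[1]:-$0}: $*" >&2
    exit 1
}

# run in a subshell so that this demonstration does not end the calling shell
( false || die "something failed" )
echo "exit code: $?"        # prints exit code: 1
```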

Redirection Tricks

Source: http://www.faqs.org/docs/Linux-mini/Remote-X-Apps.html

#!/bin/sh

if [ $# -lt 2 ]
then echo "usage: `basename $0` clientuser command" >&2
     exit 2
fi

CLIENTUSER="$1"
shift

# FD 4 becomes stdin too
exec 4>&0

xauth list "$DISPLAY" | sed -e 's/^/add /' | {

    # FD 3 becomes xauth output
    # FD 0 becomes stdin again
    # FD 4 is closed
    exec 3>&0 0>&4 4>&-

    exec su - "$CLIENTUSER" -c \
         "xauth -q <&3
          exec env DISPLAY='$DISPLAY' "'"$SHELL"'" -c '$*' 3>&-"

} 

Threading

# note: you may pause and restart processes by sending them
# the STOP and CONT signals (using kill or killall). A list
# is obtained using "ps".
{
  some code
} &
# or with a separation of the execution block from the rest of the script
(
  ...
) &
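
Background blocks like these can be collected again with wait, which also hands back each job's exit status. A minimal sketch with a hypothetical worker function:

```shell
# worker is a hypothetical job; it exits with the status it is given
worker() { sleep 1; return "$1"; }

worker 0 & pid1=$!          # $! is the PID of the most recent background job
worker 3 & pid2=$!

rc1=0; wait "$pid1" || rc1=$?
rc2=0; wait "$pid2" || rc2=$?
echo "rc1=$rc1 rc2=$rc2"    # prints rc1=0 rc2=3
```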

Exit Codes in Pipes

root@voyager ~ # set +o pipefail                # default behaviour
root@voyager ~ # ls /non-existing-dir | cat
ls: cannot access /non-existing-dir: No such file or directory
root@voyager ~ # echo "${PIPESTATUS[*]} = $?"
2 0 = 0
root@voyager ~ # set -o pipefail
root@voyager ~ # ls /non-existing-dir | cat
ls: cannot access /non-existing-dir: No such file or directory
root@voyager ~ # echo "${PIPESTATUS[*]} = $?"
2 0 = 2

Easy Error Checking/Handling / Reference BASH Script Header

# abort script whenever a command fails (returns with non-zero exit code):
set -e
# show everything BASH currently processes in your script:
set -v

I usually use the following header in my BASH scripts (especially when running them as cron jobs):

set -e || exit 1
umask 077
exec >> /var/log/progname.log
exec 2>&1
set -v
set -x
set -o pipefail

Process Information

If you want to run lower-prioritized background tasks, you may want to check whether you are already running at a low-priority level (there is a maximum on niceness…):

mynicelvl=$(ps h -o %n $$)

Job Control / Background Tasks / Parallel Execution

BASH provides some convenience methods to run tasks in parallel. However, there may be some caveats. Example:

ddcmd="nice -n 20 dd if=/dev/urandom of=/dev/null bs=1M"
# start jobs in background
$ddcmd > log1 &
$ddcmd > log2 &
# stop em -- NOT working that way...
kill -INT "%$ddcmd > log1"
kill -INT "%$ddcmd > log2"
# stop em -- working...
kill -INT "%\$ddcmd > log1"
kill -INT "%\$ddcmd > log2"
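
An alternative that avoids job specs altogether is to remember the process ID from $! right after starting each job. The sketch below uses sleep as a stand-in for the dd command. It sends TERM rather than INT because background jobs started by a non-interactive script ignore SIGINT.

```shell
# sleep stands in for a long-running command
sleep 60 & pid1=$!
sleep 60 & pid2=$!

# stop both jobs by PID
kill -TERM "$pid1" "$pid2"

rc=0
wait "$pid1" || rc=$?
echo "first job ended with status $rc"   # 143 = 128 + 15 (SIGTERM)
wait "$pid2" || true
```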

Parallel Execution Via Workers (Run N Processes in Parallel)

# run 3 bzip2 processes in parallel over any number of input files
ls * | xargs -d \\n -P 3 -n 1 bzip2 -9v
# same
find . -type f -print0 | xargs -0 -P 3 -n 1 bzip2 -9v

Terminal Escape/Control Sequences

Auto-Logout on Console

# add to /etc/profile:
if [ "x$DISPLAY" = "x" ]; then
    TMOUT=600
fi

GUI Tools

parallel shell scripting

Bugs??

user@i5 ~ $ echo $(echo -n "START"; set -e || exit 1; set -E || exit 2; afunc() { echo AAA | { while read l; do false; echo "SOME"; done } }; afunc ; true)
START
user@i5 ~ $ echo $(echo -n "START"; set -e || exit 1; set -E || exit 2; afunc() { echo AAA | { while read l; do false; echo "SOME"; done } }; afunc ; true) || :
STARTSOME