\documentclass{ictlab}
\RCS $Revision: 1.10 $
\usepackage{alltt,key,xr,cols}
\externaldocument[lt-]%
{../../linux_training-plus-config-files-ossi/build/masterfile}
\usepackage[pdfpagemode=None,pdfauthor={Nick Urbanik}]{hyperref}

\newcommand*{\labTitle}{Shell Programming---an Introduction}
\renewcommand*{\subject}{Operating Systems and Systems Integration}
\providecommand*{\RPM}{\acro{RPM}\xspace}
\providecommand*{\CD}{\acro{CD}\xspace}

\begin{document}

%\Large
\tableofcontents

\section{Aim}
\label{sec:aim}

After successfully working through this exercise, you will:
\begin{itemize}
\item write simple shell scripts using \texttt{for}, \texttt{if},
  \texttt{while} statements;
\item understand basic regular expressions, and be able to create your
  own regular expressions;
\item understand how to execute and debug these scripts;
\item understand some simple shell scripts written by others;
\item be ready to begin automated editing of configuration files
  using \texttt{sed}; and
\item be ready to begin customizing an installation using an automated
  installation method called \texttt{kickstart}, which will be our
  next laboratory topic.
\end{itemize}

\section{Background}
\label{sec:background}

A working knowledge of shell scripting is essential to everyone
wishing to become reasonably adept at system administration, even if
they do not anticipate ever having to actually write a script.
Consider that as a Linux machine boots up, it executes the shell
scripts in \texttt{/etc/rc.d} to restore the system configuration and
set up services.  A detailed understanding of these startup scripts
is important for analyzing the behaviour of a system, and possibly
modifying it.

Writing shell scripts is not hard to learn, since the scripts can be
built in bite-sized sections and there is only a fairly small set of
shell-specific operators and options to learn.
The syntax is simple and straightforward, similar to that of invoking
and chaining together utilities at the command line, and there are
only a few ``rules'' to learn.  Most short scripts work right the
first time, and debugging even the longer ones is straightforward.

A shell script is a ``quick and dirty'' method of prototyping a
complex application.  Getting even a limited subset of the
functionality to work in a shell script, even if slowly, is often a
useful first stage in project development.  This way, the structure
of the application can be tested and played with, and the major
pitfalls found before proceeding to the final coding in C, \Cpp,
Java, or Perl.

Shell scripting hearkens back to the classical UNIX philosophy of
breaking complex projects into simpler subtasks, of chaining together
components and utilities.

\subsection{Where to get more information}
\label{sec:references}

There is a free on-line book about shell programming at:
\url{http://www.linuxdoc.org/LDP/abs/html/index.html} and
\url{http://www.linuxdoc.org/LDP/abs/abs-guide.pdf}.  A handy
reference to shell programming is:
\begin{verbatim}
$ pinfo bash
\end{verbatim}%$
or
\begin{verbatim}
$ man bash
\end{verbatim}%$

\section{The Shebang}
\label{sec:shebang}

A shell script is started by the Linux kernel.  The kernel reads the
first two bytes of the executable file to determine how to execute
it.  If the file starts with the characters ``\texttt{\#!}'', the
kernel treats it as a script to be run by an interpreter.  The kernel
then reads the characters after the ``\texttt{\#!}'' to determine
which interpreter to use.  For shell scripts, the interpreter is
\texttt{/bin/sh}, so the first line of all our shell scripts is:
\begin{verbatim}
#! /bin/sh
\end{verbatim}
If you make any typing mistake in the name of the interpreter, you
will get an error message such as ``bad interpreter: No such file or
directory.''

\section{Making the script executable}
\label{sec:chmod+x}

To easily execute a script, it should:
\begin{itemize}
\item be on the \texttt{PATH};
\item have execute permission.
\end{itemize}
How do you do each of these?
\begin{itemize}
\item By default, Red Hat Linux includes the directory
  $\sim$\texttt{/bin} on the \texttt{PATH}, so create this directory,
  and put your scripts there.
\item If your script is called \texttt{script}, then this command
  will make it executable:
\begin{verbatim}
$ chmod +x script
\end{verbatim}%$
\end{itemize}

\section{True and False}
\label{sec:true-and-false}

Shell programming relies heavily on external programs.  A program
that succeeds has an \emph{exit status} of 0; a program that fails
has a non-zero exit status.  As a result, shell programming uses the
value 0 as true, and non-zero as false.

\section{Shell Variables}
\label{sec:variables}

When using the value of a variable, the variable starts with a dollar
sign `\texttt{\$}'.  When assigning a value to a variable, the
variable has no dollar sign.  An assignment has no spaces either side
of the `\texttt{=}':
\begin{verbatim}
a=375
hello=$a
PATH="$PATH:/sbin:/usr/sbin"
\end{verbatim}

\subsection{Baby Can't Change Parent}
\label{sec:baby-cant-change-parent}

Nonsense, any parent will tell me.  Okay, I'm talking about
processes, not humans.  When a parent process \texttt{fork()}s and
has a child process, the child process inherits all the environment
variables of the parent.  But the child process cannot change any
environment variable of the parent.  If you write a shell script that
sets some environment variables and then exits, you will find that
all these new values have disappeared from the shell that ran it.
This applies to subshells too, so values set in a subshell are
``local''.
To execute some commands in a subshell, put parentheses around them.
See the example in section~\ref{sec:special-variables}.  Here is an
example of what I am talking about:
\begin{verbatim}
$ echo $HOME
/home/nicku
$ pwd
/home/nicku/teaching/ict/ossi/lab/shell
$ cat baby
#! /bin/sh
cd /usr
HOME="Tsing Yi"
echo $HOME
pwd
$ ./baby
Tsing Yi
/usr
$ echo $HOME
/home/nicku
$ pwd
/home/nicku/teaching/ict/ossi/lab/shell
\end{verbatim}%$

\section{Special Variables}
\label{sec:special-variables}

Parameters may be passed to a shell script.  `\texttt{\$0}' is the
name of the shell script itself.  The first parameter is called
`\texttt{\$1}', the second is `\texttt{\$2}' and so on.  The number
of parameters is `\texttt{\$\#}'.  A list of all the parameters is in
the variables `\texttt{\$*}' and `\texttt{\$@}'.  The only difference
between `\texttt{\$*}' and `\texttt{\$@}' is when they are enclosed
in double quotes---see section~\vref{pag:dollar-star-quoting} on
quoting.

\texttt{IFS} is the ``\emph{internal field separator}''.  The shell
automatically splits the result of an unquoted variable expansion
into fields at the characters in \texttt{IFS}\@.  Here is a simple
example:
\begin{verbatim}
$ (IFS=:; echo $PATH)
/usr/kerberos/bin /usr/local/bin /usr/bin /bin /usr/bin/X11 /usr/games /usr/bin /usr/X11R6/bin /opt/OpenNMS/bin /usr/java/jdk1.3.1_01/bin /home/nicku/bin /sbin /usr/sbin /usr/local/sbin
\end{verbatim}
I changed \texttt{IFS} in a subshell so that the value of
\texttt{IFS} in the current shell would not be changed.  This works
rather like a local variable.

\section{Special Characters}
\label{sec:special-characters}

Comments start with a `\texttt{\#}'.

Statements are separated either by newlines, or by semicolons
`\texttt{;}'.

The dot command `\texttt{.}' reads a file and executes the commands
in it.  It is useful for executing a login script:
\begin{verbatim}
. ~/.bash_profile
\end{verbatim}
It is useful here, because it does not execute the commands in a
separate subshell.  Hence, all changes to variables remain.

The `\texttt{\$}' symbol indicates that a variable name comes next,
and gives the value of that variable.
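
To illustrate the dot command described above, here is a minimal
sketch that runs the same one-line file twice: once with \texttt{sh},
as a child process, and once with the dot command, in the current
shell.  The temporary file name and the variable \texttt{greeting}
are invented for this illustration only.

```shell
#! /bin/sh
# A one-line file that sets a variable (the file name is made up).
tmp=${TMPDIR:-/tmp}/setvar.$$
echo 'greeting=hello' > "$tmp"

greeting=unchanged
sh "$tmp"                  # child process: its assignment is lost
echo "after sh:  $greeting"

. "$tmp"                   # current shell: the assignment remains
echo "after dot: $greeting"

rm -f "$tmp"
```

Only the second \texttt{echo} should show the new value, since the
child started by \texttt{sh} cannot change its parent.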
The backslash \texttt{"\bs"} has many meanings, mostly similar to its
behaviour in the C programming language.  At the end of a line, a
backslash allows a long line to be split into shorter pieces.

There are many other characters that are special to the shell.  See
chapter~4 of \url{http://www.linuxdoc.org/LDP/abs/html/index.html}.

\section{Quoting}
\label{sec:quoting}

There are four main ways of quoting: forward single quotes, double
quotes, the backslash, and backward single quotes.

Quoting causes the quoted material to have a different meaning from
normal.  In particular, the special treatment the shell gives to
special characters is suppressed to some degree.

Enclosing in double quotes \texttt{"..."} suppresses all special
behaviour, except for variable interpretation (\texttt{\$}), the
backward quote (\texttt{`}), and the backslash.

Enclosing in single forward quotes \texttt{'...'} suppresses the
special behaviour of all special characters.

Putting a backslash in front of a character preserves the literal
value of the character, except for newline: a backslash followed by a
newline continues the line.

Single back quotes \texttt{`...`} mean: ``execute the command within
these quotes and put its output back here.''  This is called
\emph{command substitution} in the bash manual.  Command substitution
is really quite different from the other three quoting methods.  Here
is an example using the \texttt{hostname} command, which prints the
hostname on standard output:
\begin{verbatim}
$ hostname
nickpc.tyict.vtc.edu.hk
$ h=hostname
$ echo $h
hostname
$ h=`hostname`
$ echo $h
nickpc.tyict.vtc.edu.hk
\end{verbatim}%$

\subsection{When to use quoting}
\label{sec:when-to-quote}

Many programs, such as \texttt{grep} or \texttt{find}, need some
special characters that they themselves will interpret.  We need to
be able to send these characters unchanged to the program.  In this
case, quote them.  Example:
\begin{verbatim}
$ find . -name "*.rpm"
\end{verbatim}%$
If we do not quote the asterisk, the shell will expand
\texttt{*.rpm} to match only the \texttt{rpm} files in the current
directory, but we want \texttt{find} to locate all the \texttt{.rpm}
files in the directories \emph{below} the current directory also.

If you want a variable value that contains spaces to not be
automatically split by the shell, then quote it.  Here,
\texttt{testquote} is a short shell script that prints information
about its parameters:
\begin{verbatim}
$ test="one two"
$ testquote $test
You have 2 parameters.  They are:
parameter 1: one
parameter 2: two
$ testquote "$test"
You have 1 parameters.  They are:
parameter 1: one two
\end{verbatim}%$

\label{pag:dollar-star-quoting}%
Note that \texttt{"\$*"} is one value (not split up), while
\texttt{"\$@"} is split into the original parameters.  So if
`\texttt{\$\#}' had the value 4, then there are four separately
quoted values in \texttt{"\$@"}.  See the beginning of
section~\vref{sec:special-variables}.

Here is a little example showing the difference between
\texttt{"\$@"} and \texttt{"\$*"}:
\begin{verbatim}
$ cat test_at_star
#! /bin/sh
testquote "$@"
testquote "$*"
$ test_at_star one two three
You have 3 parameters.  They are:
parameter 1: one
parameter 2: two
parameter 3: three
You have 1 parameters.  They are:
parameter 1: one two three
\end{verbatim}
Notice how \texttt{"\$*"} just turned into one long parameter that
contains spaces.

\subsection{Printing Output}
\label{sec:output}

We use \texttt{echo} to print things.  By default, it puts a newline
at the end.  To avoid printing a newline, use \texttt{echo -n}:
\begin{verbatim}
$ cat echo-n
#! /bin/sh
echo "Hello "
echo World
echo -n "Hello "
echo World
$ ./echo-n
Hello
World
Hello World
\end{verbatim}

\subsection{Reading Input}
\label{sec:input}

There are many ways of reading input, but one simple way is to use
\texttt{read}:
\begin{verbatim}
$ read answer
yes
$ echo $answer
yes
\end{verbatim}%$

\section{The Basic Statements}
\label{sec:statements}

The shell is a complete programming language, and supports
\texttt{for} loops, \texttt{while} loops, \texttt{if} statements,
\texttt{case} statements (like \texttt{switch} in C), as well as
function calls.  We look at only a small subset of these.

\subsection{The \texttt{if} statement}
\label{sec:if}

The syntax of the \texttt{if} statement is:
\begin{alltt}
if \emph{test-commands}
then
    \emph{statements}
fi
\end{alltt}
We can add an \texttt{else}:
\begin{alltt}
if \emph{test-commands}
then
    \emph{statements-if-true}
else
    \emph{statements-if-false}
fi
\end{alltt}
and we can test further conditions in the same statement; each is
introduced with the keyword \texttt{elif} and needs its own
\texttt{then}:
\begin{alltt}
if \emph{test-commands-1}
then
    \emph{statements-if-test-commands-1-true}
elif \emph{test-commands-2}
then
    \emph{statements-if-test-commands-2-true}
else
    \emph{statements-if-all-test-commands-false}
fi
\end{alltt}
The \emph{test-commands} is either:
\begin{itemize}
\item a program being executed, or
\item a test made using the program \texttt{test}; see \texttt{man
    test} for all the tests you can make using \texttt{test}.  Also
  see section~\vref{sec:test}.
\end{itemize}
A simple example:
\begin{verbatim}
if grep nick /etc/passwd > /dev/null 2>&1
then
    echo Nick has a local account here
else
    echo Nick has no local account here
fi
\end{verbatim}
We redirect all output from \texttt{grep} to avoid the side effect of
printing the line \texttt{grep} found.

If you want to put the \texttt{then} on the same line as the
\texttt{if}, you need to put a semicolon before the \texttt{then}.
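
As a further sketch of the \texttt{elif} form, the following script
uses \texttt{test} (written as \texttt{[ ... ]}) to report what kind
of file a name refers to.  The function name \texttt{classify} and
the paths passed to it are assumptions made for this illustration.

```shell
#! /bin/sh
# classify: hypothetical helper that reports what kind of file $1 is.
classify() {
    if [ -d "$1" ]; then
        echo "$1 is a directory"
    elif [ -f "$1" ]; then
        echo "$1 is a regular file"
    else
        echo "$1 is something else, or does not exist"
    fi
}

classify /
classify /etc/passwd
classify /no/such/file
```

Each \texttt{then} follows a semicolon here because it shares a line
with its \texttt{if} or \texttt{elif}.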
Here is another example that adds the user \texttt{nicku} to the
sudoers file if that user is not there already:
\begin{verbatim}
if ! grep nicku /etc/sudoers > /dev/null 2>&1; then
    echo "nicku ALL=(ALL) ALL" >> /etc/sudoers
fi
\end{verbatim}

\subsection{The \texttt{while} statement}
\label{sec:while}

The format of the \texttt{while} statement is:
\begin{alltt}
while \emph{test-commands}
do
    \emph{loop-body-statements}
done
\end{alltt}
Again, if you want to put the \texttt{do} on the same line as the
\texttt{while}, then you need an extra semicolon before the
\texttt{do}.  A simple example:
\begin{verbatim}
i=0
while [ "$i" -lt 10 ]; do
    echo -n "$i "       # -n suppresses newline.
    i=`expr $i + 1`     # i=$(($i+1)) also works.
done
\end{verbatim}%$
The square brackets are an example of the \texttt{test} program.  See
section~\vref{sec:test}.

\subsubsection{\texttt{expr}}

In the last example using a \texttt{while} loop, we used the program
\texttt{expr} to do arithmetic.  This is the portable way to do
arithmetic in shell programming.  Note that since \texttt{expr}
prints its output on standard output, we use command substitution to
assign the program output to the variable \texttt{i}.  See the manual
page for \texttt{expr} for more information.

\subsection{The \texttt{for} statement}
\label{sec:for}

The format of the \texttt{for} statement is:
\begin{alltt}
for \emph{name} in \emph{words}
do
    \emph{loop-body-statements}
done
\end{alltt}
Here is a simple example:
\begin{verbatim}
for planet in Mercury Venus Earth Mars Jupiter Saturn Uranus Neptune Pluto
do
    echo $planet
done
\end{verbatim}%$
You can leave the \texttt{in \emph{words}} out; in that case,
\texttt{\emph{name}} is set to each positional parameter in turn.

Here is another example:
\begin{verbatim}
for i in *.txt
do
    echo "$i"
    grep 'lost treasure' "$i"
done
\end{verbatim}
Note that the shell will expand the wildcard characters into a list
of file names that you can process one by one in the loop.  The
double quotes around \texttt{\$i} keep a file name that contains
spaces in one piece.
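
The form of \texttt{for} without \texttt{in \emph{words}} can be
sketched as follows.  The function name \texttt{count\_args} is made
up for this illustration; inside a function, the positional
parameters are the arguments given to that function.

```shell
#! /bin/sh
# count_args: hypothetical function.  "for arg" with no "in" part
# steps through the positional parameters, as if written: in "$@"
count_args() {
    n=0
    for arg
    do
        n=`expr $n + 1`
        echo "parameter $n: $arg"
    done
}

count_args one "two three" four
```

The quoted argument \texttt{"two three"} should arrive as a single
parameter, so the loop body runs three times, not four.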

\subsection{\texttt{break} and \texttt{continue}}
\label{sec:break-and-continue}

Inside loops you can use the \texttt{break} and \texttt{continue}
statements.  They work like they do in C\@.

\subsection{The \texttt{test} program}
\label{sec:test}

The \texttt{test} program is used to perform comparisons of strings,
numbers and files, often used with the \texttt{if} and
\texttt{while} statements.  I will not waste space by copying the
manual page here: do \texttt{man test} to read all about
\texttt{test}.

You can call the test program two ways: one as the name
\texttt{test}, the other (more common way) as \texttt{[ ... ]}.  If
we look, there is a program called ``\texttt{[}'':%]
\begin{verbatim}
$ which [
/usr/bin/[
$ ls -l /usr/bin/[
lrwxrwxrwx    1 root     root       4 Nov 25 13:36 /usr/bin/[ -> test
\end{verbatim}%]]]

\paragraph{Important Note:} there must be white space before and
after the ``\texttt{[}'', and before the closing ``\texttt{]}'':%]
\begin{verbatim}
i=0
while ["$i" -lt 10]; do
    echo -n "$i "
    i=`expr $i + 1`
done
bash: [0: command not found
\end{verbatim}%$]

\subsection{Using the ``\texttt{\&\&}'' and
  ``\texttt{\textbar\textbar}'' Operators for Flow Control}
\label{sec:operators}

The shell has many operators; see the manual for a complete list.
But here we look at two familiar operators that are surprisingly
useful, and yet which may have a use that is unfamiliar to you.  They
are used for logical operations, and are short-circuit logical
operators, just as you are familiar with in the C and Java
programming languages.  However, in shell programming, they are used
for flow control, rather like an \texttt{if} statement.

Suppose we have a shell script that we must call with two parameters,
and that it should fail if there are fewer or more parameters.  We
can use the `\texttt{\&\&}' operator after a test and exit.  Here is
a little shell script that will only accept two parameters and will
exit with a help message otherwise:
\begin{verbatim}
#! /bin/sh
[ $# -ne 2 ] && echo $0 parameter1 parameter2 && exit
echo parameter1 is $1, and parameter2 is $2.
\end{verbatim}
So let's run it, first with no parameters, then with two:
\begin{verbatim}
$ ./two-parameters
./two-parameters parameter1 parameter2
$ ./two-parameters p q
parameter1 is p, and parameter2 is q.
\end{verbatim}
The syntax is like this:
\begin{alltt}
\emph{command1} && \emph{command2}
\end{alltt}
\texttt{\emph{command2}} will execute only if
\texttt{\emph{command1}} is successful.

Similarly, the syntax for the `\verb!||!' operator is:
\begin{alltt}
\emph{command1} || \emph{command2}
\end{alltt}
\texttt{\emph{command2}} will execute only if
\texttt{\emph{command1}} is \emph{not} successful.

\section{Regular Expressions}
\label{sec:regexp}

Much of what a system administrator does is editing configuration
files.  There are tools to help with this; one such tool is the
program \texttt{sed}; another is the programming language Perl\@.
One thing that is useful with both is \emph{regular expressions}.
The \texttt{grep} command also uses regular expressions.

Regular expressions provide a way of matching patterns in a text
file; they can also provide a way of altering the text that matches
the pattern.  Getting started with regular expressions is our aim
today.

A regular expression is a string of characters.  Some of these
characters have a special meaning; most do not.  The characters with
a special meaning are called \emph{metacharacters}.  Here are some
example regular expressions without metacharacters:
\begin{verbatim}
/nicku/         # simply matches the string "nicku"
/hacker/        # simply matches the string "hacker"
\end{verbatim}

\subsection{Some Funny Characters (metacharacters)}
\label{sec:metachars}

\begin{description}
\item[Asterisk: \texttt{*}] matches zero or more of the thing that
  came just before.
  Example: \texttt{1133*} matches 11 followed by one or more 3's, so
  it will match: 113 or 1133 or 11333 or 1133333333333333\ldots
\item[Dot: \texttt{.}] matches any single character, except newline.
  So \texttt{".*"} matches zero or more of any character.
\item[{Caret: \textasciicircum}] matches the beginning of a line;
  inside brackets it means something different (see below).
\item[Dollar sign: \texttt{\$}] matches the end of a line.  For
  example, ``\textasciicircum\texttt{\$}'' matches blank lines.
\item[Brackets: \texttt{[...]}] matches one character from the set in
  the brackets.  Examples:

  \texttt{"[xyz]"} matches the characters \texttt{x}, \texttt{y}, or
  \texttt{z}.

  \texttt{"[c-n]"} matches any of the characters in the range
  \texttt{c} to \texttt{n}.

  \texttt{"[B-Pk-y]"} matches any of the characters in the ranges
  \texttt{B} to \texttt{P} and \texttt{k} to \texttt{y}.

  \texttt{"[a-z0-9]"} matches any lowercase letter or any digit.

  \texttt{"[\textasciicircum{}b-d]"} matches all characters
  \emph{except} those in the range \texttt{b} to \texttt{d}.

  Combined sequences of bracketed characters match common word
  patterns.  \texttt{"[Yy][Ee][Ss]"} matches yes, Yes, YES,
  yEs,\ldots

  \texttt{"[0-9][0-9][0-9][0-9][0-9][0-9]\allowbreak[0-9]\allowbreak
    [0-9]\allowbreak[0-9]"} matches any \IVE student number.
\item[Backslash: \texttt{\bs}] quotes a metacharacter to take away
  its special meaning.  You can match a literal \texttt{"\$"} with
  \texttt{"\bs\$"}, or a backslash with \texttt{"\bs\bs"}.
\item[Ampersand: \texttt{\&}] stands, in a replacement string, for
  the text that matched the pattern.  See below.
\end{description}

\subsection{Sed}
\label{sec:sed}

The \texttt{sed} program (\textbf{s}tream \textbf{ed}itor) is a
non-interactive editing program.  We will look only at a subset of
its behaviour today: substitutions.
Let's start with an example:
\begin{verbatim}
sed '/nicku/s//nickl/' /tmp/sudoers-orig > /tmp/sudoers
\end{verbatim}
On each line of the input file \texttt{/tmp/sudoers-orig},
\texttt{sed} will replace the first instance of \texttt{nicku} with
\texttt{nickl} and send the result to the output.

Let's pull that expression \texttt{/nicku/s//nickl/} apart to see how
it works.  It begins with an \emph{address}, which is a simple
regular expression without any metacharacters:
\begin{verbatim}
/nicku/
\end{verbatim}
This \texttt{sed} address matches all lines in the input file that
contain the string \texttt{nicku}.  It will apply the substitute
operation to them.

The next part is a \emph{substitution expression}:
\begin{verbatim}
s//nickl/
\end{verbatim}
The syntax of a substitution expression is:
\begin{alltt}
s/\emph{pattern to replace}/\emph{replacement}/
\end{alltt}
Here, the \emph{pattern to replace} is empty: that means that we use
the value from the address pattern, so we are replacing the string
\texttt{nicku} with the \emph{replacement}, \texttt{nickl}.

So if the input file \texttt{/tmp/sudoers-orig} contains this line:
\begin{verbatim}
nicku   ALL=(ALL) ALL
\end{verbatim}
then the output file will contain:
\begin{verbatim}
nickl   ALL=(ALL) ALL
\end{verbatim}
instead.

Here is another example:
\begin{verbatim}
sed '/^\/misc/s//#&/' /tmp/auto.master-orig > /tmp/auto.master
\end{verbatim}
What does this do?  It takes as input the file
\texttt{/tmp/auto.master-orig}, then finds each line starting with
\texttt{/misc}, and puts a comment character before it.  The edited
output file is \texttt{/tmp/auto.master}.

Again, let us examine this expression \verb!'/^\/misc/s//#&/'!
part-by-part.  It begins with an \emph{address}, which (in this case)
is a regular expression:
\begin{verbatim}
/^\/misc/
\end{verbatim}
This means: apply the command to lines that start with (the
\texttt{"\textasciicircum"} metacharacter) \texttt{/misc}.
We have to use a backslash to quote the forward slash, because
otherwise the forward slash would mark the end of the regular
expression, rather than a literal forward slash.

The rest is a \emph{substitution expression}:
\begin{verbatim}
s//#&/
\end{verbatim}
Again the \emph{pattern to replace} is empty: that means that we use
the value from the address pattern, so we are replacing the string
\texttt{/misc} with the \emph{replacement}.  The hash symbol
\texttt{"\#"} in the replacement expression is a literal hash, i.e.,
the comment character that we are inserting.  The special
metacharacter \texttt{"\&"} stands for the entire text that matched
the \emph{pattern to replace}.

So we are replacing a line on the input file like this:
\begin{verbatim}
/misc   /etc/auto.misc  --timeout 60
\end{verbatim}
with this in the output:
\begin{verbatim}
#/misc  /etc/auto.misc  --timeout 60
\end{verbatim}

\subsection{Where can I find out more about sed?}
\label{where-can-i-find-out-more-about-sed}

The book \url{http://www.linuxdoc.org/LDP/abs/html/index.html} has an
appendix about \texttt{sed}.  \texttt{sed} has a rather limited
manual page, but there is an \acro{FAQ} at
\url{http://www.ptug.org/sed/sedfaq.htm}.

\section{Finding examples of shell scripts on your computer}
\label{sec:examples}

Your Linux system has a large number of shell scripts that you can
refer to as examples.  I counted about 1400.  Here is one way of
listing their file names:
\begin{verbatim}
$ file /bin/* /usr/bin/* /usr/sbin/* /sbin/* /etc/rc.d/* /usr/X11R6/bin/* \
     | grep -i "shell script" | awk -F: '{print $1}'
\end{verbatim}%$
Let's see how this works.  I suggest executing the commands
separately to see what they do:
\begin{verbatim}
$ file /bin/* /usr/bin/*
$ file /bin/* /usr/bin/* | grep -i "shell script"
$ file /bin/* /usr/bin/* | grep -i "shell script" | awk -F: '{print $1}'
\end{verbatim}
The \texttt{awk} program is actually a complete programming language.
It is mainly useful for selecting columns of data from text.
\texttt{awk} automatically loops through the input, and divides the
input lines into fields.  It calls these fields \texttt{\$1},
\texttt{\$2},\dots\texttt{\$NF}\@.  \texttt{\$0} contains the whole
line.  Here the option \texttt{-F:} sets the \emph{field separator}
to the colon character.  Normally it is any white space.  So
printing \texttt{\$1} here prints what comes before the colon, which
is the file name.

Suppose you want to find all shell scripts that contain a particular
command or statement.  For example, to look for shell scripts that
use the \texttt{mktemp} command:
\begin{verbatim}
$ file /bin/* /usr/bin/* /usr/sbin/* /sbin/* /etc/rc.d/* /usr/X11R6/bin/* \
     | grep -i 'shell script'| awk -F: '{print $1}' | xargs grep mktemp
\end{verbatim}%$
Here is a useful little shell script that does this:
\begin{verbatim}
#! /bin/sh

if [ $# -eq 0 ]
then
    cmd=`basename $0`
    echo $cmd: search all Bourne shell scripts for a command
    echo usage: $cmd [grep-options] command-to-grep-for
    echo the grep-option -l is useful
    exit 1
fi
(
    IFS=:
    for d in $PATH
    do
        file $d/*
    done
    find /etc/rc.d -type f | xargs file
) \
    | grep 'Bourne.* shell script' \
    | awk -F: '{print $1}' \
    | xargs grep "$@"
\end{verbatim}%$
We run the \texttt{for} loop in a subshell to make the change to
\texttt{IFS} local.  \texttt{IFS} is the ``internal field
separator''.  The shell will automatically split lines into fields
separated by the \texttt{IFS}\@.

\subsection{Where can I find out more about awk?}
\label{Where can I find out more about awk?}

There is a whole book about \texttt{awk}; you can buy it from
O'Reilly for about \$300 HK, or you can read it online at
\url{http://www.ssc.com/ssc/eap/}.
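
As a small sketch of field selection with \texttt{awk}, the following
script prints the first and last colon-separated fields of each input
line.  The function name \texttt{first\_and\_last} and the two sample
accounts are invented for this illustration; the lines merely follow
the layout of \texttt{/etc/passwd}.

```shell
#! /bin/sh
# first_and_last: hypothetical helper.  Reads lines on standard input
# and prints the first field and the last field ($NF) of each one,
# using the colon as the field separator.
first_and_last() {
    awk -F: '{ print $1, $NF }'
}

# Sample input in /etc/passwd format (made-up accounts).
first_and_last <<'EOF'
nicku:x:500:500:Nick Urbanik:/home/nicku:/bin/bash
guest:x:501:501:Guest User:/home/guest:/bin/sh
EOF
```

Here \texttt{\$1} is the login name and \texttt{\$NF} is the login
shell, because the shell is the last field on each line.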
%
% perl -e '@path=split /:/, $ENV{PATH};for $d ( @path ) {print "$d\n"}'
% find /etc/rc.d -type f | xargs file | grep -i 'shell script'\
%    | awk -F: '{print $1}' | wc -l
% 117
% perl -e '@path=split /:/, $ENV{PATH};for $d ( @path ) {print "$d\n"}' \
%    | while read d;do file $d/*|grep -i 'shell script' \
%    | awk -F: '{print $1}'; done | wc -l
% 1275
%
% Better than using perl, use word splitting in shell:
% (IFS=:;for d in $PATH;do echo $d;done)
%$

\section{Debugging Shell Scripts}
\label{sec:debugging}

It is best to write shell scripts incrementally: write part, test
that it works, and continue until your script does what is required.

You can use \texttt{echo} statements to print the values of
variables.

You can run the script with the \emph{verbose} option to the shell,
which prints each line of the script as the shell reads it.  For a
script called \texttt{script}, you could run it in verbose mode like
this:
\begin{verbatim}
$ sh -v script
\end{verbatim}%$
You can see each command after it has been expanded by using the
\texttt{-x} option:
\begin{verbatim}
$ sh -x script
\end{verbatim}%$

\section{Common Mistakes}
\label{sec:common-mistakes}

I see many people making the same mistakes.  This is partly because
the shell differs from other programming languages, and partly
because of missing lectures or being late in the laboratory!
\verb!:-)!
\begin{description}
\item[Spaces are important!] The shell cares about spaces much more
  than other programming languages.  This is because it does so many
  different things; if you put
\begin{verbatim}
i;
\end{verbatim}
  in a C program, it is just an expression that is evaluated, the
  result is thrown away, and nothing happens.  The shell, on the
  other hand, will look for a program by the name \texttt{i}, and
  execute it.  The shell breaks things up into separate tokens at
  white space.  Where you put spaces does matter.
  \begin{description}
  \item[Don't put spaces in assignments:] An assignment is a single
    thing.
    If you put spaces in it, the shell will try to execute a program
    by the name of the variable you are trying to assign to!
\begin{verbatim}
i =20
bash: i: command not found
\end{verbatim}
  \item[\texttt{expr} needs spaces:] You need to put spaces between
    the operands of the external program \texttt{expr}.  Without the
    spaces, \texttt{expr} sees the single operand \texttt{0+1} and
    simply prints it back, so no arithmetic is done:
\begin{verbatim}
i=0
i=`expr $i+1`
echo $i
0+1
\end{verbatim}
  \item[Put spaces around the \texttt{[ ... ]}] See the note in
    section~\vref{sec:test}.
  \end{description}
\item[Use meaningful variable names:] I have seen people get confused
  about variables such as \texttt{\$1}, \texttt{\$2}.  Assign them to
  meaningful names, and you won't get so confused.  Use what your
  other lecturers taught you about good programming practice!
\end{description}

%\clearpage
\section{Questions}
\label{sec:questions}

Make all these scripts executable programs on your \texttt{PATH}.

\begin{enumerate}
\item Write a simple shell script that takes any number of arguments
  on the command line, and prints the arguments with ``Hello '' in
  front.  For example, if the name of the script is \texttt{hello},
  then you should be able to run it like this:
\begin{verbatim}
$ hello Nick Urbanik
Hello Nick Urbanik
$ hello Edmund
Hello Edmund
\end{verbatim}

\item Write a simple shell script that takes two numbers as
  parameters and uses a \texttt{while} loop to print all the numbers
  from the first to the second inclusive, each number separated only
  by a space from the previous number.  For example, if the script is
  called \texttt{jot}, then:
\begin{verbatim}
$ jot 2 8
2 3 4 5 6 7 8
\end{verbatim}%$

\item Suppose that the script you wrote for the previous question is
  called \texttt{jot}.  Then run it calling \texttt{sh} yourself.
  Notice the difference:
\begin{verbatim}
sh jot 2 5
sh -v jot 2 5
sh -x jot 2 5
\end{verbatim}
  Do you notice any difference in the output from the last two?

\item Write a shell script that, for each \texttt{.rpm} file in the
  current directory, prints the name of the package on a line by
  itself, then runs \texttt{rpm -K} on the package, then prints a
  blank line, using a \texttt{for} loop.

  Mount the server
  \texttt{ictlab\allowbreak.tyict\allowbreak.vtc\allowbreak.edu\allowbreak%
    .hk:\allowbreak/var\allowbreak/ftp\allowbreak/pub} on a
  convenient directory on your machine, such as \texttt{/mnt/ftp}.
  Test your script on the files in
  \texttt{/mnt\allowbreak/ftp\allowbreak/rh-7.2-updated\allowbreak%
    /RedHat\allowbreak/RPMS}\@.

  \begin{explanation}
    The option \texttt{rpm -K} chec\textbf{k}s that the software
    package is not corrupted, and is signed by the author (if you
    have imported the author's public key in your \texttt{gpg}
    setup).
  \end{explanation}

\item Modify the script you wrote for the previous question to print
  the output of \texttt{rpm -K} for \emph{all} the files that fail
  the test, and \emph{only} for those files.  In particular, if the
  package's \acro{GPG} signature fails, then your script should
  display the output of \texttt{rpm -K}\@.  There are at least two
  packages in this directory which do not have a valid \acro{GPG}
  signature; one of them is \texttt{redhat-release-7.2-1.noarch.rpm};
  what is the other?

  Here is output from \texttt{rpm -K} for two packages, one with no
  \acro{GPG} signature, the other with:
\begin{verbatim}
$ rpm -K redhat-release-7.2-1.noarch.rpm bash-2.05-8.i386.rpm
redhat-release-7.2-1.noarch.rpm: md5 OK
bash-2.05-8.i386.rpm: md5 gpg OK
\end{verbatim}%$
  Test it in the same network directory as for the previous question.

\item Write a shell script to add a local group called
  \texttt{administrator} if it does not already exist.  Do not
  execute any external program if the \texttt{administrator} group
  already exists.

\item Download a copy of the bogus student registration data from
  \url{http://ictlab.tyict.vtc.edu.hk/snm/lab/regular-expressions/artificial-student-data.txt}.
  Use this for the following exercises, together with the
  \texttt{grep} program:
  \begin{enumerate}
  \item Search for all students with the name ``CHAN''.
  \item Search for all students whose student number begins and ends
    with 9, and with any other digits in between.
  \item Search for all student records where the Hong Kong ID has a
    letter, not a number, in the parentheses.
  \item If you have time, you may do the same exercises, but display
    only the students' names, or student numbers.
  \end{enumerate}

\item Write a shell script to take a file name on its command line,
  and edit it with \texttt{sed} so that every instance of
  ``\texttt{/usr/local/bin}'' is changed to ``\texttt{/usr/bin}''.

\item Write a shell script to take a file name on its command line,
  and edit it using \texttt{sed} so that every line that begins with
  the string \texttt{server}:
\begin{alltt}
server \emph{other text}
\end{alltt}
  is edited so that everything after ``\texttt{server~}'' (i.e., the
  ``\texttt{\emph{other text}}'') is replaced with the string
  ``\texttt{clock.tyict.vtc.edu.hk}'', so that the line above looks
  like this:
\begin{alltt}
server clock.tyict.vtc.edu.hk
\end{alltt}
  Test this on a copy of the file \texttt{/etc/ntp.conf} that is on
  your computer.  (Install the package \texttt{ntp} if it is not
  there.)
\end{enumerate}

\end{document}