Erotic Fantasy: /bin/sh Programming

Chris Keane





Introduction

/bin/sh. The Bourne Shell. The Bell Shell. The completely retarded pedantic bastard shell from hell. Call it what you will, no matter what Unix system you're using, it's bound to be there. So here's a brief section on how you can program it in its own built in scripting language.

Believe it or not, sh scripts are really really useful things. The basic idea is that a shell script is a plain ASCII file, containing a sequence of commands you'd ordinarily type in on the command line. However, if the list of commands needed to perform some function is quite long or complex, it makes sense to type them once in a script and run the script each time after that (which is also a hell of a lot faster than typing 1000 lines every time).

In addition to commands that you would type in at the command prompt, there are other features, such as flow control statements, usually only found in high level languages. This facility makes it easy to perform New And Exciting And Powerful tasks.



Shell Programming and Debugging

Input a Program

A shell program is a straight ASCII file containing a series of shell commands. It is input by any means that one would input text into the Unix system---vi, cat, ed for normal people and emacs or aXe for masochists.

Theoretically, a shell script should start with the magic incantation #!/bin/sh on the first line, hard up against the left margin. In practice, this is generally not required, as most modern systems assume a program is a shell script if it is ASCII text and executable. One thing to note is that you shouldn't have a hash (#) by itself in the first position of the first line, as this is an incantation to invoke the C-shell (csh) rather than the Hell Shell (sh). Programming csh is not really discussed in this manual.

To demonstrate a shell script, we'll input a program that displays the date and time. (Use vi, the ProgSoc editor of choice, to enter the code below. Then use the UNIX command cat to check everything was input correctly.)

#!/bin/sh
date



Making the program executable

When a text editor creates a file in Unix, the file's permissions are determined by your umask. With the usual default umask, a new file is readable and writable by you, but not executable. To make your shell script executable, add execute permission to your script file using chmod.

foo% ls -l d
-rw-r--r--    1    chris    15    Jan  4  15:20 d
foo% chmod +x d
foo% ls -l d
-rwxr-xr-x    1    chris    15    Jan  4  15:20 d
foo%



Running the program

Once the script has been successfully entered and made executable, it can be run. There are two ways to execute a shell script. The first and easiest is to type in its name.

foo% d
Mon Jan  4 15:21:56 EST 1993
foo%

An alternative is to run the script through /bin/sh as a direct interpreter.

foo% sh d     
Mon Jan  4 15:21:56 EST 1993

or

foo% sh < d 
Mon Jan  4 15:21:56 EST 1993

Using sh has the same result, but d is treated as a simple ASCII file read by sh rather than as a program in itself.



Debugging shell programs

/bin/sh provides facilities to trace the execution of shell scripts. These are the -x and -v flags to the /bin/sh program. Using the earlier example of d, we can apply the -x option to examine the steps passed through by the executing program.

foo% cat d
#!/bin/sh
date
foo% sh -x d
+ date
Mon Jan  4 15:21:56 EST 1993
foo%

Before each command in the script is executed, it is printed with a preceding + symbol. The program code below (another program called d) is executed as an example:

foo% cat d
#!/bin/sh
date
echo foobie doobie
who | grep chris
echo `hostname`
foo% d    
Mon Jan  4 15:22:30 EST 1993
foobie doobie
chris    console Jan  4 15:19
chris    ttyp0   Jan  4 15:20
chris    ttyp1   Jan  4 15:20
chris    ttyp2   Jan  4 15:20
foo
foo% sh -x d
+ date 
Mon Jan  4 15:22:30 EST 1993
+ echo foobie doobie
foobie doobie
+ who | grep chris
chris    console Jan  4 15:19
chris    ttyp0   Jan  4 15:20
chris    ttyp1   Jan  4 15:20
chris    ttyp2   Jan  4 15:20
+ echo foo
foo

The last echo command comes out as ``+ echo foo'' rather than ``+ echo `hostname`'' because `hostname` is enclosed in backquotes, which have a special meaning to shell (to be explained a little further on).

If the -v option is used instead of the -x option, each line of the script is printed exactly as it is read, before any rewriting and without the leading + symbol. In the above example, ``echo `hostname`'' would have appeared instead of ``+ echo foo''.
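The two flags can be compared side by side. The following is only a sketch: the temporary file name and the variable names XTRACE and VTRACE are made up for illustration, and the exact layout of the trace differs a little between versions of sh. It writes a one-line script, runs it under each flag, and captures the trace (which the shell writes to stderr).

```shell
#!/bin/sh
# Hypothetical demo file; $$ (the process ID) keeps the name unique.
SCRIPT=/tmp/tracedemo$$

# Quoting the marker ('EOF') stops the shell rewriting the text
# on its way into the file.
cat > "$SCRIPT" << 'EOF'
echo `echo inner`
EOF

# Trace output is written to stderr, so 2>&1 folds it into the capture.
XTRACE=`sh -x "$SCRIPT" 2>&1`   # commands shown after rewriting
VTRACE=`sh -v "$SCRIPT" 2>&1`   # lines shown exactly as read

echo "--- sh -x ---"
echo "$XTRACE"
echo "--- sh -v ---"
echo "$VTRACE"
rm -f "$SCRIPT"
```

The -x half should include a line reading ``+ echo inner'', while the -v half shows the backquotes untouched.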



Standard I/O And Redirection

Streams of Characters

Under Unix all files, whether ASCII or binary, are treated as simple streams of bytes. There is no structure imposed on the stream other than that imposed by application programs reading data. As a result, all files may generally be treated the same for file redirection etc.



File redirection

Because files are simple streams of characters, it becomes a trivial matter to read their contents into programs or for programs to write information to files. Overall, there are three facilities for information interchange. These are called ``standard in'' (stdin), ``standard out'' (stdout) and ``standard error'' (stderr). In addition, other files may be opened and closed by the shell script.

stdin
Under normal circumstances, stdin is the keyboard. This is generically how a shell script receives data input from the user and/or a file. Stdin can be redirected in two main ways:

<
(arrowhead in) usage:

cmd < file
The arrowhead-in (less than) symbol used in the above fashion causes the contents of file to be input to the command cmd as if they had been typed on the keyboard. Normal keyboard activity is ignored.

<<
(arrowhead up-'til) usage:

cmd << STRING
text
text
STRING

The arrowhead up-'til (double less than) symbol is used to divert stdin in a similar fashion to arrowhead in, except that the input is the text on the lines following the command. Input finishes when a line containing only STRING is reached: this can be any arbitrary string of characters with meaning to you. e.g.

foo% mail chris@foo << EOF
Hi there chris! 
Foobie Doobie!
EOF
foo%

stdout
Usually, stdout is the screen to which output is printed. Similar to stdin, however, there are a number of ways to redirect this output to files etc.

>
(arrowhead out) usage:

cmd > file
All stdout output from the execution of cmd is written to a new file called file. If file exists, it is overwritten with the output of cmd.

>>
(arrowhead append) usage:

cmd >> file
Similar to ``arrowhead out'', arrowhead append causes all output from the execution of cmd to be written to a file called file. However, if this file already exists, the output is appended to the end of the file rather than overwriting it.

stderr
Programs with internal errors will generally attempt to write error messages to stderr. Like stdout, this usually appears on the screen, but it is separate from stdout so error messages can be seen even if stdout is redirected. Alternatively, stderr can be redirected to the same file as stdout, or another file again.

2>    (file arrowhead out)
2>&1  (arrowhead ampersand)
usage:
cmd 2> errfile
cmd > file 2>&1

To understand this, you need to know that internally the /bin/sh has special file descriptors for stdin, stdout and stderr. These are 0, 1 and 2 respectively. The ``file arrowhead out'' symbol places any output from the given descriptor---in this case stderr, descriptor 2--- into the named file. Note that

cmd 2>> errfile

is also legal if you wish to append to a file.

The ``arrowhead ampersand'' symbol assigns to file descriptor 2 (stderr) the same output file as file descriptor 1 (stdout).

other files
As mentioned above, stdin, stdout and stderr are assigned file descriptors 0, 1 and 2 respectively. On a traditional Unix system, individual processes can hold up to 20 files open simultaneously (modern systems usually allow more). This leaves descriptors 3-19 available for use on an ad hoc basis, although /bin/sh is only guaranteed to accept the single-digit descriptors 3-9 in a redirection. Files may be opened with the exec command. Usage:

exec 3> file3
exec 4< file4
and then used as described in the `stderr' section above. e.g:
cmd 1>&3                To output to file3
cmd 0<&4            To input from file4

pipes
The concept of pipes is similar to file redirection. But instead of the output of one command being placed in a file, it is connected to the input of another process (which may in turn have its output connected to another process). E.g. ls | sort takes output from ls and feeds it to sort. Result: a sorted directory list.

Pipes are discussed in the UNIX section of this manual.
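The redirection symbols above can be tried out together in one small script. This is a sketch only: the scratch file names and the variable names are invented for the example, and the ``>&-'' form used to close a descriptor is standard sh syntax that is not covered above.

```shell
#!/bin/sh
# Hypothetical scratch files, named with $$ so they are unique to this run.
OUT=/tmp/redir_out$$
ERR=/tmp/redir_err$$

# > creates (or overwrites) the file; >> appends to it.
echo "first line" > "$OUT"
echo "second line" >> "$OUT"

# << feeds the lines up to the marker to the command's stdin.
cat >> "$OUT" << EOF
third line
EOF

# 2> captures only stderr; ls complains there about the missing file.
ls /nonexistent/file$$ 2> "$ERR"
ERRMSG=`cat < "$ERR"`

# exec opens descriptor 3 for appending and descriptor 4 for reading.
exec 3>> "$OUT"
echo "fourth line" 1>&3
exec 4< "$OUT"
read FIRST 0<&4

# Close both descriptors again now that we are done with them.
exec 3>&- 4<&-

# A pipe connects stdout of one command to stdin of the next.
COUNT=`cat "$OUT" | wc -l`

echo "first line read back: $FIRST"
echo "lines written:" $COUNT
echo "error message captured: $ERRMSG"
rm -f "$OUT" "$ERR"
```

The wording of the captured error message depends on your version of ls; the point is that it ended up in the file rather than on the screen.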



Devices are files

One concept that may seem strange is that devices under Unix are also treated as ordinary files at the shell level. As a result, you can use standard redirection techniques such as those described above to read from and write to devices. Since quite a lot of this type of behaviour is likely considered quite naughty, and will probably get you a good solid spanking, this topic won't be further expounded on. So don't blame me when you get caught sending banners to other logged on users. Whoops.



Useful Unix Commands

When writing shell scripts, it's useful to know what's in your toolkit. This section gives a brief description of the utilities that get the biggest workout in shell scripts. Each of the commands mentioned here has a manual page entry---read them. (man command-name.)

awk
Awk is big and powerful. It does lots of things. No-one really understands it, including (it is rumoured) the authors. However, for those initiated into its arcane ways, awk is a batch spreadsheet, a programming language, a calculator, a string match and replacer and other things that are more rarely used. There are whole manuals devoted to awk. Read one.

bc
Bench calculator. bc is a really fabbo program that provides maths support in the shell. There's another maths program called expr, but it doesn't support floating point math. bc can be used in interactive mode or as part of a script. For interactive,

foo% bc
2 + 3
5
^D
foo%

for batch processing,

foo% echo '2 + 3' | bc
5
foo%

The result of the expression is written to stdout, and so can be redirected, assigned to a variable or piped to another application.

cat
Concatenate files. cat takes command line arguments of one or several files and prints them onto stdout one after another (thus concatenating them). If there are no files specified on the command line, cat reads stdin and writes it to stdout.

foo% cat file1
Hi! I'm file1!
foo% cat file2
Hi! I'm file2!
foo% cat file1 file2
Hi! I'm file1!
Hi! I'm file2!
foo% cat < file1
Hi! I'm file1!
foo%

date
Prints the current date and time, taken from the system. There are quite a lot of ways to format the date. The default format is along the lines of ``Mon Jan 4 18:32:46 EST 1993''. However, this may be modified with command line arguments (e.g. date +%y%m%d results in ``930104'').

diff
Show every line that differs between two files. No output from diff tells you that the two files specified on the command line are identical.

echo
Echo arguments to stdout. echo simply prints whatever is on its command line to stdout. There are a number of options - the most used is the -n option that doesn't put a new-line at the end of the output, so the next output goes on the same line. However, if leading or trailing white-space is required, the required output must be enclosed in quotes or the white-space will be discarded.

foo% echo hello
hello
foo% cat > /tmp/qwerty
echo -n "hello "
echo there
^D
foo% chmod +x /tmp/qwerty
foo% /tmp/qwerty
hello there
foo%

grep
No-one knows what grep stands for. Some theorise global search and replace. (The usual explanation is the ed command g/re/p: globally search for a regular expression and print.) grep searches for instances of strings in one or more files (if files are specified on the command line) or from stdin (if no files are specified). e.g.

foo% grep tree /usr/dict/words
Peachtree
rooftree
street
streetcar
tree
treetop
foo%
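Since all of these utilities write to stdout, their results can be captured for later use in a script. A rough sketch follows, with invented variable names. It uses expr rather than bc for the sums (integers only, as noted above) and printf, a standard utility not otherwise covered here, to make up some input for grep.

```shell
#!/bin/sh
# expr takes each number and operator as a separate argument, and *
# must be protected from the shell with a backslash.
SUM=`expr 2 + 3`
PRODUCT=`expr 4 \* 5`
QUOTIENT=`expr 10 / 4`      # integer division: the fraction is dropped

echo "2 + 3 = $SUM, 4 * 5 = $PRODUCT, 10 / 4 = $QUOTIENT"

# date with a format: two digits each for year, month and day.
STAMP=`date +%y%m%d`
echo "today is $STAMP"

# grep reads stdin when no file is named on its command line.
MATCHES=`printf 'street\nroad\ntreetop\n' | grep tree`
echo "$MATCHES"
```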



Protecting and Processing

Quoting and Rewriting

In the /bin/sh, there is a small subset of characters that are ``reserved'', that is, they have special meaning to the shell. For example, the reserved characters include >, <, | etc. for file redirection and pipes and *, ?, [, -, ] for pattern matching. This makes it difficult to use these characters literally on the command line of some command, for example:

foo% echo old -> new
foo% ls -l new
-rw-r--r--    1    chris    6  Jan 28 16:08 new
foo% cat new
old -
foo%

The use of echo in the above fashion creates a new file and writes the string ``old -'' into it---the ``>'' character opens the file as stdout.

There are a number of ways to protect reserved characters so you get what you really want.

[ \ ] Bash (Back slash). Bash protects the character immediately following it from being interpreted by the shell.

foo% echo old -\> new
old -> new
foo%

So what if you want to print out a bash? Putting it on the command line results in it protecting the following character, rather than being printed itself. The answer is to protect the bash with another bash.

foo% echo old =\= new
old == new
foo% echo old =\\= new
old =\= new
foo%

[ " ] Quotes (double quotes). Double quotes protect every reserved character in the string they surround except back-quote (see below), the variable indicator ``$'' and the bash itself.

foo% echo "old -> new"
old -> new
foo% ls "*"
* not found
foo%

[ ' ] Quote (forward quote). The single quote provides the greatest protection against rewriting and reserved characters. All reserved characters, including double quotes, are protected by single quotes.

[ ` ] Back Quote (backwards quote). The backquote is a rather special symbol. In /bin/sh, unless the backquote is protected by one of the valid means above, the contents of the string contained between the backquotes will be executed and the output from that execution written in place of the backquoted string.

foo% echo "You are currently logged into host `hostname`"
You are currently logged into host foo
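The four kinds of protection are easiest to compare when they are applied to similar strings. A small sketch, using made-up variable names:

```shell
#!/bin/sh
NAME=world

# A backslash protects only the single character that follows it.
BACKSLASH=`echo old -\> new`

# Inside double quotes, > is protected but $ and backquotes still work.
DOUBLE="hello $NAME -> `echo there`"

# Inside single quotes, nothing at all is rewritten.
SINGLE='hello $NAME -> `echo there`'

echo "$BACKSLASH"
echo "$DOUBLE"
echo "$SINGLE"
```

The first two lines of output have had their rewriting done; the third comes out exactly as it was typed between the single quotes.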



Shell Variables

Positional Parameters

Most Unix commands take command line arguments to modify their behaviour. For example, supplying the -l argument to the ls command results in a ``long'' listing of files in the current directory.

In /bin/sh, command line arguments are passed to shell script as shell variables. Shell variables are similar to variables in other higher-level languages, but do not need to be declared before use.

From inside the shell script, the command line arguments may be referenced by a number corresponding to their position on the command line. For example, the first command line argument would be $1, the second $2 etc.

foo% cat greet
#!/bin/sh
echo hello $1
foo% greet Chris
hello Chris
foo%



The Shift Command

With the positional parameters described above, it is possible to ``shift'' them down, so that parameters already recognised and processed are discarded and the next parameter takes their place. For example, $1 is discarded, $2 becomes $1, $3 becomes $2 etc.

foo% cat greet
#!/bin/sh
echo hello $1
shift
echo hello $1
foo% greet Chris Kylie
hello Chris
hello Kylie
foo%

The shift command is particularly useful for looping commands to process the entire command line painlessly.
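A loop of that kind might look like the sketch below. It borrows the while construct and the $# variable, both described later in this chapter. The ``set --'' line is only there to fake a command line so the example can be run as it stands; in a real script the arguments would come from whoever ran it.

```shell
#!/bin/sh
# Stand-in for real command line arguments (names are examples only).
set -- Chris Kylie "Sir Galahad"

GREETED=0
# $# is the number of arguments left; each shift reduces it by one.
while [ $# -gt 0 ]
do
    echo "hello $1"
    GREETED=`expr $GREETED + 1`
    shift
done
echo "$GREETED people greeted"
```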



Using shell variables

Shell variables are labels for string storage in /bin/sh. There is no distinction between variables that contain a single letter, multiple letters or one that contains a numeric value. In /bin/sh, all variables are treated as containing zero or more characters.

Before a value is assigned to a variable, it exists but contains zero characters. Variables may be introduced and used at any stage of the shell script and there is no need (indeed, there is no facility) to declare them before use. Traditionally, shell variable names are in all upper case, but there is no technical reason for this other than readability.

Wherever they appear in a shell script, an instance of a variable name is replaced by that variable's contents. If a variable contains an executable command, that command will be run (if the variable name is in a position that makes this possible).

foo% cat greet
#!/bin/sh
HELLO="Hello $1"
echo $HELLO
shift
HELLO="$HELLO, $1"
echo $HELLO
foo% greet Chris Kylie
Hello Chris
Hello Chris, Kylie
foo%



The Export Command

Out there in the great ether of things hanging around your account is the ``environment''. You can look at what's in your environment with the env command. You can add things to your environment using the export command from within your script.

Your environment is passed as a series of shell variables to all programs you run, that is, to all the child processes of the current process. It is impossible, however, for a child process to change the environment of its parent process. If you export a variable from within your shell script, it is passed to all programs that you run from that shell script, but is not passed back to the process that originally ran the script.
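This one-way traffic can be checked with a short experiment. In the sketch below the variable names are invented, and ``sh -c'' (which runs the string that follows it as a child shell) stands in for any program run from the script.

```shell
#!/bin/sh
LOCAL_ONLY="stays here"
SHARED="travels to children"
export SHARED

# Single quotes keep the variables unexpanded until the child sh runs.
CHILD_SEES=`sh -c 'echo "[$LOCAL_ONLY] [$SHARED]"'`
echo "child sees: $CHILD_SEES"

# The child changes its own copy; the parent's copy is untouched.
sh -c 'SHARED="changed by child"; export SHARED'
echo "parent still has: $SHARED"
```

The child sees an empty pair of brackets for LOCAL_ONLY, because that variable was never exported.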



Variables Set Automatically

These shell variables are automagically set by /bin/sh when a shell script is executed.

$*
All command line arguments
foo% cat argv
#!/bin/sh
echo $*
foo% argv 1 2 3 4
1 2 3 4

$#
The number of arguments on the command line.
foo% cat argc
#!/bin/sh
echo $#
foo% argc 1 2 3 4
4
foo%

$?
Return status. When a process finishes execution, it normally returns a value indicating the success or failure of its function. Zero is normally used for success, any other number for failure.

$$
Process ID. The unique Unix process ID for the current process. This is ideal for generating temporary file names, since no two processes running at the same time have the same process ID.

foo% cat mktmpf
#!/bin/sh
TMP=/tmp/temp$$
echo $TMP
foo% mktmpf
/tmp/temp23106
foo%

$-
Options. Carries a list of shell and set options that are current.

$!
Background ID. The variable contains the process ID of the last background process run by the current shell.
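The status and process ID variables can be watched in action as follows. This is a sketch with invented variable names; the trailing & (which runs a command in the background) and the wait built-in are standard sh features not explained in this chapter.

```shell
#!/bin/sh
true
OK_STATUS=$?          # 0: the previous command succeeded
false
FAIL_STATUS=$?        # non-zero: the previous command failed

sleep 1 &             # & runs the command in the background
BG_PID=$!             # process ID of that background job
wait $BG_PID          # block until the background job finishes
WAIT_STATUS=$?        # wait passes on the job's return status

echo "true gave $OK_STATUS, false gave $FAIL_STATUS"
echo "this shell is process $$, the background job was process $BG_PID"
echo "background job finished with status $WAIT_STATUS"
```

Note that $? must be saved straight away, since every command that runs afterwards overwrites it.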



More About Variables

Because variables aren't declared, it's often difficult for /bin/sh to understand what you're really talking about. In addition, it is possible to provide alternate actions depending on whether a variable is set or not.

${VAR}
This is an alternate way of referencing variables that clearly defines where the name of the variable begins and ends. Whereas

foo% echo $HOMEiswheretheheartis
foo%

is horribly confusing for both humans and /bin/sh (which would look for a variable called HOMEiswheretheheartis),

foo% echo ${HOME}iswheretheheartis
/system/usr/chrisiswheretheheartis
foo%

is much more intelligible.

${VAR:-string}
If ${VAR} has not been set, or is set but empty, return the value of string, but do not set ${VAR}.

${VAR:=string}
If ${VAR} has not been set, or is set but empty, return the value of string, and set ${VAR} to contain the value of string.

${VAR:?string}
Print the value of string as an error message and stop the script if ${VAR} has not been set or is empty.

foo% cat thingy
#!/bin/sh
HOME=
echo ${HOME:?"No HOME directory set"}
foo% thingy
sh: HOME: No HOME directory set
foo%
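The difference between the :- and := forms only shows up when you look at the variable afterwards. A small sketch, using invented variable names that start out empty in the same way HOME does in the example above:

```shell
#!/bin/sh
# FIRST_CHOICE and SECOND_CHOICE are hypothetical names, empty to start.
FIRST_CHOICE=
SECOND_CHOICE=

A=${FIRST_CHOICE:-vi}      # yields vi; FIRST_CHOICE stays empty
B=${SECOND_CHOICE:=vi}     # yields vi and stores it in SECOND_CHOICE

echo "A=$A FIRST_CHOICE=[$FIRST_CHOICE]"
echo "B=$B SECOND_CHOICE=[$SECOND_CHOICE]"

# Once a variable holds a value, the fallback string is ignored.
FIRST_CHOICE=ed
C=${FIRST_CHOICE:-vi}
echo "C=$C"
```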



Variable Substitution Via Backquotes

As discussed previously, backquotes run the command they surround and rewrite the surrounded command with its output. This may be assigned to variables, as well.

foo% cat host
#!/bin/sh
HOSTNAME="You are currently logged into `hostname`"
echo $HOSTNAME
foo% host
You are currently logged into foo
foo%



Variable Substitution from Keyboard

On occasion, it is necessary to read input from the keyboard or stdin into a shell variable. This can be done with the use of the read command, which is built into /bin/sh.

foo% cat prompt
#!/bin/sh
echo -n "What is your name? "
read NAME
echo Hello, $NAME
foo% prompt
What is your name? Arthur, King of the Britons
Hello, Arthur, King of the Britons
foo%

The read command will assign the string that is read from stdin to the variable given on its command line. If this works, read returns zero (success) status. If some fault occurs---the user types CTRL-D or the end of file is reached if stdin is being redirected from a file---then read returns non-zero status.
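The return status is easy to see when stdin comes from a file with only one line in it. The sketch below uses a made-up scratch file and variable names, and opens the file on descriptor 3 with exec (as described under ``other files'') so that both reads work through the same file.

```shell
#!/bin/sh
# Hypothetical input file, so the example does not wait on the keyboard.
INPUT=/tmp/readdemo$$
echo "Arthur, King of the Britons" > "$INPUT"

exec 3< "$INPUT"        # open the file once on descriptor 3
read NAME 0<&3          # first read gets the only line
FIRST_STATUS=$?
read NOTHING 0<&3       # second read hits end of file
SECOND_STATUS=$?
exec 3<&-               # close descriptor 3 again

echo "Hello, $NAME (status $FIRST_STATUS)"
echo "second read returned status $SECOND_STATUS"
rm -f "$INPUT"
```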



Control Flow

/bin/sh has some flow control constructs built in, similar to high level languages, but slightly different to take account of the string nature of /bin/sh variables and the return status of programs. There are four available constructs: if, while, for and case.

if list1 then list2 elif list3 then list4 else list5 fi

The if construct executes the commands in list1. If the last of them returns a zero value (i.e. success), the commands in list2 are executed in sequence. If list1 returned a non-zero value, list3 (if it exists) is executed; if this returns a zero value, the commands in list4 are executed. If neither list1 nor list3 returns a zero status, list5 is executed if it exists. The elif and else parts are both optional.

A very useful program to use as part of list1 is the test or [ program. Test performs a large number of functions such as string matching and file existence testing. man test gives you the full run-down.

foo% cat query 
#!/bin/sh

echo -n "What is your name? "
read NAME
if [ "$NAME" = "Arthur" ]
then
    echo "Hello!"
else
    echo "Who the hell are you?"
fi
foo% query
What is your name? Arthur
Hello!
foo% query
What is your name? Sir Galahad
Who the hell are you?
foo%

foo% cat rl
#!/bin/sh
echo -n "Hostname to rlogin to? "
read HOST
if [ "`hostname`" = "$HOST" ]
then
    echo "You are already logged into $HOST!"
    echo "Swine!"
else
    echo "rlogin to $HOST"
    rlogin $HOST
fi
foo% rl
Hostname to rlogin to? foo
You are already logged into foo!
Swine!
foo% rl
Hostname to rlogin to? biggles
rlogin to biggles
Password: ......

In the above examples, the [ is actually a program that lives in /usr/bin (many versions of /bin/sh also have it built in). It has the same functionality as the test program, but looks a lot nicer. Because it is a program, it requires the same spacing as would normally be expected: each of its arguments, including the closing ], must be separated by spaces.

The arguments to the [ program are "$NAME", =, "Arthur" and ]. $NAME is surrounded by double quotes in case the string that $NAME represents contains a space character, which would severely stuff up the command line arguments.
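As a taste of the file existence testing that test can do, here is a sketch using the -d (is a directory), -f (is an ordinary file) and -z (is an empty string) operators. The variable names are invented, and / is used as the target simply because every system has one.

```shell
#!/bin/sh
TARGET=/

# Each elif needs its own then, just like the opening if.
if [ -d "$TARGET" ]
then
    KIND=directory
elif [ -f "$TARGET" ]
then
    KIND="ordinary file"
else
    KIND="missing or something unusual"
fi
echo "$TARGET is: $KIND"

# test and [ are interchangeable; -z is true for an empty string.
EMPTY=
if test -z "$EMPTY"
then
    VERDICT="EMPTY has no characters in it"
fi
echo "$VERDICT"
```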

while list1 do list2 done

While will continue to execute both lists of commands until list1 fails. This is useful with the read command: read returns a zero value and an initialized shell variable upon success, and a non-zero value upon CTRL-D or end of file.

foo% cat foobie
#!/bin/sh
while read FOOBIE
do
    echo $FOOBIE
done
foo% foobie
Repeat after me
Repeat after me
hello
hello
Stop it!
Stop it!
^D
foo%

There is a rarely used construct similar to while---the until construct. The difference is that both list1 and list2 are executed until list1 succeeds.
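For the curious, a minimal until loop might look like this. It is a sketch only; COUNT is an invented variable name, and expr does the adding up.

```shell
#!/bin/sh
COUNT=0
# Keep going until the test succeeds, i.e. until COUNT reaches 3.
until [ $COUNT -ge 3 ]
do
    COUNT=`expr $COUNT + 1`
    echo "pass $COUNT"
done
# The same loop could be written as: while [ $COUNT -lt 3 ]
```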

for VAR in word-list do command-list done

Unlike most high level languages, the for loop in /bin/sh does not increment a counter or whatever, but loops around placing each element of word-list into the shell variable VAR. Word-list may be a normal shell pattern-match, in which case it is replaced by the names of the matching files.

foo% cat thingy
#!/bin/sh
for THINGY in thingy foobie wookly dwang
do
    echo "for $THINGY"
done
foo% thingy
for thingy
for foobie
for wookly
for dwang
foo%

foo% cat doobie
#!/bin/sh
# Changes all files in the current directory
# ending with .out to .foo
TALLY=0
# Start the for loop; sed strips the .out from the end of each
# file name before it gets stuffed into $FILE
for FILE in `ls *.out | sed 's/\.out$//'`
do
    mv ${FILE}.out ${FILE}.foo
    echo moved ${FILE}.out to ${FILE}.foo
    TALLY="`echo $TALLY + 1 | bc`"
done
echo $TALLY files renamed.
foo% doobie
moved goo.out to goo.foo
moved swine.out to swine.foo
moved thing.out to thing.foo
3 files renamed.
foo%
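A pattern-match can also be put straight into the word-list. The sketch below is one possible variation on doobie, not a replacement for it: it works in a made-up scratch directory so no real files are harmed, uses basename to strip the suffix, and uses expr instead of bc to keep the tally. Because the file names never pass through backquotes, names containing spaces survive intact.

```shell
#!/bin/sh
# Hypothetical scratch directory, so no real files are touched.
HERE=`pwd`
DIR=/tmp/fordemo$$
mkdir "$DIR"
cd "$DIR"
echo a > goo.out
echo b > "two words.out"

TALLY=0
# The shell replaces *.out with the matching file names.
for FILE in *.out
do
    BASE=`basename "$FILE" .out`    # strip the .out suffix
    mv "$FILE" "$BASE.foo"
    echo "moved $FILE to $BASE.foo"
    TALLY=`expr $TALLY + 1`
done
echo "$TALLY files renamed."

# Note what is left, then tidy up and go back where we started.
RESULT=`ls`
cd "$HERE"
rm -r "$DIR"
```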

case word in pattern) list1;; ... ;; esac

The case construct matches the supplied word against the list of possible patterns. It proceeds in order through the list of patterns supplied and executes the list associated with the first matching pattern. The patterns/lists are separated by a double semi-colon. The pattern may be a literal string, which must match the supplied word exactly, or it may be a wildcard that will match a range of words.

foo% cat args
#!/bin/sh

while [ -n "$1" ]
do
    case $1 in
    -a)     OPTA=true
        ;;
    -b)     shift; 
        BFILE=$1
        OPTB=true
        ;;
    -c|-d)     OPTCD=true
        ;;
    *)    echo Option $1 unknown 1>&2
        exit 1
        ;;
    esac
    shift
done

args is a program that accepts the command line arguments -a, -b, -c and -d. If argument -b is supplied, an additional argument ($BFILE) is expected.
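Wildcard patterns in a case work the same way as file name patterns. As a sketch, the loop below sorts some made-up answers to a yes/no question; the variable names are invented for the example.

```shell
#!/bin/sh
SUMMARY=
for ANSWER in yes Yup n NOPE maybe
do
    case $ANSWER in
    y*|Y*)  VERDICT=yes         # anything starting with y or Y
            ;;
    [nN]*)  VERDICT=no          # anything starting with n or N
            ;;
    *)      VERDICT=unclear     # everything else
            ;;
    esac
    echo "$ANSWER -> $VERDICT"
    SUMMARY="$SUMMARY $VERDICT"
done
echo "verdicts:$SUMMARY"
```

Since the first matching pattern wins, the catch-all * must come last.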



Built-in Commands

/bin/sh has a number of built-in commands, some of which have been discussed previously. Other interesting ones may be looked up in the manual page for /bin/sh. (man sh.)

break, continue, cd, eval, exit, export, read, readonly, 
set, shift, test, trap, wait



Disclaimer

Some of the shell script example programs used in this chapter are based on similar examples in "The Unix Shell Programming Language", a book by Manis and Meyer.



About this document ...

This document was generated using the LaTeX2HTML translator Version 95.1 (Fri Jan 20 1995) Copyright © 1993, 1994, Nikos Drakos, Computer Based Learning Unit, University of Leeds.

The command line arguments were:
latex2html shellprog.tex.

The translation was initiated by Piers Edmund Johnson on Tue Apr 9 15:28:04 EST 1996

