The Bourne Shell Syntax
The next few sections are all based on the syntax
rules for the Bourne shell as listed in the man pages for
my system. Compare these rules with those of your system. There
may be slight differences where the specification of the sh is
loosely defined. Where there are these minor differences, experiment
for yourself to validate that what is written agrees with what
happens. Trust no-one and test everything until you are happy
you fully understand each point. Only then can you begin to use
the sh to your advantage. Make the syntax rules second nature.
Let's read what it says.
The Bourne Shell is available in three forms on most systems. These are the standard shell (sh), the restricted shell (rsh) and the job control shell (jsh).
The syntax and usage rules are common across all these types except where noted. In general, the rsh is more secure and forces users to comply with additional rules imposed by the system manager, while the jsh adds some features which aid the control of background processes (batch jobs) from the user's interactive session.
Let's look at the standard man page information and interpret what this represents in some real world examples. First is the SYNOPSIS section, which gives very brief syntax information regarding all versions of the shell. On my system, the list of three lines shows sh and jsh, then /usr/lib/rsh, indicating that the path for the restricted shell is not normally included in the user's path. This is because of an unfortunate conflict between the spelling of rsh and rsh (!), one being the restricted shell, the other being the remote shell command which allows a shell process to be started on a remote system. For instance, you might want to list your home directory on a remote machine without logging in and doing any work on that system. To do this, the command rsh remote_system ls -l, where remote_system is the TCP/IP alias of the remote machine, would be useful.
In square brackets following the command name is a list of flag parameters which modify the way the command behaves. You do not need to use any of these, indeed the square brackets indicate that they are optional, but quite a few are useful on occasion. In common with most man pages, mine lists the flag characters and then says nothing more about them until pages 13 (under SET) and 15 (under INVOCATION), by which time the reader has completely forgotten where they came from. It is never made clear on my system that the SET flags are the same as the ones listed under the SYNOPSIS section. However, there is a hint at the end of SET which says "$1, $2 etc., following the flags, will be treated as input parameters for the shell" - that's your only clue. The flags will be covered within the text where relevant. I won't bother elucidating the DESCRIPTION section as this has been covered in some detail above.
The next bit on my system's man pages is DEFINITIONS, where it tries to explain some very basic facts about key words used in the rest of the document. Some of these definitions are not very clear, and a misinterpretation here can lead to later confusion. Let's try and take these one step at a time.
A blank is a tab or space. What this actually means is - a blank is any chunk of white space between anything that is printable (a character or word). So blank can be several spaces or tabs or a combination of multiples of the two.
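To convince yourself of this, here is a minimal sketch. The function name count_words is my own invention for illustration and is not part of the shell; it simply reports how many words the shell handed to it. However wide the gap between words, the shell should see the same number of them.

```shell
#!/bin/sh
# count_words is a hypothetical helper for this sketch:
# it prints the number of arguments it receives.
count_words() {
    echo $#
}

# Single spaces between the words.
narrow=`count_words one two three`

# Runs of several spaces between the same words
# (tabs are treated the same way).
wide=`count_words one      two         three`

echo "narrow spacing gives [$narrow] words"
echo "wide spacing gives [$wide] words"
```

Both lines should report the same count, because each run of white space acts as one separator.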
A name is a sequence of ASCII letters, digits, or underscores, beginning with a letter or an underscore. Well, almost. What they are really saying here is - these are the rules for a variable name or function name within a shell script program. What has been omitted here is that names are case sensitive, you can mix case within a name (LikeThisOne), and they don't always have to start with a letter or an underscore (See - Parameters ). It is never stated what the length limit is for a name. The limit on my system is 31 characters. Names longer than 31 characters do not give rise to any error messages, but if you have several names which differ only beyond character 31, then the shell will treat them all as the same variable. This can lead to unexpected results. You have been warned.
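A quick way to check the case sensitivity rule on your own system is a sketch like the one below. All the variable names and values are made up for the purpose of the illustration.

```shell
#!/bin/sh
# Three distinct variables: names are case sensitive.
count=1
Count=2
COUNT=3

# Mixed case and a leading underscore are both legal in a name.
LikeThisOne="mixed case"
_private="leading underscore"

echo "count=[$count] Count=[$Count] COUNT=[$COUNT]"
echo "LikeThisOne=[$LikeThisOne] _private=[$_private]"
```

Each of the first three variables should keep its own value, which shows that the shell does not fold case when it looks a name up.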
A parameter is a name, a digit, or any of the characters *, @, #, ?, -, $, and !. So, what's the difference between a name and a parameter exactly? Not much actually, it's all in the usage. If a word follows a command, as in: ls -l word , then word is one of the parameters (or arguments) passed to the ls command. But if the ls command was inside a sh script, then in all likelihood the word would also be a variable name. So a parameter can be a name when passing information into some other command or script. Viewed from inside a script however, the command line arguments appear as a line of positional parameters named by digits in the ordered sequence of arrival (See - Script Example_1.1 ). So a parameter can also be a digit. The other characters listed are special characters which are assigned values at script start up and may be used if required from within a script.
Well, after reading through the above, I am still not sure if this is any clearer. Let's see if an example can help to clarify things a little.
    #!/bin/sh -vx
    #######################################################
    # example_1.1 (c) R.H.Reepe 1996 March 28 Version 1.0 #
    #######################################################
    echo "Script name is [$0]"
    echo "First Parameter is [$1]"
    echo "Second Parameter is [$2]"
    echo "This Process ID is [$$]"
    echo "This Parameter Count is [$#]"
    echo "All Parameters [$@]"
    echo "The FLAGS are [$-]"
If you execute the script shown above with some arguments as shown below, you will get the output on your screen that follows.
    user@system$ example_1.1 fred bill bert

    + echo "Script name is [$0]"
    Script name is [example_1.1]
    + echo "First Parameter is [$1]"
    First Parameter is [fred]
    + echo "Second Parameter is [$2]"
    Second Parameter is [bill]
    + echo "This Process ID is [$$]"
    This Process ID is
    + echo "This Parameter Count is [$#]"
    This Parameter Count is
    + echo "All Parameters [$@]"
    All Parameters [fred bill bert]
    + echo "The FLAGS are [$-]"
    The FLAGS are [vx]
Looking back at the example script, the first line of the file contains a special sequence of characters (#!) which is only interpreted on the first line. Normally the hash character indicates to the shell that this is the start of a comment, and the shell must ignore everything up to the next newline character. On the first line, however, the shell will go on to read the path to the shell executable program and, optionally, some shell flags (See - In The Beginning). I have added the flags -vx here because they are very useful when debugging. The -v flag is the verbose setting (also available part way through a script by using set -v if required), which forces the shell to echo each command it finds in the script as it reads it. This allows you to find which particular line in your code has a syntax error, because output stops at that point and the script exits. The -x flag is similar, except that it puts a plus sign (+) in front of each command as it is processed. This is not quite the same as -v, which will show you the command whether it is processed or not (See the screen output above, which shows output from both -v and -x together). If you process a loop structure, for instance, -v will output the whole construct once as it is seen, but -x will show each pass through the loop too. The path shown on the first line is for the Bourne shell. For the C shell use /usr/bin/csh and for the Korn shell use /usr/bin/ksh.
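The flags do not have to apply to the whole script. As a rough sketch (the loop and the names in it are invented for illustration), the following turns -x on with set -x for one section only and turns it off again with set +x. The trace lines are written to standard error, so they appear on the terminal mixed in with the normal output.

```shell
#!/bin/sh
echo "before tracing"

# From here on, each command is shown with a leading + as it is executed.
set -x
for name in fred bill
do
    echo "hello $name"
done
# Switch tracing off again; this command itself is the last one traced.
set +x

echo "after tracing"
```

Run it and compare the traced section with the two untraced echo commands. You should see the echo inside the loop traced once for each pass, which is the behaviour described above for -x.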
The next three lines of the example script are my default header. See Design Considerations for information on script style, layout and symbol format.
Next is the body of the script, which displays to the terminal (or echoes) some text strings and some values. You will note I have put each variable/parameter/name(!) inside some square brackets. This is a good way of checking for included blank space within a variable's value. I would not expect to see any blanks in any of these variables, but when debugging it's a good idea to check. The first three are positional parameters, which display the words on the command line when the script is executed. The first of these is $0, which is the command (or script) name itself. This is a useful thing to have, as you can use it when outputting errors or building logfiles or audit trails. The real input parameters are available from $1 to $9 inclusive. What if you have more than 9 parameters? Well, there is a shift feature, which we will cover later (See - Special Commands ), which gives access to parameters above 9. Incidentally, the dollar symbol ($) at the front of all these variables is a request to the shell to substitute the value of the variable at that point. All variable names used in all the shell types need to be prefixed with the dollar if you want the value substituted (See - Parameter Substitution).
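As a taster of the shift feature, here is a minimal sketch. I use set -- to fake a command line of ten arguments so that the fragment can be run on its own; the argument values are made up. Note that writing $10 does not give you the tenth parameter: the shell reads it as $1 followed by a literal 0.

```shell
#!/bin/sh
# Simulate a command line with ten arguments (values are made up).
set -- a b c d e f g h i j

echo "ninth parameter is [$9]"

# $10 is read as $1 followed by a literal 0, not as the tenth parameter.
wrong="$10"
echo "\$10 gives [$wrong]"

# shift discards $1 and moves everything down one place,
# so the old tenth parameter is now reachable as $9.
shift
tenth="$9"
echo "after shift, ninth parameter is [$tenth]"
```

The price of shift is that the old $1 is gone for good, so save anything you still need in a named variable first.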
Next is an odd looking one called $$, which returns the process id of this script. When UNIX executes a script it creates a process to handle the work, and this is its number. It is an integer between 1 (unlikely!) and 32767 on most systems. Every task that is run on a UNIX system has its own process id, which is why the number 1 is unlikely. There will be tens (maybe hundreds) of processes already running when you login, and you just get the next available number. When UNIX runs out of process id numbers, it wraps around and re-uses the ids of defunct processes, starting again at the lowest available number. The $$ parameter is not just a pointless random number generator. It is very useful when creating temporary files, for instance, where each instance of the script can create a unique temporary filename based on the process id (or PID).
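The temporary file idea might look something like the sketch below. The /tmp location and the "example" prefix are my assumptions for the illustration; use whatever suits your system.

```shell
#!/bin/sh
# Build a scratch file name from the process ID. The /tmp directory
# and the "example" prefix are assumptions made for this sketch.
tmpfile="/tmp/example.$$"

# Use the file for some intermediate data.
echo "scratch data" > "$tmpfile"
contents=`cat "$tmpfile"`
echo "wrote [$contents] to [$tmpfile]"

# Tidy up when finished.
rm -f "$tmpfile"
```

Two copies of the script running at the same time have different process ids, so they should not trample on each other's scratch files.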
Then we have $# or the parameter count. This returns an integer number representing the number of positional parameters (the $digits) following the script name. In the example script above, that would be three. I have not found a real limit on the count, other than the UNIX command line length. When dealing with counts larger than 1, this is a useful loop control parameter for use with the shift feature (See - Special Commands).
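The loop control idea can be sketched roughly as follows. Again I use set -- to stand in for a real command line, and the variable names seen and last are invented for the illustration.

```shell
#!/bin/sh
# Stand-in for the command line: example_1.1 fred bill bert
set -- fred bill bert

seen=0
last=""
# Keep going until shift has used up every positional parameter.
while [ $# -gt 0 ]
do
    seen=`expr $seen + 1`
    last="$1"
    echo "parameter $seen is [$1]"
    shift
done

echo "processed [$seen] parameters, the last was [$last]"
```

Each pass handles $1 and then shifts it away, so $# counts down to zero and the loop stops by itself however many parameters were given.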
Next we have the $@ parameter which lists out the complete set of positional parameter values found on the command line (excluding $0), a handy way to pass them all on to a sub-script or function.
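One point worth testing for yourself is the effect of double quotes around $@ when a parameter contains a blank. In this sketch the function count_args is a hypothetical helper of my own, and the parameter values are made up.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many
# arguments were passed on to it.
count_args() {
    echo $#
}

# Two parameters, the first of which contains a blank.
set -- "fred smith" bill

# Quoted, each parameter is passed on intact.
quoted=`count_args "$@"`

# Unquoted, the shell splits the values again at each blank.
unquoted=`count_args $@`

echo "quoted passes on [$quoted] parameters"
echo "unquoted passes on [$unquoted] parameters"
```

The quoted form should pass on two parameters and the unquoted form three, so use "$@" whenever a parameter might contain blanks.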
Lastly I have included the $- parameter which will list out the current flags in use. This parameter is volatile and will be updated to reflect the status of any set commands processed during script execution (See - Flags for a complete listing of invocation flags).
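To watch $- change, try a sketch like this one. I have picked the -f flag (which switches off filename generation) purely because it is harmless to toggle; the variable names are invented for the illustration.

```shell
#!/bin/sh
# Record the flags at three points in the script.
before=$-

set -f              # switch a flag on part way through
during=$-

set +f              # and switch it off again
after=$-

echo "before [$before] during [$during] after [$after]"
```

The letter f should appear in the middle value only. The other letters shown depend on how your shell was started.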