
Pipes, Lists & Redirection

Now, as promised, a closer look at the pipe, list, and redirection characters and their functionality.

Pipe Dreams:

Pipes are a UNIX feature that lets you connect several commands together on one line and pass data from one to the next, much like a chain of firemen sending buckets of water down a line. The data in the bucket is processed by each command and then passed on to the next command without ever coming up for air (it is never written to an intermediate file). This happens because of two things:

Most UNIX commands get input from stdin and pass output to stdout
The pipe symbol (|) directs UNIX to connect stdout from the first command to the stdin of the second command.

So that sounds simple. How does it work in practice, and what does a command pipe look like? The example below shows several pipes made up of groups of commonly piped commands. You will see these syntax structures somewhere in the code of most scripts.

Example pipes

	line_count=`wc -l $filename | cut -c1-8`
	process_id=`ps -ef \
		   | grep $process \
		   | grep -v grep \
		   | cut -f1 -d\	`
	upper_case=`echo $lower_case | tr '[a-z]' '[A-Z]'`

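In all cases the pipeline has been used to set a variable to the output of the last command in the pipe (the backquotes capture whatever that command writes to stdout). In the first example, the wc -l command counts the number of lines in the file named by the variable $filename. Its output is then piped to the cut command, which keeps only the first 8 characters and passes them on to stdout, hence setting the variable line_count. This relies on wc printing the count in a fixed-width column ahead of the filename, which not every version of wc does, so check the output on your system.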
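Where the width of the wc output cannot be relied on, one variation is to pipe the file into wc, so that only the count is printed and there is no filename to cut away. This is a minimal sketch rather than one of the original examples; the scratch file name and its three lines are made up for the demonstration.

```shell
# Hypothetical scratch file with three lines, just for this demonstration
filename=${TMPDIR:-/tmp}/pipe_demo.$$
printf 'one\ntwo\nthree\n' > "$filename"

# wc -l reading from the pipe prints only the count, with no filename after it
line_count=`cat "$filename" | wc -l`

# Some versions of wc pad the number with blanks; expr hands back the bare number
line_count=`expr $line_count + 0`
echo "line_count is $line_count"

rm -f "$filename"
```

The extra expr step is only there to strip any padding, so that the variable can be compared as a plain number later in a script.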

In the second example, the pipeline has been folded using the backslash and we are searching for the process_id or PID of an existing command running somewhere on the system. The ps -ef command lists the whole process table from the machine. Piping this through to the grep command will filter out everything except any line containing our wanted process string. This will not return just one line, however, as the grep command itself also has the process string on its command line. So by passing the data through a second grep -v grep command, any lines containing the word grep are also filtered out. We now have just the one line we need and the last thing is to get the PID from the line. The example assumes the PID is the first thing on the line, so piping through a version of cut using the field option finally gives the PID we are looking for. Note the field option delimiter character is an escaped tab character here. Always test the layout and the blank characters that UNIX commands return, as they are not always what you would think they are: on many systems ps -ef prints the UID in the first column and the PID in the second, padded with spaces rather than tabs, so adjust the field number and delimiter to suit.
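The sketch below replays this pipeline against a canned sample of ps -ef output, so the result is repeatable. The sample lines, the process name mydaemon and the use of awk in place of cut are all assumptions made for illustration: awk splits on any run of blanks, which suits a listing where the PID sits in the second column padded with spaces.

```shell
# Canned sample of ps -ef output (made-up values) standing in for the real command
sample_ps='UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 09:00 ?        00:00:01 /sbin/init
fred      4321     1  0 09:05 ?        00:00:00 /usr/bin/mydaemon
fred      4400  4390  0 09:06 pts/0    00:00:00 grep mydaemon'

process=mydaemon   # hypothetical process name

# Same shape as the original pipeline: select, drop the grep line, pick the PID field
process_id=`echo "$sample_ps" \
           | grep $process \
           | grep -v grep \
           | awk '{ print $2 }'`
echo "process_id is $process_id"
```

In a real script the echo of the sample would be replaced by ps -ef itself, once you have checked which column holds the PID on your system.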

You should be able to work out the last example yourself based on the variable names alone. Note that '[a-z]' and '[A-Z]' are a shorthand version of the complete alphabet we used for the tr in Example Function Syntax.
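As a quick check of that shorthand, the following sketch runs the same text through tr twice, once with the ranges and once with both alphabets spelled out. The sample text and the variable names short_form and long_form are invented for the demonstration.

```shell
lower_case="pipe dreams"

# Range shorthand, as in the example above
short_form=`echo $lower_case | tr '[a-z]' '[A-Z]'`

# The same translation with both alphabets written out in full
long_form=`echo $lower_case | tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'`

echo "$short_form / $long_form"
```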

Lists:

Lists look similar to pipes except that the pipe symbol '|' between the commands is replaced by one of the following list symbols: ';', '&', '&&', or '||'. The list is optionally terminated by ';' or '&'. The shell treats the semi-colon character like a Carriage Return, so a list of commands separated by semi-colons behaves in much the same way as a list of commands on separate lines would behave (hence the name). The difference is that every type of list can be executed either in the current shell or in a sub-shell by using a slightly different syntax, and the output from the completed list can be redirected (see below) or piped (see above). The two syntax forms for shell locations of lists are shown in Current Shell and Sub-Shell below. Hint: it's all in the brackets.

Current Shell:

{ command; command; command; }

Sub-Shell:

( command; command; command; )
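The practical difference between the two forms shows up as soon as a list changes a variable. The following is a small sketch with invented variable names: an assignment made inside the round brackets is lost when the sub-shell exits, while one made inside the curly brackets survives. It also pipes the output of a whole list into one command.

```shell
counter=1

# Sub-shell: the assignment happens in a child copy of the shell and is lost
( counter=2; )
after_sub_shell=$counter

# Current shell: the assignment is still in force afterwards
{ counter=3; }
after_current_shell=$counter

echo "after sub-shell: $after_sub_shell, after current shell: $after_current_shell"

# The output of a complete list can be piped on as a single stream
list_lines=`{ echo first; echo second; echo third; } | wc -l`
list_lines=`expr $list_lines + 0`   # strip any padding from wc
echo "lines from the list: $list_lines"
```

Note that the curly bracket form needs a blank after the opening bracket and a semi-colon (or newline) before the closing one.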

The other list symbols change the way the list is processed and they have the following meanings:

& Asynchronously executes the preceding pipeline (as a background task)
&& Execute only if preceding command or pipe terminated with zero exit status (i.e. it exited okay)
|| Execute only if preceding command or pipe terminated with non-zero exit status (i.e. it failed)
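A short sketch of the three symbols in action, using the standard true and false commands to supply a zero and a non-zero exit status. The variable names are made up for the demonstration.

```shell
and_result="skipped"
or_result="skipped"
never_set="skipped"

true  && and_result="ran"     # true exits with zero, so the assignment runs
false && never_set="ran"      # false exits non-zero, so this assignment is skipped
false || or_result="ran"      # the same failure makes the || side run

# & starts the command as a background task; $! holds its process id
sleep 1 &
background_pid=$!
wait $background_pid          # pause here until the background task has finished
background_status=$?

echo "$and_result $never_set $or_result $background_status"
```

A common use of the same idea is a one-line guard such as a test followed by || and an error message, which runs the message only when the test fails.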

Redirects:

The matter of redirecting input and output follows a similar principle to that of piping. The significant differences are that redirects work with files, not commands, and that the number you can put on one line is limited by the number of file descriptors available. Whereas the pipe connects one command's output to the next command's input, a redirect tells a command to put its output into a file or collect its input from a file. There is also a difference in the syntax due to the way UNIX executes its commands.

Normally UNIX will try to find an executable file somewhere on the command path ($PATH variable) which matches the first word on the current command line. So the first command in a pipe gets found, then executed, and data is piped on to the next command. Because redirecting is for files and not commands, a file cannot simply take the place of a command at the head of the line; it is attached to a command with a redirect symbol, and by convention it follows that command. Take another look at the last pipe in the above example. Rewriting this command as a redirect would give the following:

tr '[a-z]' '[A-Z]' < $in_file > $out_file

Now you can see the difference. The command comes first, the in_file is directed in by the less_than sign (<) and the out_file is pointed at by the greater_than sign (>). Each redirect names exactly one file: a wild card that matches several files cannot be used to feed them all in through a single less_than sign, so if you need several input files, join them with cat and pipe the result into the command. The out_file must likewise be a single, unique name. Just remember the in_file points its arrow at the command, while the out_file gets pointed at.
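To see the redirected tr command run from start to finish, here is a minimal sketch that creates its own input. The scratch file names and the sample text are assumptions made for the demonstration only.

```shell
# Hypothetical scratch files for the demonstration
in_file=${TMPDIR:-/tmp}/redirect_in.$$
out_file=${TMPDIR:-/tmp}/redirect_out.$$
echo "shout this" > "$in_file"

# The command first, then its input file, then its output file
tr '[a-z]' '[A-Z]' < "$in_file" > "$out_file"

result=`cat "$out_file"`
echo "$result"

rm -f "$in_file" "$out_file"
```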

The redirect arrows can also be doubled up as in the next example. Here the output from the cat command is a file as before. The double greater_than (>>) directs the output to be appended to the file, if it already exists. If the file does not exist, it is created. The single arrow form, as used above, would always create a new file if there was none there, or overwrite an existing file.

On the input side the double arrow has a slightly different meaning. Here, where the single arrow gets the input from a file, the double arrow gets its input from the shell file that is currently executing. You may be wondering how the shell can tell where the end of this input is and where the continuing shell script restarts. Well, that is the reason for the word following the double less_than sign. The word is a marker that the shell will look out for when reading the input stream. When the word shows up, input will stop and the script will continue.

Example redirected cat

cat >> $out_file << EOF
first line of data
second line of data
more data
the end of the data
EOF

In this case I have used the flag word EOF to indicate the End Of File. However, any word will be acceptable as long as it never appears on a line of its own within the data; keeping each flag word unique in the script file is the simplest way to make sure of that. What I generally do is use EOA, EOB, EOC, etc., if I create several files within a script. Capital letters are not required, although the closing word must match the opening word exactly, and capitals do make the flags easy to match up as a pair when reading the script.
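The following sketch puts the single and double output arrows side by side, each fed from the script itself, and follows the EOA, EOB, EOC naming habit described above. The scratch file name and the data lines are invented for the demonstration.

```shell
out_file=${TMPDIR:-/tmp}/heredoc_demo.$$

# Single arrow: create the file (or overwrite it) with one line
cat > "$out_file" << EOA
first line of data
EOA

# Double arrow: append a second line to the same file
cat >> "$out_file" << EOB
second line of data
EOB
appended_lines=`cat "$out_file" | wc -l`
appended_lines=`expr $appended_lines + 0`   # strip any padding from wc

# Single arrow again: the two earlier lines are replaced
cat > "$out_file" << EOC
replacement line
EOC
final_content=`cat "$out_file"`

echo "$appended_lines lines after the append, then: $final_content"
rm -f "$out_file"
```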

There is one more thing you can do with this redirected input from a script file. Look at this next example and you will see a minus sign between the double arrows and the flag name:

cat >> $out_file <<-EOF

This instructs the shell to remove leading tab characters from all lines in the input stream, including the line holding the matched flag word. This handy trick allows you to indent the data like code, which makes the script much easier to read. The following example is a copy of the previous code segment in this new, easier to read format.

Example indented cat

cat >> $out_file <<-EOF
	first line of data
	second line of data
	more data
	the end of the data
EOF

Now it looks more like a code block and the flag word stands out too. The section which is indented is easily understood to be the contents of the created file. This feature is not available in the C Shell.
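Because tab characters are invisible on the page, the sketch below builds two tiny scripts with printf so that every tab is written out as \t, then runs each through sh. It is only an illustration: the script text and the variable names are made up, and it assumes a Bourne-compatible sh is on the command path. The first script uses the plain double arrow and keeps the tab; the second uses the minus sign form, which strips the tab from the data and from the indented flag word.

```shell
# Two one-line here-document scripts; \t marks each leading tab
plain_script=`printf 'cat << EOF\n\tindented line\nEOF\n'`
dash_script=`printf 'cat <<-EOF\n\tindented line\n\tEOF\n'`

# Run each script in a separate shell and capture what cat prints
plain_output=`printf '%s\n' "$plain_script" | sh`
dash_output=`printf '%s\n' "$dash_script" | sh`

echo "plain: [$plain_output]"
echo "dash:  [$dash_output]"
```

Only tabs are stripped by the minus sign form, so an editor that converts tabs to spaces will quietly break the indented layout.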

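In addition, the redirect arrows can actually redirect input and output, to and from the stdio files, known by their descriptor numbers (0, 1 and 2). These are the default input and output files for UNIX: standard input (0), usually connected to the keyboard, standard output (1), usually connected to the display, and standard error (2), which carries error messages and usually goes to the display as well. Thus the syntax: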

<&digit

uses the file associated with file descriptor digit as the standard input. The same goes for standard output if you reverse the arrow. You can also associate one file with another as in this example:

ls -l   $directory/*.log   >   $out_file   2>&1

Here we see an ls command outputting to out_file. At the end however, is another redirect which is indicating that stderr (file descriptor 2) should also be sent into the stdout (file descriptor 1), which in this case is our out_file. To say the same thing in C Shell the syntax looks simpler, but is harder to read because the descriptor numbers are missing:

ls -l   $directory/*.log   >&   $out_file
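Returning to the Bourne shell forms, the following sketch exercises both ideas: reading through a numbered descriptor with <&digit, and folding stderr into stdout with 2>&1. The scratch files, descriptor 3 and the messages are all choices made for the demonstration.

```shell
in_file=${TMPDIR:-/tmp}/fd_in.$$
out_file=${TMPDIR:-/tmp}/fd_out.$$
printf 'alpha\nbeta\n' > "$in_file"

# Open the file on descriptor 3, then read from it with <&3
exec 3< "$in_file"
read first_line <&3
read second_line <&3
exec 3<&-                     # close descriptor 3 again

# One message goes to stdout and one to stderr; 2>&1 sends both to out_file.
# The order matters: stdout is pointed at the file first, then stderr joins it.
{ echo "normal output"; echo "error output" >&2; } > "$out_file" 2>&1
captured_lines=`cat "$out_file" | wc -l`
captured_lines=`expr $captured_lines + 0`   # strip any padding from wc

echo "$first_line $second_line $captured_lines"
rm -f "$in_file" "$out_file"
```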

Don't forget, you can use this mechanism to create files with any content you like, generated from any combination of commands. Here is an example of a menu file listing the files in a directory. The menu can then be displayed on the screen and the user's choice picked out quite simply.

Example simple menu

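# Build the menu file: one numbered line per file in the directory
count=0
for file in `ls -1 $source_directory`
do
    count=`expr $count + 1`
    echo "$count:	$file"  >> $menu_file
done
echo "Please select a number from this menu"
cat $menu_file
# read takes the variable name without the dollar sign
read choice
echo "Thanks"
# Find the chosen line and keep the text after the colon
filename=`grep $choice $menu_file | cut -f2 -d:`
echo "You chose [$filename]"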

This example is very simplistic, however, and will not cope with filenames that contain digits or with filename lists longer than 9 lines. Both of these conditions could lead to the grep returning more than one line, which is an error condition (see Simple Menu Functions for a better solution to these problems).
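One way to see the problem, and one possible way round it, is sketched below. This is not the Simple Menu Functions version; it is an assumption-laden illustration that builds a twelve-line menu from made-up filenames, fixes the choice at 1 instead of reading it from the user, and anchors the grep pattern to the start of the line and the colon.

```shell
menu_file=${TMPDIR:-/tmp}/menu_demo.$$
: > "$menu_file"              # start with an empty menu file

# Build a twelve-line menu from made-up names; printf writes the tab as \t
count=0
while [ $count -lt 12 ]
do
    count=`expr $count + 1`
    printf '%s:\t%s\n' "$count" "file$count.log" >> "$menu_file"
done

choice=1                      # stands in for the value typed by the user

# The loose pattern matches every line that contains the digit 1
loose_matches=`grep $choice "$menu_file" | wc -l`
loose_matches=`expr $loose_matches + 0`

# Anchoring to the line start and the colon leaves exactly one line
filename=`grep "^$choice:" "$menu_file" | cut -f2 -d: | tr -d '\t'`

echo "loose pattern: $loose_matches lines, anchored pattern: [$filename]"
rm -f "$menu_file"
```

The anchored pattern still trusts the user to type a number that is on the menu, so a real script would also check that the result is not empty.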


This page was brought to you by rhreepe@injunea.demon.co.uk