Command redirection and the command delimiters

When you invoke a command, like ls, you want to see the output and any errors on the terminal screen. However, there will be times when you want to preserve the output of a command or process it further with another command. Or there may be times when the data needed for input is already prepared in a file.

Unix provides a mechanism called redirection that allows you to "redirect" standard input from a file rather than the keyboard, and standard output and standard error to files rather than to the terminal screen.


The most common form of redirection is redirecting standard output.

ls > listing

When this command is invoked, the command interpreter creates the file listing, loads the command to run, and redirects standard output to the named file rather than to the screen. The target is usually a regular file, but it may also be a device such as /dev/null. The filename may be any legally constructed name. The default location for the file is the current working directory, but you may include path information in the filename to create the file in a different directory. Keep in mind that you must have appropriate permissions to create a file in the desired directory.

Take note that the file is created before the command is even executed, so if the file already exists as a regular file, the original contents will be destroyed.
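As a minimal sketch of why this matters, the following uses one file as both the input and the redirection target. The file name fruit and the scratch directory created with mktemp are assumptions made for illustration only.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

printf 'pear\napple\nfig\n' > fruit   # three lines of sample data

# The shell empties fruit BEFORE sort starts, so sort reads an empty file.
sort fruit > fruit

wc -c < fruit                         # reports 0 bytes: the data is gone
```

Writing to a different name (for example, sort fruit > fruit.sorted) avoids the problem.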

The order of the command and redirection specifier is not important and neither is spacing as long as the shell can differentiate the command and the target file.

ls>listing
>listing ls

Both of these will work.

Most shells provide a mechanism to prevent redirection from overwriting, or clobbering, an existing file. In bash, this is done by setting the shell option "noclobber".

set -o noclobber

The option uses minus and the letter o. To deactivate the noclobber feature, specify a plus and the letter o :

set +o noclobber
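The effect of the option can be sketched as follows. The file name out and the scratch directory are assumptions; the >| operator, which these notes do not otherwise cover, is the standard way to override noclobber for a single redirection.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

set -o noclobber
echo first > out          # out does not exist yet, so this is allowed

# With noclobber set, the shell refuses to overwrite out and ls never runs.
if ls > out; then
    result="overwritten"
else
    result="refused"
fi
echo "second redirection was $result"

echo second >| out        # >| overrides noclobber for this one redirection
set +o noclobber
cat out
```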

Exercise

#(Make sure you don't have the file out or pick a different name. At the prompt, run : )

ls > out

cat out

ls -l > out

cat out

#(The contents should have changed. Now set noclobber : )

set -o noclobber

#(Now run ps so the output is completely different. )

ps -f > out

cat out

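#(The shell should have refused to overwrite out, so its contents are unchanged. Turn noclobber back off and try again. )

set +o noclobber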

ps -f > out

cat out


As mentioned earlier, every time we redirect output to a particular file name, the file is created anew and any previous contents are lost. Unix provides another form of redirection that causes the output to be appended to the current contents of the specified file instead. This is referred to as appending output and uses a double greater-than symbol, >>.

cmd >> collected

Note that there are no spaces between the two > signs.

If the file does not exist, it will be created. In the bash shell, the noclobber mechanism has no effect on appending. In some other shells, if noclobber is active, you cannot append to a non-existent file.

It is possible to append ASCII text to a binary file or the other way around. This is not useful in most cases, but the append only requires that the file be writable.
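A short sketch of the difference between > and >>, assuming the illustrative file names collected and brandnew in a scratch directory:

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

echo "first run"  >  collected   # > creates the file (or empties an existing one)
echo "second run" >> collected   # >> adds to the end of the file
echo "third run"  >> collected
cat collected                    # all three lines are present

echo "fresh"      >> brandnew    # >> also creates a file that does not exist
cat brandnew
```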


Redirecting the error output.

The Unix system provides several interfaces, or file descriptors, for communicating between the user and the system. The three primary descriptors are standard input (stdin), referenced by the number 0, standard output (stdout), referenced by 1, and standard error (stderr), referenced by 2. When we look at bash in more detail, we will learn how to set up other file descriptors to use with redirection.

The > and >> operators implicitly use the standard output file descriptor; naming the descriptor explicitly, as in 1> and 1>>, works the same way. To redirect error messages sent via the standard error file descriptor, you must specify the descriptor number, 2>.

ls -R 2> errors

This allows all valid output to go directly to the screen but sends any error communication to the file "errors".

You cannot redirect standard output to more than one file with either redirection or the append mechanism. However, you can redirect standard output to one file and standard error to a different file.

ls -R > good 2> errors
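To see the two streams separated without depending on a particular directory tree, here is a small sketch. The function both is a hypothetical stand-in for any command that writes to both descriptors; inside it, >&2 sends a line to standard error.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical command: one line to stdout, one line to stderr.
both() {
    echo "normal output"
    echo "error message" >&2
}

both > good 2> errors   # descriptor 1 goes to good, descriptor 2 to errors

cat good                # contains only the stdout line
cat errors              # contains only the stderr line
```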

You can also direct standard output and standard error to the same file. While there are several ways to do this, the best is to use the & to combine both file descriptors.

cmd > file1 2> file1
cmd > file1 2>&1

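Both of these send the information written to the standard output and standard error descriptors to a single file. However, the first version opens file1 twice, so the two streams can overwrite each other's data. The second version, which makes descriptor 2 a copy of descriptor 1, is the preferred method.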

Exercise

#(Exercise : try it yourself)

#(Make sure you don't have the files out1, out2 or out3 or pick different names. At the prompt, run : )

ls -R ~berezin > out1 2> out1

ls -R ~berezin > out2 2>&1

#( The following will fail to redirect the error messages. )
ls -R ~berezin 2>&1 > out3

#( Use less to look at the files created. )

Make sure you have set up redirection for standard output before redirecting the standard error descriptor to standard out.

In general, the 2>&1 form (cmd > file 2>&1) is considered the preferred method.
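The ordering rule can be checked with a small sketch. The function both and the file names right and wrong are assumptions for illustration; redirections are processed left to right, so 2>&1 copies whatever descriptor 1 points to at that moment.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical command: one line to stdout, one line to stderr.
both() {
    echo "normal output"
    echo "error message" >&2
}

# stdout is pointed at the file first, then stderr is made a copy of it.
both > right 2>&1

# stderr is made a copy of the ORIGINAL stdout (the screen), and only
# afterwards is stdout pointed at the file: the error line escapes.
both 2>&1 > wrong

wc -l < right    # both lines were captured
wc -l < wrong    # only the stdout line was captured
```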


Redirecting input

Unlike output, there is only one file descriptor for input, standard input with a descriptor id of 0. To redirect input from a file rather than standard input, use the less than < symbol.

tr "[a-z]" "[A-Z]" < datain > translated_data
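One visible consequence of input redirection is that the command never learns the file name, because the shell opens the file and hands it over as standard input. A sketch, assuming a small sample file named datain in a scratch directory:

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

printf 'one\ntwo\nthree\n' > datain

wc -l datain      # wc opens the file itself, so it prints the count and the name
wc -l < datain    # the shell opens the file; wc sees only stdin and prints no name

# Input and output redirection combined on one command.
tr 'a-z' 'A-Z' < datain > translated_data
cat translated_data
```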

Standard input can also be redirected using double less, <<, or the "here" redirection.

cmd << eofflag

However, it functions differently than the append output >>. When used as the input for a command, << eofflag indicates input is still from stdin, but it also indicates that the end of input will be marked when the word eofflag is encountered on a line by itself.

eofflag may be any word. It may contain spaces if quoted. When used at the bash command prompt, you may use either the word indicated for the eofflag or [ctrl]d to signal end of input.

( Try the following at the prompt )

sort << done
one
two
three
four
five
almost done
and seven
done

Upon entering done and pressing the [enter] key, the list entered should appear on the screen sorted. Notice that it didn't sort when "almost done" was entered, because the word must appear on a line by itself. Also notice that the eofflag word is not treated as part of the data input.

Repeat the exercise, but rather than entering the word "done", press [ctrl]d.

The "here" redirection is most effective when used inside a shell script. It allows you to embed the data for the command being run in the shell script with the command.

The following program, called mailer, takes a name specified as the 1st command line argument, inserts it in the text following the sed command, and then mails the result to the person specified by the 2nd command line argument. Everything needed except the two arguments is in the shell script program.

#!/bin/bash
# mailer
# specify recipients 1st name as 1st argument and their email address as 2nd
# argument when running.

sed "s/recipient/$1/g" << done | mutt -s "Important" "$2"
Dear recipient

Email me as soon as possible.
done

To invoke mailer, use :

./mailer john berezin@lx.cs.niu.edu

The sed command reads its input from inside the script file and edits it, replacing recipient with the name given on the command line. It then pipes the edited message to the mutt command, which mails it to the email address specified as the 2nd command line argument. We will look at shell scripting in detail later, but you can already see the potential for embedding the data in the same file as the commands that use it.
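To try the same technique without sending any mail, here is a minimal sketch that prints the edited text instead. The function name greet is hypothetical and is not part of the mailer script; the here-document and the sed substitution are the same as above.

```shell
# greet: hypothetical variant of mailer that prints the letter
# to standard output instead of piping it to a mail program.
greet() {
    sed "s/recipient/$1/g" << done
Dear recipient

Email me as soon as possible.
done
}

greet john    # the first line of the output reads "Dear john"
```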


So far, we have looked at invoking a single command using redirection to and/or from files. Another way of moving data from or to a command is to pipe from another command or to one.

The pipe symbol is the vertical bar, |.

Piping allows you to take the output of one command and redirect it into the input of another command without the need to create a file to hold the data between the execution of the two programs. The Unix system will buffer any data if the 2nd command runs slower than the 1st.

find / | sort > sortedlist

One of the limitations of redirection is that you can only specify a single file. Piping combined with the cat (concatenate) command allows you to access several files for input.

cat file1 file2 file3 file4 | sort > sorted

And the tee command allows you to copy output to several files.

sort lotsofdata | tee sorted1 | tee sorted2 | uniq | tee unique | wc

This command sequence creates two copies of the sorted data, sorted1 and sorted2, then a copy of the sorted data with duplicates removed, unique, and finally applies the wc (word count) command to print the line, word, and character counts of the same data that was stored in unique.
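The pipeline can be tried on a small sample. This sketch assumes a four-line lotsofdata file created in a scratch directory; the other file names are the ones used above.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

printf 'pear\napple\npear\nfig\n' > lotsofdata   # sample data with one duplicate

sort lotsofdata | tee sorted1 | tee sorted2 | uniq | tee unique | wc

cat sorted1    # four sorted lines, duplicate still present
cat unique     # three lines, duplicate removed
```

Since tee accepts more than one file name, tee sorted1 sorted2 in a single stage would produce the same two copies.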

In general, the number of pipes is limited to the length of the command line buffer and the number of processes a user is allowed to run at one time.

The pipe function works with standard output, not with standard error. The way to pipe error messages varies from shell to shell. In the bash shell, you may pipe error messages by combining redirection of the error descriptor with piping standard output. Try the following :

ls -R ~berezin 2>&1 | less

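Note that the redirection of the error descriptor is written before the pipe symbol. The shell connects standard output to the pipe first and then applies 2>&1, so the error messages follow standard output into the pipe.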
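A sketch that counts what actually travels through the pipe. The function both and the two count files are assumptions for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical command: one line to stdout, one line to stderr.
both() {
    echo "normal output"
    echo "error message" >&2
}

# Errors are discarded, so only the stdout line reaches wc.
both 2> /dev/null | wc -l > stdout_only

# Errors are merged into stdout, so both lines reach wc.
both 2>&1 | wc -l > combined

cat stdout_only combined
```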


With the introduction of piping, we have also introduced the concept of multiple commands on the same line. Besides piping there are a few other methods for specifying and invoking several commands from a single command line.

; - the semi-colon command delimiter. Use the semi-colon to enter and separate several commands on a single command line.

ls -R > listing; ps; who; sort < listing > slisting

The semi-colon is a sequential delimiter. Each command is run in the order it is entered. The first command, ls, will run to completion, then ps, then who, and finally sort. Each command runs in the foreground, giving it complete access to stdin, stdout, and stderr unless redirection is specified. Because each command finishes before the next starts, a later command can use a file created by an earlier command.

& - the ampersand delimiter. Use the ampersand to enter several commands on a single command line that will run at the same time or concurrently.

find . -name "*.c" > clist 2> /dev/null & sort < flist > flist.s & vi a1.pas

The preceding command sequence tells the command shell to start the find command looking for any files ending with .c, storing what it finds in the file clist and sending any errors to /dev/null. As soon as that starts running, the shell starts the sort command to sort input from the flist file and store it in flist.s. Once that starts, the shell invokes vi to edit the file a1.pas in the foreground.

Because only one process or command can be interacting with standard input and output, any commands started with & are run as background processes without access to standard input. If a command needs input, the user must provide redirection from a file or the command's execution will be suspended until the user takes action. We will look at process and background job control in a later module.

While standard output and error do not need to be redirected for a background process, failure to do so will allow the output to appear on the screen while you are trying to perform other tasks.

Another issue with concurrent commands (processes) is backward referencing. If the second command reads its input from a file being created by the first command and they are running concurrently, it is possible that not all of the data will be in the file when the second command runs. Sequential execution allows the 1st command to complete before the 2nd starts, and piping will not allow the 2nd command to terminate until all data has been sent from the 1st command.
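The timing problem can be sketched with a deliberately slow background command. The file name result and the one-second delay are assumptions; wait is a shell built-in, belonging to the job control material covered later, that pauses until background commands have finished.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Background writer: takes about a second to produce its file.
sh -c 'sleep 1; echo finished > result' &

# The shell moves on immediately, so the file is not there yet.
if [ -e result ]; then
    early="present"
else
    early="missing"
fi
echo "right after launch, result is $early"

wait          # pause until the background command has finished
cat result    # now the data is complete
```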

&& - the double ampersand or AND delimiter - The AND is a conditional sequential delimiter. Like the semi-colon, the preceding command must finish before the next command is started. Unlike the semi-colon, the && will test the return code of the preceding command and will invoke the next command only if that status indicates success.

|| - the double pipe or OR delimiter - Similar to the && except that it continues only if the preceding command indicates a failure status.
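A minimal sketch of both conditional delimiters, using the standard true and false commands, which do nothing except succeed and fail respectively. The variable log is an assumption used only to record which commands ran.

```shell
log=""

true  && log="${log}A"   # runs: true succeeded
false && log="${log}B"   # skipped: false failed
false || log="${log}C"   # runs: false failed
true  || log="${log}D"   # skipped: true succeeded

echo "$log"              # prints AC
```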

() - parentheses. Parentheses allow you to group commands together and can be used with && and || to allow a more complex command invocation.

For example :

sort < file1 > file1.s && ( mail bob < file1.s ; cp file1.s backup )

This will run sort and, if it executes successfully, both the mail and the cp commands will be performed. However, the success or failure of the mail command will not affect the execution of the cp command.
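To see what the grouping changes, compare the same commands with and without parentheses after a command that fails. The file names u1, u2, g1, and g2 are assumptions for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Ungrouped: the ; ends the && test, so the second echo always runs.
false && echo one > u1 ; echo two > u2

# Grouped: the whole parenthesized list depends on the && test.
false && ( echo one > g1 ; echo two > g2 )

ls    # only u2 was created
```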

We will look at parentheses again when we examine jobs.