awk provides a number of predefined variables that simplify accessing and processing the data file.
The following examples make use of awk syntax and commands not yet covered. I have provided some description in these cases, but will cover these topics in detail in later modules.
In the following exercise, we use : as the field separator and show the 5th field of the passlist file, a shortened version of the password file. The 5th field is the Gecos field, which usually holds the user's real name.
The BEGIN pattern allows us to set FS before the 1st record is read. By setting FS to a colon, we indicate that every colon is to be treated as a separate field separator. As a result, when we print the 5th field, an empty Gecos field is displayed as an empty line.
Note that we could also use the option -F":" to initialize FS from the command line.
awk 'BEGIN { FS=":" } { print $5 }' ~berezin/Data/passlist
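As a rough check that the two ways of setting FS behave the same, the sketch below runs both forms against two made-up records. The sample variable and its contents are hypothetical stand-ins for the passlist file, not its real data; the second record has an empty Gecos field.

```shell
# Hypothetical sample records standing in for the passlist file.
sample='alice:x:1001:100:Alice Smith:/home/alice:/bin/sh
bob:x:1002:100::/home/bob:/bin/sh'

# Form 1: set FS inside a BEGIN block.
printf '%s\n' "$sample" | awk 'BEGIN { FS=":" } { print $5 }'

# Form 2: set FS from the command line with -F.
printf '%s\n' "$sample" | awk -F":" '{ print $5 }'
```

Each command should print the real name for the first record and an empty line for the second.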
FS may also be set to a regular expression when alternative delimiters must be recognized, or when a run of repeated delimiters should be treated as a single delimiter.
In the following exercise, we treat one or more consecutive colons as a single delimiter.
awk 'BEGIN { FS=":+" } { print $5 }' ~berezin/Data/passlist
In this second exercise, wherever the Gecos field was empty (two colons in a row), the two colons were treated as one delimiter, so we retrieved the home directory instead of an empty Gecos field. That is probably not what we want here, so the 1st example is the preferred command.
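To make the difference between the two settings of FS visible, here is a small sketch using two made-up records (hypothetical data, not the real passlist file). The brackets around $5 are only there so that an empty field can be seen in the output.

```shell
# Hypothetical sample records; the second has an empty Gecos field.
sample='alice:x:1001:100:Alice Smith:/home/alice:/bin/sh
bob:x:1002:100::/home/bob:/bin/sh'

# Every colon is its own separator: the empty 5th field is kept.
printf '%s\n' "$sample" | awk 'BEGIN { FS=":" } { print "[" $5 "]" }'

# A run of colons is one separator: later fields shift left by one.
printf '%s\n' "$sample" | awk 'BEGIN { FS=":+" } { print "[" $5 "]" }'
```

With FS=":" the second record prints [], while with FS=":+" it prints [/home/bob].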
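The exception to a single-character FS matching each occurrence separately is the space. If FS is set to a single space (which is also its default value), then one or more contiguous spaces and/or tabs will be treated as a single delimiter, and leading and trailing whitespace is ignored.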
We will look at a way around this later.
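The special handling of a single space can be seen with a short sketch. The input line is made up for illustration: it has leading and trailing spaces, a run of three spaces, and a tab.

```shell
# Made-up input: leading spaces, a run of spaces, a tab, trailing spaces.
printf '  one   two\tthree  \n' |
awk 'BEGIN { FS=" " } { print NF; print $2 }'
```

awk should report 3 fields and print two as the 2nd field, since each run of spaces and tabs counts as one delimiter.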
Because NF is reset for each record, if you need to remember the largest number of fields read in, you will have to write some code that saves NF to another variable whenever it exceeds the largest value seen so far.
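One possible way to do this is sketched below. The variable name maxnf and the three input lines are made up for illustration. The END pattern, which runs after the last record has been read, is used to print the result.

```shell
# Three made-up records with 3, 5, and 2 fields.
printf 'a b c\na b c d e\na b\n' |
awk '
  NF > maxnf { maxnf = NF }                 # save NF when it beats the largest so far
  END        { print "largest NF:", maxnf + 0 }   # + 0 prints 0 if there was no input
'
```

For this input the command prints largest NF: 5.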
For example:
mytext=data&pass=password&mysel=down&oked=on&choice=one&bdoit=button
represents the information returned from a form-style web page: 6 different pieces of data returned as field/value pairs. Each pair is delimited by an ampersand, and the two elements of each pair are separated by an equals sign.
If we set RS (the record separator) to &, awk will parse the input into six records, each consisting of one field/value pair.
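A minimal sketch of this follows. Setting FS to = as well, so that each pair splits into a name and a value, is an addition of mine rather than part of the description above. The string is sent with printf '%s' so that no trailing newline ends up inside the last record.

```shell
# The form data from the example above, held in a shell variable.
query='mytext=data&pass=password&mysel=down&oked=on&choice=one&bdoit=button'

# RS="&" makes each field/value pair a record; FS="=" splits name from value.
printf '%s' "$query" |
awk 'BEGIN { RS="&"; FS="=" } { print $1, $2 }'
```

This should print six lines, one per pair, starting with mytext data and ending with bdoit button.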
We will look at additional predefined variables as various topics come up.