grep/sed Assignment
Part A - grep 40 points
CSCI 330 Unix

Due 1 April 2004 Thursday Night - Close of CSL lab
Both hard copy and email copy (email must be in by close of lab).

For the following problems, create a file that contains the complete grep and regular expression command sequences that solve them. Unless a particular problem indicates otherwise, run the grep statements on the data file /home/max/berezin/Data/grepdata

Put your command sequences in a shell script file that runs the commands under the Bourne shell. To test your solution, run the shell script and then compare its output to a set of output files in my directory. Your shell script file should start with

Make the file executable and invoke it to run your grep statements

To simplify the grep statements, set up the following variables. Put the path and filename of the data file into a variable. The statement above forces the grep commands to run under the Bourne shell. The proper syntax for setting the two variables you will need is:
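As a rough sketch of how such a script might be laid out: the header line, two variables, and one commented grep statement could look like the following. The name dir matches the $dir used in later problems, but the name file, the output name gout.demo, and the sample data are assumptions made for illustration only. A scratch directory stands in for /home/max/berezin/Data so the sketch runs anywhere, and mktemp and $(...) assume a POSIX-style sh rather than a historical Bourne shell.

```shell
#!/bin/sh
# Assumed names: "dir" holds the data directory, "file" the data file.
# For the assignment, dir would point at /home/max/berezin/Data instead
# of this scratch directory with two made-up sample lines.
dir=$(mktemp -d)
file=$dir/grepdata
printf 'my house\nyour car\n' > "$file"

# Demo problem: find all lines that contain the string car
grep 'car' "$file" > "$dir/gout.demo"

# Show what the demo problem produced
cat "$dir/gout.demo"
```

Saved under a hypothetical name such as grepscript, the file would be made executable with chmod +x grepscript and then run as ./grepscript.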

For each problem, write a comment line describing the problem. You may add additional comment lines. A comment line must start with a #. On a separate line, write the grep statement in the form.

where

To check your answers, run your shell script. It will return an error, generate one or more gout files, or both. Use diff or cmp to compare your output file and my output file. My output will be in /home/max/berezin/ans. The easiest way to perform the comparison is to write an alias and call it diffit. diffit will contain the following:

To test your output, simply run diffit with the problem number.

1. For each of the following problems, write a complete grep statement that is runnable. Do not use command line options to solve a problem unless specifically told to do so. Find all lines that:

gout.1 contain the string house. The string may be part of a larger word.

gout.2 contain the word house

gout.3 contain the word house or House

gout.4 contain the word house in either upper or lower case or mixed case. Use a command line option.

gout.5 contain the word house in either upper or lower case or mixed case. Use only regular expressions, no command line options.

gout.6 contain the word house at the end of the line.

gout.7 contain exactly 15 characters of any value on the line.

gout.8 are empty, not even whitespace

gout.9 are blank, but not empty - DO have white space.

gout.10 have the same alpha character repeated three times or more in sequence - such as iii

gout.11 have the same lower case alpha character repeated in two other places anywhere on the line. The occurrences may be separated by any number of other characters.

gout.12 have a pair of alpha characters repeated elsewhere on the line. Treat the pair as a single unit, not as two separate repeating characters.

gout.13 have two alpha characters that are next to each other repeated as a pair in reverse order elsewhere on the line (still next to each other but not necessarily next to the first pair). For this, you will need to reference each character separately.

gout.14 contain a word that appears more than once on the line as a word in both places. The occurrences may be separated by any number of other characters.

gout.15 contain a sequence of non-white space characters that appears more than once on the line. The occurrences may be separated by any number of other characters.

gout.16 contain a word that begins and ends with a vowel in upper or lower case. You will not be able to get both single character words and multi-character words with the same grep, so don't worry about single character words such as I or a.

gout.17 contain no whitespace. There must be at least one character on the line. Do not use options.

gout.18 contain only numbers, each consisting of one or more digits. Lines may be padded with spaces or tabs before or after the number(s). There may be more than one number on the line.

gout.19 contain only one number; however, this number may be negatively signed. White space padding may exist before and/or after.

gout.20 begin with anything other than t9 or z9. Use options to simplify. Use $dir/passwd for the file to search.
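The building blocks these problems rely on can be tried out on a small made-up file before running them against grepdata. The sketch below is only an illustration: the sample lines, the word cat, and every variable name are assumptions chosen for the demo. It shows word anchors, per-letter bracket expressions, line anchors with a repeat count, empty versus blank lines, and a back-reference. The \< and \> word anchors are a widely supported extension (GNU grep has them) rather than part of basic POSIX regular expressions, so check that the grep on your system accepts them.

```shell
#!/bin/sh
# Illustration only: sample data and variable names are made up for this sketch.
file=$(mktemp)
printf '%s\n' 'the cat sat' 'concatenate' 'CAT' '' '   ' 'abcde' 'a book' > "$file"

# \< and \> tie the match to the start and end of a word.
word=$(grep '\<cat\>' "$file")

# One bracket expression per letter matches any mix of case without -i.
anycase=$(grep '[Cc][Aa][Tt]' "$file")

# ^ and $ pin the pattern to the whole line; \{5\} repeats the dot 5 times.
fivechars=$(grep '^.\{5\}$' "$file")

# ^$ matches empty lines; -c prints a count instead of the lines.
empty=$(grep -c '^$' "$file")

# One or more whitespace characters and nothing else on the line.
blank=$(grep -c '^[[:space:]][[:space:]]*$' "$file")

# \(...\) remembers what it matched; \1 demands the same text again.
doubled=$(grep '\([a-z]\)\1' "$file")

echo "word match:      $word"
echo "any case:"
echo "$anycase"
echo "five characters: $fivechars"
echo "empty lines:     $empty"
echo "blank lines:     $blank"
echo "doubled letter:  $doubled"

rm -f "$file"
```

On this sample data the word match returns only the cat sat, the any-case pattern returns the three lines containing cat in some case, and the back-reference returns a book because of the doubled o.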


STOP - do not worry about the following questions. I will use them as examples in class.

gout.21 (2) begin with anything other than t9 or z9. Don't use options, but you can use more than one grep and append the output to the single output file. Maintaining order of lines is NOT important. Use $dir/passwd for the file to search.

gout.22 (2) begin with t9 or z0. Options are ok and you can use more than one grep. Maintain the order of lines by piping the output of one grep to the next. Use $dir/passwd for the file to search.

gout.23 (2) contain an even number in the third field. Assume the lines you are searching consist of fields separated by : and may vary in size. Use $dir/passwd for the file to search.

gout.24 (2) Apply grep to the file $dir/processes and find all processes whose process id is an even number. Do not use character column counts to determine the location of the pid; instead, look for the end of the second field on the line, where each field on the line is delimited by one or more white spaces.

gout.25 (2) Apply grep to the file $dir/processes and find all processes that are daemon processes. Daemon processes are usually given a name that ends with a d, e.g. named, inetd, talkd. Do not use character column counts to determine location.
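The field-based problems above depend on skipping whole fields rather than counting columns. A minimal sketch of that idea follows, with invented stand-ins for $dir/passwd and $dir/processes; the file contents, the digits being matched, and the variable names field3 and field2 are all assumptions for illustration. The second pattern also assumes that lines do not begin with white space.

```shell
#!/bin/sh
# Illustration only: both sample files and all values in them are invented.
dir=$(mktemp -d)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/sh' \
  't9smith:x:1001:100:Smith:/home/t9smith:/bin/sh' \
  'z9jones:x:1002:100:Jones:/home/z9jones:/bin/sh' > "$dir/passwd"
printf '%s\n' \
  'root       1  0.0 init' \
  'root     137  0.0 inetd' \
  'max      248  0.1 vi' > "$dir/processes"

# [^:]*: consumes one whole colon-terminated field, so two copies of it
# skip to the third field. Here the third field must end in the digit 1.
field3=$(grep '^[^:]*:[^:]*:[0-9]*1:' "$dir/passwd")

# Same idea with whitespace-delimited fields: skip the first field and
# the white space after it, then require the second field to end in 7.
field2=$(grep '^[^[:space:]][^[:space:]]*[[:space:]][[:space:]]*[0-9]*7[[:space:]]' "$dir/processes")

echo "third field ends in 1:  $field3"
echo "second field ends in 7: $field2"
```

Because each pattern is anchored with ^ and walks the line field by field, digits later on the line (such as the 0.1 in the last sample process) cannot cause a false match.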