Tentative grep Assignment
Part A - grep 60 points
CSCI 330 Unix

Due Thursday the 14th, 2006, in class at the start of class. Submit both a hard copy and an email copy.

For the following problems, create a file that contains the complete grep commands and regular expressions that solve them. Unless a particular problem indicates otherwise, run the grep statements on the data file /home/max/berezin/Data/grepdata.

Put your command sequences in a shell script file that runs the commands under the Bourne shell. To test your solution, run the shell script and then compare its output to the set of output files in my directory. Your shell script file should start with a line that forces the script to run under the Bourne shell.

Make the file executable and invoke it to run your grep statements.

To simplify the grep statements, set up variables at the top of the script. Put the path and filename of the data file into a variable. The first line of the script forces the grep commands to run under the Bourne shell; after it, set the two variables you will need.

For each problem, write a comment line describing the problem. You may add additional comment lines. A comment line must start with a #. On a separate line, write the grep statement, sending its output to the gout file named for that problem (gout.01, gout.02, and so on).
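
The layout described above might look roughly like the sketch below. It assumes that #!/bin/sh is the intended first line and that the two variables are a directory and a data file path. The variable names dir and data, the value given to dir, and the warm-up problem 00 are illustrative guesses, not the instructor's exact text; only the data file path and the gout naming come from the assignment.

```shell
#!/bin/sh
# Assumption: dir is the directory that holds the data files,
# and data is the full path of the main data file.
dir=/home/max/berezin/Data
data=$dir/grepdata

# Problem 00 (made-up warm-up, not one of the graded problems):
# find all lines that contain the string unix.
# The readability guard only keeps this sketch harmless on a machine
# where the data file is absent; a real solution would not need it.
if [ -r "$data" ]; then
    grep 'unix' "$data" > gout.00
fi
```

Saved under a name such as greps.sh (a hypothetical name), the script could be made executable with chmod +x greps.sh and run as ./greps.sh.
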

To check your answers, run your shell script. It will return an error, generate one or more gout files, or both. Use diff or cmp to compare your output file with my output file. My output will be in /home/max/berezin/ans. The easiest way to perform the comparison is to write an alias called diffit that compares your gout file for a problem with the matching file in my directory. To test your output, simply run diffit with the problem number.

1. For each of the following problems, write a complete, runnable grep statement. Do not use command line options to solve a problem unless specifically told to do so. Find all lines that:

gout.01 contain the string house. The string may be part of a larger word.

gout.02 contain the word house

gout.03 contain the word house or House

gout.04 contain the word house in upper case, lower case, or mixed case. Use a command line option.

gout.05 contain the word house with any of its characters in either upper or lower case or mixed case. Use only regular expressions, no command line options.

gout.06 contain the word house at the end of the line.

gout.07 contain exactly 15 characters of any value on the line.

gout.08 are empty, with not even whitespace.

gout.09 are blank but not empty, that is, they DO have whitespace.

gout.10 have the same alpha character repeated three times or more in sequence, such as iii.

gout.11 have the same lowercase alpha character repeated in two other places anywhere on the line. Each occurrence may be separated from the others by any number of other characters.

gout.12 have a pair of alpha characters repeated elsewhere on the line. Treat the pair as a single unit, not as two separate repeating characters.

gout.13 have two alpha characters that are next to each other repeated as a pair in reverse order elsewhere on the line (still next to each other, but not necessarily next to the first pair). For this, you will need to reference each character separately.

gout.14 contain a word that appears more than once on the line, as a word in both places. The occurrences may be separated by any number of other characters.

gout.15 contain a sequence of non-whitespace characters that appears more than once on the line. The occurrences may be separated by any number of other characters.

gout.16 contain a word that begins and ends with a vowel in upper or lower case. You will not be able to get both single-character words and multi-character words with the same grep, so don't worry about single-character words such as I or a.

gout.17 contain no whitespace. There must be at least one character on the line. Do not use options.

gout.18 contain only numbers, each consisting of one or more digits. Lines may be padded with spaces or tabs before or after the number(s). There may be more than one number on the line.

gout.19 contain only one number; however, this number may be negatively signed. Whitespace padding may exist before and/or after it.

gout.20 begin with anything other than t9 or z9. Use options to simplify. Use $dir/passwd for the file to search.

gout.21 (2) begin with anything other than t9 or z9. Don't use options, but you can use more than one grep and append the output to the single output file. Maintaining the order of lines is NOT important. Use $dir/passwd for the file to search.

gout.22 (2) begin with t9 or z0. Options are OK and you can use more than one grep. Maintain the order of lines by piping the output of one grep to the next. Use $dir/passwd for the file to search.

gout.23 (2) contain an even number in the third field. Assume the lines you are searching consist of fields separated by : and that the fields may vary in size. Use $dir/passwd for the file to search.

gout.24 (2) Apply grep to the file $dir/processes and find all processes whose process id is an even number. Do not use a character column count to determine the location of the pid; instead, look for the end of the second field on the line, where each field is delimited by one or more spaces (you may assume no tabs are used).

gout.25 (2) Apply grep to the file $dir/processes and find all processes that are daemon processes. Daemon processes are usually given a name that ends with a d, e.g. named, inetd, talkd. Do not use character column counts to determine the location.

Example :

    root 23960   227  0 09:25:37 ?        0:02 /opt/ssh/sbin/sshd -R

Attack from the end, not the beginning. The line may or may not end with a d, because options after the name may contain other characters.

Because time is an unusual argument, describe the line between the timestamp and the end of the line.
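
The idea of working backward from the end of the line can be tried out on a few made-up lines before running anything against the real file. The sketch below is only an illustration of the $ anchor on invented data; the sample lines, the sample function, and the pattern are hypothetical and are not a solution to any numbered problem.

```shell
#!/bin/sh
# Made-up sample lines; the second one has trailing spaces after "done".
sample() {
    printf 'job1 done\njob2 done   \njob3 done later\n'
}

# Work backward from the end of the line: $ anchors the match at the end,
# ' *' allows optional trailing spaces, and 'done' must sit right before them.
count=$(sample | grep -c 'done *$')
echo "$count"
```

Here the first two lines match and the third does not, because other text follows done before the end of the line.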