AWK assignment.

points.

Assignment has to work to get extra points.

If the script crashes at any point, it's a zero.

Given a file of expense entries of the following form :

,January,2014-01-06,Food,Aldis,26.76
3048,January,2014-01-06,Home,Bob's Heating,468
3088,March,2014-03-06,Home,Bob's Heating,146.17

Actual file will be ~berezin/Data/bank.data


Header prep : 10 points
Set FS to comma in the BEGIN portion of the script before processing records. This is also a good place to set up an array of month names in chronological (indexed) order.

Google awk split(). This can be useful to initialize an array of month names in the proper order.

Or you can assign one at a time :

5 points extra if you use split to create the indexed array of month names.
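
As a rough sketch of the split() approach (the array name "months", the counter "n", and the shell wrapper function are illustrative choices, not names the assignment requires), a BEGIN block might look like this:

```sh
# Hypothetical wrapper so the sketch can be run directly from a shell.
month_names() {
    awk 'BEGIN {
        FS = ","
        # split() fills months[1] .. months[12] in the order the names are
        # listed and returns the number of pieces it produced.
        n = split("January,February,March,April,May,June,July,August,September,October,November,December", months, ",")
        for (i = 1; i <= n; i++)
            print i, months[i]
    }'
}
month_names
```

Listing the names in chronological order is what makes months[1] January and months[12] December.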


Main program.

The 1st record is a charge on a debit card, hence no check #. The second record is for a written check. Individual fields are comma delimited.

  • 1st field is either a check number or blank.
  • 2nd field is the month the transaction occurred.
  • 3rd field is the date in the form yyyy-mm-dd.
  • 4th field is a general category of expense type such as Food or Utility.
  • 5th field is who the check is for.
  • 6th field is the amount.

    Write the script so that it reads the file ~berezin/Data/bank.data and accumulates various sums.

    The summary report will be generated in the END section of the script.


    Your script should sum the amount paid by check and by debit separately. Do this by testing if field 1 is empty (debit) or has numbers in it (check#).
    10 points.
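
One possible way to split the totals, shown only as a sketch: it runs on the three sample records from the top of this handout instead of the real file, and the variable names "checks" and "debits" and the wrapper function are assumptions.

```sh
# Hypothetical wrapper; the sample records are piped in so the sketch is self-contained.
check_debit_totals() {
    printf '%s\n' \
        ",January,2014-01-06,Food,Aldis,26.76" \
        "3048,January,2014-01-06,Home,Bob's Heating,468" \
        "3088,March,2014-03-06,Home,Bob's Heating,146.17" |
    awk 'BEGIN { FS = "," }
        $1 == "" { debits += $6; next }   # empty field 1: debit card charge
                 { checks += $6 }         # anything else: a check number is present
        END {
            printf "Checks : %.2f\n", checks
            printf "Debits : %.2f\n", debits
        }'
}
check_debit_totals
```

With these three records the sketch reports 614.17 for checks and 26.76 for debits.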


    Use an associative array to sum expenses for each category type. Use field 4, the expense category, as the key to select the array element, and use field 6 as the amount to be accumulated.
    10 points
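
A minimal sketch of the associative array idea, again on the three sample records (the array name "bycat" and the wrapper function are illustrative):

```sh
# Hypothetical wrapper; sample records are piped in instead of reading bank.data.
category_totals() {
    printf '%s\n' \
        ",January,2014-01-06,Food,Aldis,26.76" \
        "3048,January,2014-01-06,Home,Bob's Heating,468" \
        "3088,March,2014-03-06,Home,Bob's Heating,146.17" |
    awk 'BEGIN { FS = "," }
        { bycat[$4] += $6 }               # field 4 is the key, field 6 the amount
        END {
            # for (key in array) visits every key, in no guaranteed order
            for (c in bycat)
                printf "%s %.2f\n", c, bycat[c]
        }'
}
category_totals
```

The element bycat["Home"] does not have to be created first; awk treats a missing element as 0 the first time += touches it.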


    Also, provide a summary of monthly totals by testing field 2 for the month. Use an associative array keyed on the month, e.g. January.
    10 points

    Extra credit 5 points
    Field 3 has the exact date in the format yyyy-mm-dd. The month is the middle 2 digits of the field, so -01- is January, -02- is February, etc. You may use either an associative or indexed array, but when you print the report, give the month name.
    See below about using substr and sprintf to cut just the month out of the date field and convert the string to a number.
    10 points

    Use printf to format nicely.
    10 points - remember to left justify labels and right justify amounts with 2 decimal places.
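
As a sketch of the formatting, the column widths 15 and 10 below are assumptions picked to give 25-character lines; any widths that line up will do. The labels and amounts are copied from the sample report in this handout.

```sh
# Hypothetical wrapper around a BEGIN-only awk program; no input file is needed.
format_demo() {
    awk 'BEGIN {
        # %-15s  : label, left justified in a 15 character column
        # %10.2f : amount, right justified in 10 characters with 2 decimals
        printf "%-15s%10.2f\n", "Checks :", 9478.67
        printf "%-15s%10.2f\n", "Insurance", 3285
        printf "%-15s%10.2f\n", "Gas", 110.27
    }'
}
format_demo
```

The minus sign in %-15s is what left justifies; without it the label would be pushed to the right edge of its column.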


    Most if not all of the output will be done after all data is processed, so most of the print statements will appear in an END action block.

    10 points What your output should look like. Make it look nice, but you can change titles and spacing if you want.

    Checks and debits
    Checks :          9478.67
    Debits :          9867.22
    
    Expenses by category
    Insurance         3285.00
    Auto               219.23
    Utility           5189.52
    Health             455.02
    Food              7492.91
    Gas                110.27
    Books              170.90
    Computer            37.24
    Utiity             157.76
    Home              2228.04
    
    Expenses by month
    January           2749.11
    February          1387.66
    March             2054.58
    April             1409.43
    May               1269.69
    June              2413.25
    July              2091.96
    August            2303.77
    September         1269.73
    October           1110.81
    November          1085.27
    December           200.63
    

    Extra credit 5 points If you want to try sorting the category section, feel free to do so. I covered an example of how to use an indexed array of key values to sort and access an associative array in sorted order.
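
One way this could look, as a sketch only and not necessarily the example covered in class: copy the keys into an indexed array, sort that array with a simple insertion sort, then walk it in order. The third input record (Auto) is made up for illustration, and the names "bycat" and "keys" are assumptions.

```sh
# Hypothetical wrapper; the Auto record is invented so there are three keys to sort.
sorted_categories() {
    printf '%s\n' \
        ",January,2014-01-06,Food,Aldis,26.76" \
        "3048,January,2014-01-06,Home,Bob's Heating,468" \
        ",February,2014-02-03,Auto,Oil Change,40" |
    awk 'BEGIN { FS = "," }
        { bycat[$4] += $6 }
        END {
            # copy the category names into keys[1] .. keys[n]
            n = 0
            for (c in bycat)
                keys[++n] = c
            # insertion sort of the key strings, smallest first
            for (i = 2; i <= n; i++) {
                k = keys[i]
                for (j = i - 1; j >= 1 && keys[j] > k; j--)
                    keys[j + 1] = keys[j]
                keys[j + 1] = k
            }
            # the sorted keys now select elements of the associative array
            for (i = 1; i <= n; i++)
                printf "%-15s%10.2f\n", keys[i], bycat[keys[i]]
        }'
}
sorted_categories
```

This sticks to plain awk features; some awk versions also offer built-in sorting functions, which are not assumed here.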


    Tricks :

    Google awk split, awk sprintf, and awk substring for a number of sites with useful examples of these functions.

    Use split to create an array of month names. If you do them in chronological order, they will be indexed in that order.

    Use substr to parse out the "2 digit" month number. This will yield 01, 02, etc., which are strings and not numbers, so they are close but not exactly what is needed. If you cut carefully, you should get just the 2 digits.

    Use sprintf to convert the string number to a real number (01 becomes 1). sprintf works like printf except the output can be assigned to a variable rather than going to the screen.

    num = sprintf( "%d", strnum );

    If you take the substring you cut from the date field and run it through sprintf, you will get a real number which can be used to index the monthly cost accumulation.

    You can now use a straight index from 1 to 12 to get both the month's name from the month name array and the costs for that month from the monthly cost array with the same index.
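
Putting the tricks together, a hedged sketch on the three sample records might read as follows. The array names "names" and "total", the variables "mm" and "m", and the wrapper function are all illustrative, and months with no records are printed as 0.00.

```sh
# Hypothetical wrapper; sample records are piped in instead of reading bank.data.
monthly_by_date() {
    printf '%s\n' \
        ",January,2014-01-06,Food,Aldis,26.76" \
        "3048,January,2014-01-06,Home,Bob's Heating,468" \
        "3088,March,2014-03-06,Home,Bob's Heating,146.17" |
    awk 'BEGIN {
            FS = ","
            split("January,February,March,April,May,June,July,August,September,October,November,December", names, ",")
        }
        {
            mm = substr($3, 6, 2)      # characters 6 and 7 of yyyy-mm-dd, e.g. "01"
            m  = sprintf("%d", mm)     # "01" becomes 1, matching the split() index
            total[m] += $6
        }
        END {
            # the same index picks the name and the accumulated amount
            for (i = 1; i <= 12; i++)
                printf "%-15s%10.2f\n", names[i], total[i]
        }'
}
monthly_by_date
```

The leading zero matters because "01" and 1 are different array subscripts in awk, so the cut string has to be converted before it can share an index with the month name array.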