Scripts and Utilities -- Awk lab


  • Jim Plank
  • Due: 3:30 PM, Tuesday, July 8.
  • This file: http://www.cs.utk.edu/~plank/plank/classes/cs494/494/labs/Awk/lab.html
  • Awk lecture notes
  • Guest TA: Jim Plank (plank@cs.utk.edu).
    Write the following programs as shell or awk scripts. When you're done, bundle them all up with shar and mail them to the Guest TA. In your shell scripts, you may only use the following: Make sure that they get included in the shar file. You may not write any C programs to help yourself out, nor are you allowed to use any other Unix programs (like perl) in your scripts.

    Finally, your program should handle errors gracefully. For example, if the command line arguments are supposed to be files, and the user specifies a non-file, you should print a descriptive error message on standard error, and either exit or not exit as you see fit.
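As a rough illustration of the error handling described above, here is a minimal sketch of a helper that checks file arguments before any processing starts. The name `check_files` and the wording of the message are assumptions made for this example; they are not part of the assignment.

```shell
#!/bin/sh
# check_files (hypothetical helper): verify that every argument is a regular
# file. Problems are reported on standard error, and the function returns
# non-zero so the caller can decide whether to exit or carry on.
check_files() {
    status=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "check_files: $f: not a regular file" >&2
            status=1
        fi
    done
    return $status
}

# Possible use at the top of a script:
#   check_files "$@" || exit 1
```

Checking all arguments before exiting reports every bad file in one run, which is usually friendlier than stopping at the first one.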

    1. square: This script takes a number as its command line argument. Call this number n. The script prints out lines of the form ``i i*i'' for each i from zero to n. This should work substantially faster than square from the shell lab.
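One way to approach square is to run the whole loop inside a single awk process, which avoids starting a new process per iteration. The sketch below wraps it in a shell function so it can be tried interactively; the argument check and the usage message are assumptions for illustration.

```shell
#!/bin/sh
# square: print "i i*i" for each i from 0 to n, looping inside one awk process.
square() {
    # Reject a missing or non-numeric argument with a message on standard error.
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: square n (n must be a non-negative integer)" >&2
            return 1 ;;
    esac
    awk -v n="$1" 'BEGIN { for (i = 0; i <= n; i++) print i, i * i }'
}

square 3
# prints:
# 0 0
# 1 1
# 2 4
# 3 9
```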

    2. flip12: For each line of standard input, this script prints out each field separated by a space. If the line has more than two fields, it reverses fields one and two.
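A minimal sketch of flip12, following the wording above literally (fields are swapped only when a line has more than two of them). Assigning a field to itself is a standard awk idiom that makes awk rebuild the line with single spaces between fields.

```shell
#!/bin/sh
# flip12: print each input line with fields separated by single spaces,
# swapping fields one and two when the line has more than two fields.
flip12() {
    awk '{
        if (NF > 2) { tmp = $1; $1 = $2; $2 = tmp }  # swap fields 1 and 2
        $1 = $1   # force awk to rebuild $0 using the output field separator
        print
    }'
}

printf 'a  b   c\nx     y\n' | flip12
# prints:
# b a c
# x y
```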

    3. calcavg is called as follows:
      calcavg [ -c n ] [ files ]
      
      If files are specified, then calcavg works on all files. If no files are specified, then calcavg works on standard input. Without the -c option, calcavg treats every word of its input as a number, and prints out the average of all the numbers. If -c is specified, then it only averages numbers in column n (where 1 is the first column, unlike sort). If there are fewer than n columns on a line, then that line is ignored. If multiple files are specified, then calcavg calculates one average for all the files, not one average per file.
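The behaviour described above could be sketched roughly as follows. This is only one possible shape: the error messages, and the choice to stop at the first unreadable file, are assumptions made for the example. Because awk reads all of the files named on its command line as one stream, a single sum and count give one average across every file.

```shell
#!/bin/sh
# calcavg [ -c n ] [ files ]: average every word, or only column n with -c.
calcavg() {
    col=0                                  # 0 means "average every word"
    if [ "$1" = "-c" ]; then
        case "$2" in
            ''|*[!0-9]*)
                echo "calcavg: -c needs a positive column number" >&2
                return 1 ;;
        esac
        if [ "$2" -lt 1 ]; then
            echo "calcavg: -c needs a positive column number" >&2
            return 1
        fi
        col=$2
        shift 2
    fi
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "calcavg: $f: not a file" >&2
            return 1
        fi
    done
    # With no file arguments, awk reads standard input.
    awk -v col="$col" '
        col == 0            { for (i = 1; i <= NF; i++) { sum += $i; n++ } }
        col > 0 && NF >= col { sum += $col; n++ }   # short lines are ignored
        END {
            if (n > 0) print sum / n
            else print "calcavg: no numbers found" > "/dev/stderr"
        }
    ' "$@"
}

printf '10 1\n20\n30 3\n' | calcavg -c 2
# prints 2 (the line "20" has no second column and is skipped)
```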

    4. golf: This takes a golf score file on standard input, where each line is of the form:
      golfer-name -- score
      
      Golfer names can be any number of words. The score file should not be assumed to be sorted. If the score is ``missed cut'' then the golfer did not finish the tournament. Examples of score files are usopen and masters.

      Your job is to format the output better. You need to print one line per golfer, where each line is: the golfer's name padded to 25 spaces, the golfer's score (use ``missed'' if the golfer missed the cut), and the number of strokes above the best golfer's score. If a golfer missed the cut, then his last column should be the maximum number of strokes above the best golfer's score, plus one. (Note that a low score is good in golf.) Finally, the output should be sorted by score, with the cut missers printed last.

      See the example program for how the output should look.

      Here's a hint, for those who want one.
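One possible approach to golf, not necessarily the one the hint describes, is sketched below. It assumes that scores are integer stroke totals and that a missed cut appears as ``missed cut'' after the ``--'' separator. The column widths after the 25-character name are guesses, so compare against the example program before relying on them. Because the scores are integers, stepping from the best score to the worst produces sorted output without calling an external sort.

```shell
#!/bin/sh
# golf: reformat "golfer-name -- score" lines read from standard input.
golf() {
    awk '
    {
        # Find the "--" separator; the fields before it form the name.
        sep = 0
        for (i = 1; i <= NF; i++) if ($i == "--") sep = i
        if (sep < 2 || sep == NF) {
            print "golf: skipping malformed line " NR > "/dev/stderr"
            next
        }
        n++
        name[n] = $1
        for (i = 2; i < sep; i++) name[n] = name[n] " " $i
        if ($(sep + 1) == "missed") {
            missed[n] = 1
        } else {
            score[n] = $(sep + 1) + 0      # ASSUMPTION: integer stroke total
            if (!seen || score[n] < best)  best = score[n]
            if (!seen || score[n] > worst) worst = score[n]
            seen = 1
        }
    }
    END {
        # Walk the score range from best to worst, printing matching golfers.
        for (s = best; s <= worst; s++)
            for (i = 1; i <= n; i++)
                if (!(i in missed) && score[i] == s)
                    printf "%-25s %6d %3d\n", name[i], s, s - best
        # Cut missers go last, one stroke beyond the worst finisher.
        for (i = 1; i <= n; i++)
            if (i in missed)
                printf "%-25s %6s %3d\n", name[i], "missed", worst - best + 1
    }'
}

printf 'A B -- 272\nC -- missed cut\nD E F -- 270\n' | golf
```

The nested loop costs time proportional to the score range times the number of golfers, which is fine for tournament-sized input.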

    5. ckpproc is a program that processes data from a checkpointing program. Look at the files BASE, INC, OCD1024, OCD2048, SCD1024, SCD2048, SCD4096 and SEQ. These are output files from a checkpointer. What the checkpointer does is force an application to save its state in ``checkpoint'' files periodically. The output file has lots of information in it, but it is basically unreadable. Your job is to write ckpproc and make it readable. Specifically, you need to print out, given a file:

      • The number of times the application was executed.
      • The average running time of the application in seconds.
      • The average number of checkpoints per test.
      • The average checkpoint size in megabytes (note that a megabyte is 1024*1024 bytes, not 1,000,000 bytes).

      The relevant lines in an input file are:

      • A line beginning with ``T0:''. These lines are generated before and after the application is executed, and each contains the time (number of seconds since Jan 1, 1970) when the line was generated. Thus, each pair of these lines can be used to determine the application's running time.
      • A line beginning with ``BYTES_WRITTEN''. One of these lines is generated whenever a checkpoint is taken. The second word on the line is the checkpoint size in bytes.

      That's all you need. Go to it, and make sure that it works for all of the above input files. Try the example to see what the output should look like.
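The bookkeeping for ckpproc might be sketched as follows. The description above does not say where on a ``T0:'' line the timestamp sits, so this sketch assumes it is the second whitespace-separated field; check that against the real input files. The output labels are also made up for illustration, so use the example program as the reference for the actual format.

```shell
#!/bin/sh
# ckpproc file: summarize a checkpointer output file.
ckpproc() {
    if [ ! -f "$1" ]; then
        echo "ckpproc: $1: not a file" >&2
        return 1
    fi
    awk '
    /^T0:/ {
        t = $2                  # ASSUMPTION: the timestamp is the second field
        if (started) {          # second line of a pair: one run has finished
            total += t - start
            runs++
            started = 0
        } else {                # first line of a pair: a run is starting
            start = t
            started = 1
        }
    }
    /^BYTES_WRITTEN/ { ckpts++; bytes += $2 }   # second word is size in bytes
    END {
        if (runs == 0) {
            print "ckpproc: no complete runs found" > "/dev/stderr"
            exit 1
        }
        mb = (ckpts > 0) ? bytes / ckpts / (1024 * 1024) : 0
        printf "Runs: %d\n", runs
        printf "Average running time (s): %.2f\n", total / runs
        printf "Average checkpoints per run: %.2f\n", ckpts / runs
        printf "Average checkpoint size (MB): %.2f\n", mb
    }' "$1"
}

# Possible use:
#   ckpproc BASE
```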


    Working Examples

    Working examples are in the directory /home/cs494/labs/Awk/programs. You can run these only on kenner.