Scripts and Utilities -- Cat/sort/at/grep/find lab


  • cs300
  • Due: Before class on Friday, September 30.
  • Cat lecture notes
    Write the following shell scripts, and use the submit script to submit them. In your scripts, you may only use the following: You may not write any C programs to help yourself out, nor may you use any other Unix programs (such as sed or perl) in your scripts.

    Finally, your program should handle errors gracefully. For example, if the command line arguments are supposed to be files, and the user specifies a non-file, you should print a descriptive error message on standard error, and either exit or not exit as you see fit.
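
One possible shape for the error check described above is sketched here. The helper name check_file is hypothetical, and the sketch assumes that the shell built-ins test ([) and echo are permitted in your scripts. It skips a bad argument and keeps going; exiting instead would be just as acceptable.

```shell
#!/bin/sh
# check_file is a hypothetical helper name, not part of the assignment.
# It reports a non-file argument on standard error and returns nonzero.
check_file() {
    if [ ! -f "$1" ]; then
        echo "$0: $1: not a regular file" >&2
        return 1
    fi
    return 0
}

# Skip bad arguments but keep processing the rest.
for f in "$@"; do
    check_file "$f" || continue
    echo "ok: $f"
done
```

Because the message goes to standard error, redirecting standard output to a file still leaves the error visible on the terminal.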

    Required programs:

    1. primwc: This expects filenames as its command line arguments. For each file, it prints out the filename, plus the number of lines in the file. For this program, you may not use wc.
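
As a hint at how lines can be counted without wc, here is a minimal sketch that relies on grep -c. The helper name count_lines is hypothetical, and checking the arguments is left out to keep the example short.

```shell
#!/bin/sh
# count_lines is a hypothetical helper name.
# grep -c prints the number of matching lines. The pattern '^' matches
# every line, including empty ones, so the count is the number of lines.
count_lines() {
    grep -c '^' "$1"
}

# Print each filename followed by its line count.
for f in "$@"; do
    echo "$f $(count_lines "$f")"
done
```

Note that grep exits with a nonzero status when the count is 0, so do not treat that status as an error for an empty file.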

    2. primspell. We define an alphaword to be a sequence of alphabetic characters surrounded by whitespace (where the beginning and end of a line count as whitespace). primspell accepts one file on the command line. primspell then prints to stdout all alphawords in the file that are not in the dictionary file /usr/share/dict/words. primspell will be grossly inefficient, but it should work.

      If you are having trouble starting this one, look at allalpha and allalpha2. Both of these print out all alphawords from standard input. Note that allalpha2 is much more efficient.

      It's ok to use a temporary file if you want here. Make sure it is deleted when the program is done. Usually I name my temporary files /tmp/$USER.$$.

      Make sure that your primspell doesn't accept substrings of words in /usr/share/dict/words. For example, the word "ll" should be flagged as a misspelling.
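
The sketch below shows one way to do the dictionary lookup so that substrings are rejected, together with a temporary file that is removed when the script ends. The names DICT, TMP, in_dict, and flag_unknown are hypothetical, and the sketch assumes that candidate words arrive one per line on standard input. It is a starting point, not a complete primspell.

```shell
#!/bin/sh
# Hypothetical sketch. DICT defaults to the dictionary named in the assignment.
DICT=${DICT:-/usr/share/dict/words}

# Temporary file named in the style the assignment suggests.
# The trap removes it whenever the script exits.
TMP=/tmp/${USER:-nobody}.$$
trap 'rm -f "$TMP"' EXIT

# in_dict WORD: succeed only if WORD equals an entire dictionary line.
# -x forces a whole-line match, so "ll" does not match inside "hello".
# -F treats WORD as a fixed string, and -q suppresses output.
in_dict() {
    grep -q -x -F -e "$1" "$DICT"
}

# flag_unknown: read one word per line from standard input and print
# each word that is not a dictionary entry.
flag_unknown() {
    while read -r w; do
        [ -n "$w" ] || continue
        in_dict "$w" || echo "$w"
    done
}
```

This runs grep once per word, which is the gross inefficiency the assignment expects.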

    3. clearout takes the name of a directory as its command line argument. It then deletes all 'core' files reachable from that directory, plus any file whose name starts with "junk" that is reachable from that directory. It prints out the names of all the deleted files and mails them to you. Then it sets up at so that clearout will be run again at 11:00 pm on the next day. (If today is Tuesday, then at will run on Wednesday at 11:00 pm.)
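
The steps of clearout could be arranged roughly as follows. This is only a sketch under several assumptions: the function names are hypothetical, rm and mail are taken to be permitted and installed, at is taken to accept the time "11pm tomorrow", $0 is taken to be an absolute path to the script, and the directory name is taken to contain no spaces.

```shell
#!/bin/sh
# Hypothetical sketch of the clearout steps; function names are illustrative.

# delete_junk DIR: print, then remove, every regular file under DIR that
# is named exactly "core" or whose name starts with "junk".
delete_junk() {
    find "$1" -type f \( -name core -o -name 'junk*' \) -print -exec rm -f {} \;
}

# reschedule DIR: queue this script to run again at 11:00 pm tomorrow.
# Assumes at(1) accepts "11pm tomorrow" and that $0 is an absolute path.
reschedule() {
    echo "$0 $1" | at 11pm tomorrow
}

if [ $# -eq 1 ] && [ -d "$1" ]; then
    LIST=/tmp/${USER:-nobody}.$$
    delete_junk "$1" > "$LIST"
    cat "$LIST"                                   # show the deleted names
    # Assumes mail(1) is available and that $USER is your login name.
    mail -s "clearout: deleted files" "$USER" < "$LIST"
    rm -f "$LIST"
    reschedule "$1"
fi
```

The parentheses in the find expression matter: without them, -type f would apply only to the first -name test.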

    The recommended optional programs:

    1. linehisto takes file names on the command line. For each file, it should print out:
      • The file name
      • The number of blank lines in the file.
      • The number of lines in the file that have between 1 and 10 characters.
      • The number of lines in the file that have between 11 and 80 characters.
      • The number of lines in the file that have over 80 characters.
      • A blank line.
      Blank lines are lines with no characters.

      For example:

      UNIX> linehisto /jade/homes/buckner/cs291/Cat/greptest
      /jade/homes/buckner/cs291/Cat/greptest
      Blank lines:                    0
      Lines with  1 to 10 characters: 3
      Lines with 11 to 80 characters: 2
      Lines with over 80 characters:  0
      
      UNIX>
      
      (Hint: you can call grep on the file more than once).
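
Following the hint, one grep -c call per category is enough. The sketch below uses interval expressions such as \{1,10\} from basic regular expressions; the function name linehisto_one is hypothetical, and argument checking is omitted.

```shell
#!/bin/sh
# linehisto_one is a hypothetical helper name.
# Each grep -c counts the lines whose whole length falls in one range.
linehisto_one() {
    echo "$1"
    echo "Blank lines:                    $(grep -c '^$' "$1")"
    echo "Lines with  1 to 10 characters: $(grep -c '^.\{1,10\}$' "$1")"
    echo "Lines with 11 to 80 characters: $(grep -c '^.\{11,80\}$' "$1")"
    echo "Lines with over 80 characters:  $(grep -c '^.\{81,\}$' "$1")"
    echo
}

for f in "$@"; do
    linehisto_one "$f"
done
```

Anchoring every pattern with both ^ and $ is what keeps a long line from also being counted in a shorter range.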

    2. primspell2 should work like primspell, but it should make sure that it does not emit any duplicates. (Gross hint: the output can be sorted).
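
To make the hint concrete: sort -u sorts its input and keeps only one copy of each line. A minimal sketch, with the hypothetical function name dedupe:

```shell
#!/bin/sh
# dedupe is a hypothetical helper name.
# sort -u sorts standard input and drops repeated lines.
dedupe() {
    sort -u
}
```

For example, piping the output of primspell through dedupe would list each misspelled word once, in sorted order.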

    3. nlarge takes an integer n as its command line argument. It then prints out the names and sizes of the n largest files reachable from the current directory. The files should be plain files, not directories. The list should be sorted by size -- largest first. Don't worry about differentiating hard links (i.e. if f1 and f2 are hard linked to each other, and this is the largest file reachable from the current directory, it's ok to print out both f1 and f2 as different files). Also, if there are ties, don't worry about it. In other words, if f1 and f2 are as above, and you say "nlarge 1", it's ok to print out just f1 or just f2.
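
One possible pipeline for nlarge is sketched below. It assumes that wc -c (used here only to report a file's size in bytes) and head are permitted, which the assignment does not confirm. The function name nlarge_sketch is hypothetical.

```shell
#!/bin/sh
# nlarge_sketch is a hypothetical helper name.
# find lists plain files, wc -c prints "size name" for each one,
# sort -n -r orders them largest first, and head keeps the first n.
nlarge_sketch() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: nlarge n" >&2
            return 1
            ;;
    esac
    find . -type f -exec wc -c {} \; | sort -n -r | head -n "$1"
}

if [ $# -eq 1 ]; then
    nlarge_sketch "$1"
fi
```

Running wc once per file is slow on large trees, but it keeps the output to two simple columns that sort can order directly.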