CS140 -- Lab A

Your job is to write the program treebuilder. It takes three arguments:

treebuilder file PRE|POST hash-table-size

The file argument is the name of a file that specifies a tree. Each line of the file must be one of three types:

  1. No words.
  2. The word "PARENT" followed by two words. This specifies that the first word is a parent of the second.
  3. The word "CHILD" followed by two words. This specifies that the first word is a child of the second.
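
Here is a minimal sketch of how one input line could be classified. It is not the lab solution's code. The names classify_line and LINE_* are made up for illustration, and it assumes that words are separated by whitespace and are at most 255 characters long. Treating a line with extra trailing words as a format error is also an assumption; check the executable for the real behavior.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum line_type { LINE_EMPTY, LINE_PARENT, LINE_CHILD, LINE_BAD };

/* Classify one line.  On LINE_PARENT or LINE_CHILD, the two names are
   copied into parent and child, so the caller does not have to care
   which keyword was used.  Both buffers must hold at least 256 bytes. */
enum line_type classify_line(const char *line, char *parent, char *child)
{
  char key[256], w1[256], w2[256], extra[256];
  int n;

  n = sscanf(line, "%255s %255s %255s %255s", key, w1, w2, extra);
  if (n <= 0) return LINE_EMPTY;   /* no words on the line */
  if (n != 3) return LINE_BAD;     /* assumed: keyword plus exactly two words */

  if (strcmp(key, "PARENT") == 0) {
    strcpy(parent, w1);            /* first word is the parent */
    strcpy(child, w2);
    return LINE_PARENT;
  }
  if (strcmp(key, "CHILD") == 0) {
    strcpy(parent, w2);            /* first word is the child */
    strcpy(child, w1);
    return LINE_CHILD;
  }
  return LINE_BAD;                 /* unknown keyword */
}
```

Normalizing both line types to a (parent, child) pair means the rest of the program only has to handle one kind of relationship.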

Your job is to construct the tree that is defined by the file, then print it out using either a preorder traversal (if the second argument is PRE) or a postorder traversal (if the second argument is POST).

If the file does not specify a legal tree, your program must flag that. There are four cases that your program should find:

  1. Line in the wrong format.
  2. Child has multiple parents.
  3. Tree has multiple roots.
  4. Tree has no roots.

It is ok for a parent-child relationship to be specified multiple times, as long as each specification is equivalent.

Note that a node may have any number of children. You should print out siblings in the order in which they were specified as children. For example, the test file Test-03.txt is as follows:

CHILD Jim Dave 
PARENT Dave Terry
PARENT Dave Cindy

This is a 4-node tree, with "Dave" at the root. "Dave" has three children -- "Jim", "Terry", and "Cindy". They should be printed in that order.

Each line of output should be indented by a number of spaces equal to the level of the node in the tree, where the root is at level zero.
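
The two traversals and the indentation rule can be sketched as below. This is only an illustration: the struct tnode layout, the MAX_KIDS limit, and the print_tree name are assumptions made to keep the example short (the lab allows any number of children), and the executable remains the authority on the exact output format.

```c
#include <stdio.h>

#define MAX_KIDS 8   /* illustration only; a real node needs a growable list */

/* Hypothetical node layout for this sketch. */
struct tnode {
  const char *name;
  int nkids;
  struct tnode *kids[MAX_KIDS];   /* in the order the children were specified */
};

/* Print the subtree rooted at n, one name per line, indented by level
   spaces.  A nonzero pre selects preorder; zero selects postorder. */
void print_tree(const struct tnode *n, int level, int pre, FILE *out)
{
  int i;

  if (pre) fprintf(out, "%*s%s\n", level, "", n->name);    /* node before children */
  for (i = 0; i < n->nkids; i++) {
    print_tree(n->kids[i], level + 1, pre, out);
  }
  if (!pre) fprintf(out, "%*s%s\n", level, "", n->name);   /* node after children */
}
```

For the Test-03.txt tree, calling print_tree(root, 0, 1, stdout) on this sketch prints "Dave" with no indentation, followed by "Jim", "Terry", and "Cindy", each indented by one space. With pre set to zero, the three children come first and "Dave" comes last.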

As always, when you have questions about what your program should do, consult the executable in the lab directory.

Your program will need to employ a hash table. Why? Because when you receive a name, you need to check to see if a node has already been created for that name. Use separate chaining, and one of the good hash functions from the hash lecture.
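
As a rough sketch of the hashing step: the function below is the well-known djb2 string hash, used here only as a stand-in, since the hash lecture's functions are not reproduced in this writeup. The names djb2_hash and bucket_of are made up for the example. The table size would come from the third command-line argument and must be greater than zero.

```c
#include <stddef.h>

/* djb2 string hash: start at 5381, then h = h * 33 + c for each character.
   This is a stand-in, not necessarily one of the lecture's functions. */
unsigned long djb2_hash(const char *s)
{
  unsigned long h = 5381;

  while (*s != '\0') {
    h = (h << 5) + h + (unsigned char) *s;   /* (h << 5) + h is h * 33 */
    s++;
  }
  return h;
}

/* Map a name to one of table_size chains.  table_size must be > 0. */
size_t bucket_of(const char *name, size_t table_size)
{
  return (size_t) (djb2_hash(name) % table_size);
}
```

With separate chaining, every name that lands in the same bucket goes on that bucket's list, so a lookup hashes the name once and then compares strings along a single chain.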


Test files


Hints

In my program, each node was a struct containing the following fields:

I had two main data structures -- the hash table, each entry of which contained a dllist of nodes, and a dllist that contained all of the nodes. When I read a PARENT or CHILD line, I would first call find_node() on each of the two names. find_node() looked up a name in the hash table. If the name was found, it returned the node associated with that name. If not, it created a new node for that name with a NULL parent and an empty list of children, inserted that node into the hash table and onto the dllist containing all of the nodes, and then returned that node.
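
The lookup-or-create behavior described above might look roughly like the following. This is a guess at the shape, not my actual code: the field names in struct node and struct table are inferred from the description, plain linked pointers stand in for dllists so that the example compiles on its own, and hash_name is the djb2 hash used as a placeholder.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout; intrusive links stand in for dllists. */
struct node {
  char *name;
  struct node *parent;        /* NULL until a parent is assigned */
  struct node *first_child;   /* children in the order specified */
  struct node *last_child;
  struct node *next_sibling;
  struct node *chain_next;    /* next node in the same hash bucket */
  struct node *all_next;      /* next node on the list of all nodes */
};

struct table {
  size_t size;                /* number of buckets, must be > 0 */
  struct node **buckets;      /* size chain heads, all initially NULL */
  struct node *all_head;      /* every node, in creation order */
  struct node *all_tail;
};

/* Placeholder hash (djb2). */
static unsigned long hash_name(const char *s)
{
  unsigned long h = 5381;
  while (*s != '\0') h = h * 33 + (unsigned char) *s++;
  return h;
}

/* Return the node for name, creating it the first time the name is seen. */
struct node *find_node(struct table *t, const char *name)
{
  size_t b = (size_t) (hash_name(name) % t->size);
  struct node *n;

  for (n = t->buckets[b]; n != NULL; n = n->chain_next) {
    if (strcmp(n->name, name) == 0) return n;     /* already known */
  }

  n = malloc(sizeof *n);
  if (n == NULL) { perror("malloc"); exit(1); }
  *n = (struct node) {0};                         /* every pointer field is NULL */
  n->name = malloc(strlen(name) + 1);
  if (n->name == NULL) { perror("malloc"); exit(1); }
  strcpy(n->name, name);

  n->chain_next = t->buckets[b];                  /* push onto the bucket's chain */
  t->buckets[b] = n;

  if (t->all_tail != NULL) t->all_tail->all_next = n;   /* append to all-nodes list */
  else t->all_head = n;
  t->all_tail = n;
  return n;
}
```

Because find_node() never fails to return a node, the code that handles PARENT and CHILD lines can treat new and previously seen names the same way.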

Once find_node() was called for both names, my code would error check to see if there was a conflict with the child's parent. If not, it would either do nothing (the parent/child relationship was already specified), or go ahead and set the parent link of the child, and append the child to the parent's list of children.
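
A sketch of that check-then-link step, under the same caveats: the struct node fields and the link_nodes name are assumptions for illustration, and sibling pointers are used in place of a dllist of children.

```c
#include <stddef.h>

/* Assumed layout, trimmed to the fields this step needs. */
struct node {
  const char *name;
  struct node *parent;        /* NULL if no parent has been specified yet */
  struct node *first_child;   /* children in the order specified */
  struct node *last_child;
  struct node *next_sibling;
};

/* Record that parent is the parent of child.  Returns 0 on success,
   including a repeated but equivalent specification, and -1 if child
   already has a different parent. */
int link_nodes(struct node *parent, struct node *child)
{
  if (child->parent == parent) return 0;    /* already specified: do nothing */
  if (child->parent != NULL) return -1;     /* conflict: multiple parents */

  child->parent = parent;
  child->next_sibling = NULL;
  if (parent->last_child != NULL) parent->last_child->next_sibling = child;
  else parent->first_child = child;         /* first child of this parent */
  parent->last_child = child;               /* append keeps specification order */
  return 0;
}
```

Appending at the tail is what keeps siblings in the order in which they were specified, which the traversal then relies on.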

Once all of the input was read, my program used the dllist of all nodes to find the root of the tree (or to detect an error due to multiple roots or no roots). Then it did the traversal. 142 lines in all (uncommented).
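
The root search can be sketched as a single pass over the list of all nodes, counting the ones with no parent. As before, struct node and count_roots are hypothetical names, and a plain all_next pointer stands in for the dllist.

```c
#include <stddef.h>

/* Assumed layout, trimmed to the fields this step needs. */
struct node {
  const char *name;
  struct node *parent;     /* NULL means this node is a root candidate */
  struct node *all_next;   /* next node on the list of all nodes */
};

/* Count the nodes that have no parent.  *root is set to the first such
   node, or to NULL if there is none. */
int count_roots(struct node *all_head, struct node **root)
{
  int nroots = 0;
  struct node *n;

  *root = NULL;
  for (n = all_head; n != NULL; n = n->all_next) {
    if (n->parent == NULL) {
      if (nroots == 0) *root = n;
      nroots++;
    }
  }
  return nroots;
}
```

A return value of 0 corresponds to the "no roots" error and a value greater than 1 to "multiple roots"; exactly 1 means *root is where the traversal starts. How an input file with no nodes at all should be reported is not covered here, so check the executable.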