CS360 Lecture notes -- Red-Black Trees #2

  • Jim Plank
  • Directory: /blugreen/homes/plank/cs360/notes/Rbtree-2
  • Lecture notes -- plain text: /blugreen/homes/plank/cs360/notes/Rbtree-2/lecture
  • Lecture notes -- html: http://www.cs.utk.edu/~plank/plank/classes/cs360/360/notes/Rbtree-2/lecture.html
    This lecture goes over more fun things you can do with red-black trees.

    sorti1.c

    First, suppose you want to implement "sort -n", which sorts lines of stdin as numbers, but breaks ties by comparing the lines as strings. In other words, if you have two lines "1 b" and "1 a", it will print the second before the first, because "1 a" is lexicographically less than "1 b". Try it out:

    UNIX> cat > f1
    1 b
    1 a
    0 Hmmm
    Jim
    Heather
    < CNTL-D >
    UNIX> sort -n f1 
    0 Hmmm
    Heather
    Jim
    1 a
    1 b
    UNIX>
    
    Note that the lines "Heather" and "Jim" are treated like they are zero when sorting as integers.

    So, this is more complex than before -- we need to do two levels of sorting -- first sorting lines as integers using atoi(), and then resolving collisions by sorting lines as strings using strcmp().

    There are two ways we can do this. The first is in sorti1.c, which uses two levels of trees. The first-level tree is a red-black tree that sorts lines by their integer value. In other words, it is keyed by atoi(s), and uses rb_inserti() to do the insertion. The v.val field of each first-level node is another red-black tree, which we call a second-level tree, that contains all the strings with that integer value, sorted lexicographically.

    In other words, for the file f1 above, the first-level tree will have two nodes -- one with key 0, and one with key 1. The node with key 0 will have a v.val field that is another red-black tree with three nodes: "0 Hmmm", "Heather", and "Jim". The node with key 1 will have a v.val field that is a red-black tree with 2 nodes: "1 a" and "1 b". When we go to print out the file, we traverse the first-level tree. On each node of that tree, we traverse the second level tree and print out the string, which is in the k.key field.

    Note that processing a line is a little more complex. First, you look for atoi(s) in the first-level tree. If it is not found, you create a node for it whose v.val field is a new rb-tree. Then you insert the string into this second-level tree.
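
    The find-or-create step described above can be sketched as follows. This is only an illustration of the two-level idea, not the code in sorti1.c: to keep the sketch compilable without the rb library, sorted linked lists stand in for both levels of red-black trees, and the names IntNode, LineNode, find_or_create(), insert_line() and print_sorted() are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Second-level entry: one input line, kept in strcmp() order. */
typedef struct line_node {
    char *line;
    struct line_node *next;
} LineNode;

/* First-level entry: one integer key plus all lines with that key.
   In the lecture this role is played by a red-black tree node whose
   v.val field is another red-black tree; a list is a stand-in here. */
typedef struct int_node {
    int key;
    LineNode *lines;
    struct int_node *next;
} IntNode;

/* Allocate or exit: keeps the sketch short. */
static void *xmalloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL) { perror("malloc"); exit(1); }
    return p;
}

/* Find the first-level node for key, creating it (with an empty
   second level) in sorted position if it is not there yet. */
IntNode *find_or_create(IntNode **head, int key)
{
    IntNode **p = head;
    while (*p != NULL && (*p)->key < key) p = &(*p)->next;
    if (*p != NULL && (*p)->key == key) return *p;

    IntNode *n = xmalloc(sizeof(IntNode));
    n->key = key;
    n->lines = NULL;
    n->next = *p;
    *p = n;
    return n;
}

/* Insert one line: first level keyed by atoi(s), second by strcmp(). */
void insert_line(IntNode **head, const char *s)
{
    IntNode *in = find_or_create(head, atoi(s));
    LineNode **p = &in->lines;
    while (*p != NULL && strcmp((*p)->line, s) <= 0) p = &(*p)->next;

    LineNode *n = xmalloc(sizeof(LineNode));
    n->line = xmalloc(strlen(s) + 1);
    strcpy(n->line, s);
    n->next = *p;
    *p = n;
}

/* Traverse the first level; on each node, traverse its second level. */
void print_sorted(const IntNode *head, FILE *out)
{
    for (; head != NULL; head = head->next)
        for (const LineNode *l = head->lines; l != NULL; l = l->next)
            fprintf(out, "%s\n", l->line);
}
```

    Starting from an empty IntNode pointer, calling insert_line() once per input line and then print_sorted() on stdout visits the lines in the same two-level order as the nested tree traversal. With real red-black trees each lookup and insertion is logarithmic rather than linear.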

    Try this out on the file randfile. Does it work correctly? See how the output differs from mysorti.c in the previous lecture.


    sorti2.c -- passing a function to rb_insertg()

    The second way to implement "sort -n" is to use just one tree, but define a different comparison function. When we use rb_insert(), strings are inserted into the tree using strcmp() as the comparison function. When we use rb_inserti(), integers are inserted into the tree using standard inequality (<, >, =) for the comparison. There is a third function, rb_insertg(), which lets you pass a comparison function as an argument; that function is then used to perform the insertion.

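    Specifically, the function must take two arguments (char *k1, char *k2) and return an integer. (Strictly speaking, k1 and k2 should be (void *)'s; in other words, they are just pointers.) Like strcmp(), it returns:

      • a negative integer if k1 is less than k2
      • a positive integer if k1 is greater than k2
      • zero if k1 is equal to k2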

    For this program, we write a comparison function atoicmp() which compares two strings using atoi(), and if atoi() says that they are equal, then it uses strcmp(). With this mode of insertion, we only need one tree, and thus just have to do a simple tree traversal to print out the sorted file. The code is in sorti2.c. rb_find_gkey_n() works with rb_insertg() just like rb_find_ikey_n() works with rb_inserti(), and rb_find_key_n() works with rb_insert().
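
    A comparison function of this kind might look like the sketch below. It is not necessarily the atoicmp() in sorti2.c; it only follows the description above. So that it compiles without the rb library, it is exercised through the standard qsort() instead of rb_insertg(); the adapter atoicmp_qsort() and the helper sort_lines() are names invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Compare two lines as integers with atoi(); if the integers are
   equal, fall back to strcmp(). Returns a negative, zero, or
   positive value, as a general comparison function is expected to. */
int atoicmp(const char *k1, const char *k2)
{
    int i1 = atoi(k1);
    int i2 = atoi(k2);

    /* Explicit comparison avoids the overflow that i1 - i2 could cause. */
    if (i1 != i2) return (i1 < i2) ? -1 : 1;
    return strcmp(k1, k2);
}

/* qsort() passes pointers to the array elements, and each element
   is itself a char *, so unwrap one level before comparing. */
static int atoicmp_qsort(const void *a, const void *b)
{
    const char *s1 = *(const char * const *)a;
    const char *s2 = *(const char * const *)b;
    return atoicmp(s1, s2);
}

/* Sort an array of n line pointers in "sort -n" order. */
void sort_lines(char **lines, size_t n)
{
    qsort(lines, n, sizeof(char *), atoicmp_qsort);
}
```

    The point is the same in both settings: one comparison function that encodes both sorting levels lets a single ordered structure do all the work.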

    read_roster.c

    Look at read_roster.c. It goes through the following steps. First, it opens the roster file, and makes a new red-black tree. Next, it reads in each line of the roster file, which is in the following format: last name, first name, height, position, year, team, and home town. After reading in each line, it inserts a new node into the rb-tree, keyed on the last name, and containing the struct with all of this information in the v.val field.

    Once the file is read in, it prompts for a last name. When one is entered, it is looked up in the rb-tree. If found, the roster entry for that player is printed out. If not found, an error message is printed. This shows how to use an rb-tree to perform logarithmic-time searching. Try it out to see how it works.
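
    The lookup step can be illustrated with the sketch below. It is not the code in read_roster.c: the Player struct, its field types, and the functions sort_roster() and find_player() are assumptions made for this example, and a sorted array searched with the standard bsearch() stands in for the rb-tree. Both give logarithmic-time lookup by last name.

```c
#include <stdlib.h>
#include <string.h>

/* One roster entry, with the fields listed in the text.
   Storing every field as a string is an assumption of this sketch. */
typedef struct {
    const char *lname;
    const char *fname;
    const char *height;
    const char *position;
    const char *year;
    const char *team;
    const char *hometown;
} Player;

/* Order entries by last name, as the rb-tree key does. */
static int cmp_lname(const void *a, const void *b)
{
    const Player *p1 = a;
    const Player *p2 = b;
    return strcmp(p1->lname, p2->lname);
}

/* Put the roster in last-name order so it can be binary searched. */
void sort_roster(Player *roster, size_t n)
{
    qsort(roster, n, sizeof(Player), cmp_lname);
}

/* Look up a last name in a sorted roster. Returns the matching entry,
   or NULL if there is none (the caller prints the error message). */
const Player *find_player(const Player *roster, size_t n, const char *lname)
{
    Player key;
    memset(&key, 0, sizeof key);
    key.lname = lname;
    return bsearch(&key, roster, n, sizeof(Player), cmp_lname);
}
```

    A caller would fill an array of Player entries from the roster file, call sort_roster() once, and then call find_player() for each last name typed at the prompt. Unlike the rb-tree, the array has to be re-sorted if entries are added later.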