Designing, Building, and Testing a Parallel Processor
Spring, 2004



We're going to create something like MPI and PVM (Message Passing Interface and Parallel Virtual Machine). We want to use existing hardware: the Computer Science Department's workstations and the network that ties them together. We'll connect processes on workstations together using sockets, so that, say, 5-6 workstations can share the processing load--just as MPI and PVM do. We'll be considering different topologies, and looking at scalability and both static and dynamic load balancing. Students will work in small groups. For testing, the emphasis will be on non-numeric parallel algorithms--sorting, graph algorithms, etc. The overall goal is to go through a design and testing process and build a working parallel machine (though the complexity level will be much less than MPI). Prerequisite: CS 360--threads and sockets.

Texts:
Grama, Gupta, Karypis, and Kumar: Introduction to Parallel Computing 2nd ed. (AW) ......this text does a nice job on basic considerations, parallel algorithm design, message passing, graph & sorting algorithms, etc.
Gropp, Lusk, & Skjellum: Using MPI (MIT Press)

overall:  we're assuming that mpi/pvm don't exist, and students
will work in small groups (2-3) writing software that implements
a distributed-memory message-passing parallel prcessor on a number
(e.g. 4+) of our sun workstations.  we'll look at scalability,
reliability, checkpointing, topologies, tcp vs udp vs rudp, etc.
depending on how many of the students have taken my unix network
programming class (and have written extensive socket code) the
students will either: a) implement the socket calls connecting
the machines or b) use VNET (they have config files for vnet
specifying how each machine is connected, and then pass packets
to vnet and specify the destination).  students will be using
posix pthreads.
so

stage 1: basic topology and scalability issues
stage 2: writing socket code or using VNET to set up the underlying
         connection mechanism.
stage 3: distribution of code to all hosts
stage 4: message-passing paradigms:  implementation of some basic
         library functions analagous to mpi_send and mpi_recv, but
         necessarily simpler.
stage 5: more elaborate functions for the libraries.
stage 6: static and dynamic load balance issues.
stage 7: some (primarily) non-numeric parallel algorithms and their
         implementation--some examples in grama, gibbons & rytter,
         tarjan, etc.
stage 8: comparisons and benchmarking difefrent groups' results,
         also (hopefully) comparing results to MPI.
stage 9: ??
overall again--this will be programming-intensive, and will combine
elements of architecture, networking, parallel algorithms, etc.
i would like to have standard message-passing functions--so that
code which runs on group A's machines will also run on group B's.

for my unix network programming class, we cannot reinvent the whole
TCP/IP stack in one semester, but nonetheless, by the end of the
semester the students have encountered many of the real-world
problems and issues that the actual TCP/IP suite faces, and the
students have had to solve many of these same problems.  in a similar
vein, i want the students to face design issues that the people who
built pvm and mpi faced.  writing parallel algorithms for pvm or mpi
is one thing, but having to go through a design process for building
an mpi-like system is something else altogether.