CS 594
Understanding Parallel Architectures:
From Theory To Practice
Jack Dongarra, Professor, Spring 2002, 3 credits
This course aims to give students a deep knowledge of the techniques
and tools needed to understand today's and tomorrow's high-performance
computers and to program them efficiently. It mixes theoretical and
practical material.
Today's high-performance computers range from expensive, highly
parallel distributed-memory platforms down to cheap local networks of
standard workstations. But the problems associated with software
development are the same on all architectures: the user needs to recast
his or her algorithm or application in terms of parallel entities
(tasks, processes, threads, and so on) that will execute concurrently.
Parallelism is difficult to detect automatically because of data
dependencies.
In many cases, one needs to perform some form of
algorithm restructuring
to expose the parallelism.
Finally, realizing the restructured algorithm in software on a
specific architecture may be quite complicated.
Fortunately, there are well-established techniques and tools to help:
portable computation libraries such as ScaLAPACK, portable
communication
libraries like MPI, general-purpose task systems such as PVM, or even
data-parallel languages like HPF.
These are the tools that the class targets.
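
To make the point about data dependencies and algorithm restructuring concrete, here is a minimal sketch in Python. The example (a prefix sum) and the function names are chosen for illustration only and are not taken from the course material. The sequential loop carries a dependency from one iteration to the next, so it cannot be split across processors as written. The restructured version performs about log2(n) steps, and within each step every element is computed only from the previous step's values, so all updates in a step could run concurrently.

```python
def prefix_sum_sequential(x):
    """Running sum with a loop-carried dependency: each iteration
    needs the total produced by the previous one."""
    out = []
    total = 0
    for v in x:
        total += v
        out.append(total)
    return out


def prefix_sum_scan(x):
    """Restructured prefix sum (a step-doubling scan).

    Each pass builds a new list from the old one only, so the n
    updates inside a pass are independent of each other.
    """
    s = list(x)
    n = len(s)
    d = 1
    while d < n:
        # Reads come from the old `s`; writes go to a fresh list.
        s = [s[i] + s[i - d] if i >= d else s[i] for i in range(n)]
        d *= 2
    return s
```

The restructured version does more total additions than the sequential loop. That trade of extra work for shorter dependency chains is typical of restructuring an algorithm to expose parallelism.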
In this course we plan to cover the nuts and bolts of developing
parallel applications. For instance, our study of PVM is paired with
the foundations of task graph scheduling, that of MPI with the
complexity analysis of collective communication operations such as
broadcasts, and that of HPF with data dependence analysis and
automatic parallelization techniques.
In addition, we will study performance evaluation and benchmarking on
today's high-performance computers.
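
As a small worked example of the kind of complexity analysis meant for broadcasts, assume a simple latency-bandwidth cost model. This model is an assumption made here for illustration, not something stated in the course description: sending a message of m words between two processors costs alpha + m*beta, where alpha is the startup latency and beta the per-word transfer time, and each processor sends one message at a time. Broadcasting from one root to p processors then costs roughly:

```latex
T_{\text{flat}}(p, m) = (p - 1)\,(\alpha + m\beta)
\qquad\text{versus}\qquad
T_{\text{tree}}(p, m) = \lceil \log_2 p \rceil\,(\alpha + m\beta)
```

The first expression corresponds to a root that sends to every other processor in turn. The second corresponds to a binomial tree, in which every processor that already holds the message forwards it in each round, so the number of holders doubles per round.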
Each lecture of the class will make use of simple examples borrowed
from numerical linear algebra. Matrix-matrix multiplication or LU
decomposition will provide the framework needed to illustrate the theory.
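
As an illustration of why matrix-matrix multiplication is a convenient running example, the following sketch (in Python, with hypothetical function names not taken from the course) treats each row of the result as an independent task. No row of the product depends on any other row, so the tasks can be handed to a pool of workers without synchronization between them.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply_row(row, b):
    """Compute one row of the product: row (length k) times b (k x n)."""
    inner = len(b)
    cols = len(b[0])
    return [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]


def matmul_parallel(a, b, workers=4):
    """Multiply a (m x k) by b (k x n), one task per row of a.

    pool.map preserves input order, so the rows of the result come
    back in the same order as the rows of a.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: multiply_row(row, b), a))
```

This sketch only shows the task decomposition. In standard CPython, threads running pure Python arithmetic do not execute simultaneously, so no speedup should be expected from it; on a real parallel platform the same decomposition would be expressed with tools such as those named above.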
The target audience is mainly computer science students. Students from
other disciplines are welcome, provided they have a computer science
background in the following areas: machine architecture, algorithm
design, elementary graph algorithms, and complexity analysis.
Grading will be based:
The tentative outline of the class is the following:
No single book covers the scope of the class, but a comprehensive document will be made available in PostScript for each lecture. A short bibliography is given below.