CS 594 - Applications of Parallel Computing

Assignment 5

Due March 3rd, 1999

 

For this assignment, we will code an optimized matrix-matrix multiply routine that is aware of the memory hierarchy. For simplicity, only square matrices are required (handling the non-square case, essential for good library code, can be time-consuming). The goals of the assignment are to:

 

Rewrite your matrix multiply to take advantage of the cache by doing sub-blocking.

Rewrite your matrix multiply using Strassen's method. Use the manufacturer's (vendor-supplied) version of DGEMM to perform the matrix multiplies you will need. Also compare the performance of your version of Strassen's matrix multiply with the ESSL version.
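As a starting point for the sub-blocking goal, here is a minimal sketch in C. It is not a tuned implementation. It assumes row-major storage and computes C += A * B one tile at a time, so that the tiles being worked on can stay in cache. The names blocked_dgemm and min_sz and the block size BLOCK are hypothetical. A good value for BLOCK depends on the cache of the target machine and should be found by experiment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block size (an assumption, not a tuned value): choose it
 * so that three BLOCK x BLOCK tiles of doubles fit in the cache. */
#define BLOCK 32

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* C += A * B for n x n row-major matrices, computed tile by tile.
 * The min_sz() calls handle values of n that are not a multiple of BLOCK. */
void blocked_dgemm(size_t n, const double *A, const double *B, double *C)
{
    assert(A != NULL && B != NULL && C != NULL);

    for (size_t ii = 0; ii < n; ii += BLOCK) {
        size_t i_end = min_sz(ii + BLOCK, n);
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            size_t k_end = min_sz(kk + BLOCK, n);
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t j_end = min_sz(jj + BLOCK, n);

                /* Multiply one tile of A by one tile of B and
                 * accumulate the result into one tile of C. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        double a = A[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
                }
            }
        }
    }
}
```

The caller zeroes C first if a plain product C = A * B is wanted. Timing this routine for several values of BLOCK against the unblocked triple loop is one way to see the effect of the cache.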
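For the Strassen goal, the sketch below shows a single level of Strassen's method in C, assuming row-major storage and an even matrix order n. It forms the seven half-size products and combines them into the four quadrants of C. The names strassen_one_level, combine, and base_gemm are hypothetical. In particular, base_gemm is only a plain triple-loop stand-in, and in your own code the vendor DGEMM would be called in its place. A full solution would also recurse and handle odd n.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in (assumption) for the vendor DGEMM: C = A * B, where all three
 * matrices are contiguous n x n row-major arrays. */
static void base_gemm(size_t n, const double *A, const double *B, double *C)
{
    memset(C, 0, n * n * sizeof *C);
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++) {
            double a = A[i * n + k];
            for (size_t j = 0; j < n; j++)
                C[i * n + j] += a * B[k * n + j];
        }
}

/* out = X + sign * Y for h x h quadrants stored with row stride ld.
 * If Y is NULL, X is simply copied. out is a contiguous h x h array. */
static void combine(size_t h, size_t ld, const double *X, const double *Y,
                    double sign, double *out)
{
    for (size_t i = 0; i < h; i++)
        for (size_t j = 0; j < h; j++)
            out[i * h + j] = X[i * ld + j] + (Y ? sign * Y[i * ld + j] : 0.0);
}

/* C = A * B using one level of Strassen's method; n must be even and > 0.
 * Returns 0 on success, -1 if the workspace cannot be allocated. */
int strassen_one_level(size_t n, const double *A, const double *B, double *C)
{
    assert(n > 0 && n % 2 == 0);
    size_t h = n / 2, hh = h * h;

    /* Workspace: two operand buffers (S, T) and seven products M[0..6]. */
    double *buf = malloc(9 * hh * sizeof *buf);
    if (buf == NULL)
        return -1;
    double *S = buf, *T = buf + hh, *M[7];
    for (int m = 0; m < 7; m++)
        M[m] = buf + (2 + m) * hh;

    /* Quadrant views into A and B (row stride n). */
    const double *A11 = A, *A12 = A + h, *A21 = A + h * n, *A22 = A + h * n + h;
    const double *B11 = B, *B12 = B + h, *B21 = B + h * n, *B22 = B + h * n + h;

    /* The seven Strassen products. */
    combine(h, n, A11, A22,  1, S); combine(h, n, B11, B22,  1, T); base_gemm(h, S, T, M[0]);
    combine(h, n, A21, A22,  1, S); combine(h, n, B11, NULL, 0, T); base_gemm(h, S, T, M[1]);
    combine(h, n, A11, NULL, 0, S); combine(h, n, B12, B22, -1, T); base_gemm(h, S, T, M[2]);
    combine(h, n, A22, NULL, 0, S); combine(h, n, B21, B11, -1, T); base_gemm(h, S, T, M[3]);
    combine(h, n, A11, A12,  1, S); combine(h, n, B22, NULL, 0, T); base_gemm(h, S, T, M[4]);
    combine(h, n, A21, A11, -1, S); combine(h, n, B11, B12,  1, T); base_gemm(h, S, T, M[5]);
    combine(h, n, A12, A22, -1, S); combine(h, n, B21, B22,  1, T); base_gemm(h, S, T, M[6]);

    /* Assemble the four quadrants of C from the products. */
    for (size_t i = 0; i < h; i++)
        for (size_t j = 0; j < h; j++) {
            size_t q = i * h + j;
            C[i * n + j]           = M[0][q] + M[3][q] - M[4][q] + M[6][q];
            C[i * n + j + h]       = M[2][q] + M[4][q];
            C[(i + h) * n + j]     = M[1][q] + M[3][q];
            C[(i + h) * n + j + h] = M[0][q] - M[1][q] + M[2][q] + M[5][q];
        }

    free(buf);
    return 0;
}
```

This version trades one half-size multiply (seven instead of eight) for extra additions and workspace, so it is only expected to pay off for sufficiently large n. Finding that crossover point experimentally is part of the comparison with the ESSL routine.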

Reading: J. Dongarra, P. Mayes, G. Radicati, The IBM RISC System/6000 and Linear Algebra Operations, UT, CS-90-122, December 1990.

http://www.netlib.org/lapack/lawns/lawn28.ps