Bjarne S. Andersen
The Danish Meteorological Institute
Copenhagen, Denmark
John A. Gunnels and Fred G. Gustavson
IBM T.J. Watson Research Center
Yorktown Height, USA
John K. Reid
Atlas Centre,
Rutherford Appleton Laboratory, UK
and
Jerzy Wasniewski
Technical University of Denmark
Lyngby, Denmark
Abstract:
We consider the efficient implementation of the Cholesky solution of symmetric positive-definite dense linear systems of equations using packed storage. We take the same starting point as that of LINPACK and LAPACK, with the upper (or lower) triangular part of the matrix being stored by columns. Following LINPACK and LAPACK, we overwrite the given matrix by its Cholesky factor. We consider the use of a hybrid format [1] in which blocks of the matrices are held contiguously and compare this to the present LAPACK code. Code based on this format has the storage advantages of the present code, but substantially outperforms it. Furthermore, it compares favourably to using conventional full format (LAPACK) and using the recursive format of [2].
We will also discuss the kernel Cholesky code that we have developed for factorizing each diagonal block. This uses mini-blocks of size 2x2 in order to make good use of registers. It is written in standard Fortran and yet is remarkably fast on all six of our test platforms.
References:
1. B.S. Andersen, J.A. Gunnels, F.G. Gustavson, J.K. Reid, and J. Wasniewski. "A
Fully Portable High Performance Minimal Storage Hybrid Format Cholesky
Algorithm". Submitted to TOMS.
2. B.S. Andersen, F.G. Gustavson, and J. Wasniewski. "A recursive formulation of Cholesky factorization of a matrix in packed storage". TOMS Vol. 27, No. 2, June. 2001, pp. 214-244.