PARA'04 State-of-the-Art
in Scientific Computing
June 20-23, 2004

Updated: 12 June 2004

Hybrid Parallelization of CFD Applications using Dynamic Thread Balancing

Alexander Spiegel, Dieter an Mey and Christian Bischof
High Performance Computing, Aachen University of Technology,
Germany
emails: {anmey,bischof}@rz.rwth-aachen.de

SMP clusters with fat nodes offer an interesting capability for large hybrid applications (MPI + OpenMP/autoparallel): if the MPI parallelization is not well load-balanced, the number of threads can be increased to speed up busy MPI processes and decreased to slow down idle ones, provided these processes reside on the same SMP node.

The optimal distribution of threads to MPI processes is not always easy to find, and it may change during the run of an application. We therefore developed a library (DTB) that performs this adjustment automatically. It uses the PMPI interface and thus wraps the application's calls to the MPI library. The steering mechanism has to be triggered by calls to the routine MPI_Pcontrol at suitable points in the application program. Between successive calls to MPI_Pcontrol, the time each MPI process spends outside of MPI library calls is measured and evaluated in order to adjust the number of threads. If the DTB library is not linked, the calls to MPI_Pcontrol act as dummy calls.
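
The adjustment step can be illustrated with a minimal sketch in C. It is not the DTB implementation: the function name balance_threads, its parameters, and the greedy rule are assumptions made for illustration, and perfect scaling with the number of threads is assumed. Given the compute time measured for each MPI process on one SMP node since the previous steering point, it redistributes a fixed number of threads so that busier processes receive more of them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of DTB: distribute total_threads among the
 * n MPI processes of one SMP node, based on the compute time measured for
 * each process since the previous steering point.
 * Assumption: perfect scaling, i.e. a process that needed time t would
 * need about t / k when running with k threads. */
void balance_threads(const double *compute_time, int *threads,
                     size_t n, int total_threads)
{
    assert(n > 0 && total_threads >= (int)n);

    /* Every process keeps at least one thread. */
    for (size_t i = 0; i < n; ++i)
        threads[i] = 1;

    /* Hand out the remaining threads one by one to whichever process
     * currently has the largest estimated time per thread. */
    for (int left = total_threads - (int)n; left > 0; --left) {
        size_t slowest = 0;
        for (size_t i = 1; i < n; ++i) {
            if (compute_time[i] / threads[i] >
                compute_time[slowest] / threads[slowest])
                slowest = i;
        }
        threads[slowest] += 1;
    }
}
```

For two processes with measured times of 50 and 20 seconds and 4 threads on the node, this sketch assigns 3 threads to the first process and 1 to the second. In a hybrid code the resulting counts could then be applied with the OpenMP routine omp_set_num_threads; how DTB itself evaluates the timings and applies the result is not described in this abstract.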
