PARA'04 State-of-the-Art
in Scientific Computing
June 20-23, 2004

Updated: 7 February 2004

Applying Software Testing Metrics to Lapack

David Barnes and Tim Hopkins
Computing Laboratory, University of Kent,
Canterbury, Kent, CT2 7NF, UK
email: T.R.Hopkins@kent.ac.uk

In software engineering terms, testing is an integral activity in the design, implementation and maintenance phases of the software life cycle. Software that executes successfully on an extensive, well-constructed suite of test cases provides increased confidence in the software and allows changes to and maintenance of the code to be closely monitored for unexpected side effects. An important requirement is that the test suite evolves with the software; data sets that cause changes in the source code should automatically be added to the suite, and new tests should be generated to cover newly added features.

To gauge the quality of a test suite we require quantitative measures of how well it performs. Such metrics are useful to developers in a number of ways: first, they determine when we have achieved a well-defined, minimal level of testing; second, they can reduce the amount (and, thus, the cost) of testing by allowing tests that fail to improve the metric to be discarded; and, third, they can provide a starting point for a search for new test cases.
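As an illustration of the second point, the following is a minimal sketch of how per-test coverage data could be used to discard tests that fail to improve the metric. It is not the procedure used in this work or by any particular testing tool: the function prune_suite, the test names and the coverage data are all hypothetical, and each test is assumed to be described by the set of statement (or branch) identifiers it executes.

```python
def prune_suite(coverage_by_test):
    """Greedy single pass: keep a test only if it covers something new.

    coverage_by_test maps a (hypothetical) test name to the set of
    statement or branch identifiers that the test executes.
    Returns the kept test names and the union of everything they cover.
    """
    covered = set()
    kept = []
    for name, items in coverage_by_test.items():
        new_items = set(items) - covered
        if new_items:  # the test improves the metric, so retain it
            kept.append(name)
            covered |= new_items
    return kept, covered


# Hypothetical coverage data: four tests over statements numbered 1 to 4.
EXAMPLE = {
    "t1": {1, 2, 3},
    "t2": {2, 3},   # adds nothing beyond t1
    "t3": {3, 4},
    "t4": {1, 4},   # adds nothing beyond t1 and t3
}

kept, covered = prune_suite(EXAMPLE)
print(kept)             # names of the retained tests
print(sorted(covered))  # coverage achieved by the retained tests
```

The result depends on the order in which the tests are visited and is not guaranteed to be the smallest possible subset; it only guarantees that the retained tests achieve the same coverage as the full suite.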

The simplest glass-box software testing metric is the statement coverage obtained by executing the complete test suite. Typically we would look to generate tests that cause every statement to be executed at least once; certainly, if segments of code are never executed, it is difficult to have confidence that the package is bug free. A more stringent metric is branch coverage, which requires that every branch in the code be executed in both its true and false states. Typically, the stricter the metric, the more difficult it is to generate the necessary test sets, but the higher the confidence in the software once it is achieved.
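The difference between the two metrics can be seen in a small hand-instrumented sketch. The function clamp_negative, its probe labels and the helper measure are hypothetical and serve only as illustration; a real coverage tool would insert equivalent probes automatically.

```python
# Every statement and branch outcome that the probes below can record.
ALL_STATEMENTS = {"test", "assign", "return"}
ALL_BRANCHES = {("x < 0", True), ("x < 0", False)}


def clamp_negative(x, statements_hit, branches_hit):
    """Replace a negative value by zero, recording what was executed."""
    statements_hit.add("test")
    is_negative = x < 0
    branches_hit.add(("x < 0", is_negative))  # record the branch outcome
    if is_negative:
        statements_hit.add("assign")
        x = 0
    statements_hit.add("return")
    return x


def measure(inputs):
    """Run the inputs and return (statement coverage, branch coverage)."""
    statements_hit, branches_hit = set(), set()
    for x in inputs:
        clamp_negative(x, statements_hit, branches_hit)
    return (len(statements_hit) / len(ALL_STATEMENTS),
            len(branches_hit) / len(ALL_BRANCHES))


print(measure([-1.0]))       # one test input: a negative value
print(measure([-1.0, 2.0]))  # adding a non-negative value
```

A single negative input executes every statement, yet the false outcome of the test is never taken, so statement coverage is complete while branch coverage is only one half. Adding a non-negative input completes the branch coverage.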

A core component of the Lapack package is its testing material. This has been an important tool in assisting with the implementation of the library on a wide variety of platform/compiler combinations. However, few, if any, measurements appear to have been taken of how well the testing material actually performs in the software engineering sense.

We have used a commercial testing package to obtain statement and branch coverage metrics for a subset of the routines in the Lapack library. Armed with these, we have looked at how the set of tests may be both pared down, to reduce the testing effort required to achieve the metric values, and expanded, to provide better coverage.

Initial results indicate that many of the tests in the current suite fail to improve the coverage metrics and may be removed. In addition, there are many sections of code that are never executed by the tests; some of these appear to be simple oversights by the test suite developers and are easily remedied; others almost certainly require expertise in the underlying algorithm and problem domain to generate suitable examples (or, possibly, to show that certain statements can never be exercised).
