Next: Future Development Up: Distributed Results Previous: Computing environment

Scalability, Speedup, and Efficiency

 

Three trials of the same set of scenarios were performed using one host for LUCAS and 4, 8, 12, 16, and 20 hosts for pLUCAS. The elapsed wall-clock times for program execution, given in Figure 13 and Appendix D, were compared, and the relative speedups are shown in Figure 14. These times do not include the small one-time setup overhead of 1-2 minutes for pLUCAS listed in Table D.1.

  
Figure 13: Average wall-clock execution time for pLUCAS on multiple hosts

  
Figure 14: Average speedup factor for pLUCAS vs. serial LUCAS

Speedup factor is defined as $S = T_1 / T_n$ [14], where $T_n$ is the elapsed wall-clock time on n nodes (or hosts in PVM) and $T_1$ is the elapsed time of the serial run. Figure 14 shows an asymptotic behavior for speedup: for a small number of hosts the speedup is dramatic, but each additional host contributes progressively less. The speedup appears to approach a factor of approximately 11 over the serial version. A linear speedup would be ideal, but it is not realized because of the increasing overhead and the communication bottleneck inherent in a master-servant model of parallelism.

From the speedup factors, the efficiencies for the various numbers of hosts are easily determined. Efficiency is calculated as $E = S / n$, where S is the speedup factor and n is the number of hosts. Figure 15 shows that the average efficiency steadily decreases as more hosts are added to the virtual machine.
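The two definitions can be sketched in a few lines of Python, taking speedup as the serial time divided by the parallel time and efficiency as speedup divided by the number of hosts. This is only an illustration: the function names and the timing values are hypothetical placeholders, not the measurements reported in Figure 13 or Appendix D.

```python
from statistics import mean


def speedup(t_serial, t_parallel):
    """Speedup factor: serial wall-clock time divided by parallel time."""
    return t_serial / t_parallel


def efficiency(s, n_hosts):
    """Efficiency: speedup factor divided by the number of hosts."""
    return s / n_hosts


# HYPOTHETICAL wall-clock times in minutes, three trials per configuration.
# These are made-up values for illustration, not the pLUCAS measurements.
serial_trials = [600.0, 606.0, 594.0]
parallel_trials = {
    4: [160.0, 158.0, 162.0],
    8: [90.0, 92.0, 88.0],
    12: [72.0, 71.0, 73.0],
    16: [64.0, 65.0, 63.0],
    20: [60.0, 61.0, 59.0],
}


def summarize(serial_trials, parallel_trials):
    """Average the trials, then compute speedup and efficiency per host count."""
    t1 = mean(serial_trials)
    rows = []
    for n in sorted(parallel_trials):
        tn = mean(parallel_trials[n])
        s = speedup(t1, tn)
        rows.append((n, tn, s, efficiency(s, n)))
    return rows


for n, tn, s, e in summarize(serial_trials, parallel_trials):
    print(f"{n:2d} hosts: T = {tn:6.1f} min, S = {s:5.2f}, E = {e:5.2f}")
```

Averaging the trials before dividing matches the "average" quantities plotted in the figures; averaging per-trial speedups instead would be an equally defensible choice and gives slightly different numbers.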

  
Figure 15: Average efficiency of pLUCAS vs. serial LUCAS

As the speedup factor approaches its asymptote, the efficiency plummets. This means that small gains in speed come at the cost of inefficient machine use.
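As a rough worked bound, assume the speedup really does level off near the factor of approximately 11 read from Figure 14 (an approximate value, not an exact measurement). Since efficiency is the speedup divided by the number of hosts n, the efficiency can then be no better than about 11/n. The value n = 40 below is a hypothetical configuration outside the range that was tested.

```latex
\[
E(n) = \frac{S(n)}{n} \le \frac{S_{\max}}{n}, \qquad S_{\max} \approx 11
\quad\Longrightarrow\quad
E(20) \le \frac{11}{20} = 0.55, \qquad
E(40) \le \frac{11}{40} \approx 0.28 .
\]
```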

Clearly pLUCAS is scalable for a moderate number of hosts (). Running pLUCAS on a much larger number of machines, however, would not necessarily be beneficial as speed increases would likely be small. Porting the program in its current form to a supercomputer without concurrent I/O would also be rather impractical because pLUCAS is an I/O-intensive application. It would be reasonable, however, if each node had access to either a local disk or a shared, striped disk array with parallel I/O. This is why PVM made such an excellent choice as a parallel platform for pLUCAS.






Michael W. Berry (berry@cs.utk.edu)
Wed Aug 16 10:48:40 EDT 1995