In parallel programming, as in other engineering disciplines, the goal of the design process is not to optimize a single metric such as speed. Rather, a good design must optimize a problem-specific function of execution time, memory requirements, implementation costs, maintenance costs, and so on. Such design optimization involves tradeoffs between simplicity, performance, portability, and other factors.

Making informed design decisions about alternatives requires an
understanding of their costs. In this chapter, we show how this
understanding can be developed and formalized in mathematical *
performance models*. These models can be used to compare the
efficiency of different algorithms, to evaluate scalability, and to
identify bottlenecks and other inefficiencies, all * before
* we
invest substantial effort in an implementation. Performance models
can also be used to guide implementation efforts by showing where
optimization is needed.

After studying this chapter, you should know how to develop performance models for parallel algorithms and be able to use these models to evaluate scalability and to choose between alternative algorithms. You also should know how to obtain reliable empirical data and how to use this data to validate models and implementations. Further, you should understand how network topology can affect communication performance, and you should know how to account for these effects in your models. Finally, you should be able to recognize and account for factors other than performance, factors such as implementation costs, that influence design choices.

- 3.1 Defining Performance
- 3.2 Approaches to Performance Modeling
- 3.3 Developing Models
- 3.4 Scalability Analysis
- 3.5 Experimental Studies
- 3.6 Evaluating Implementations
- 3.7 A Refined Communication Cost Model
- 3.8 Input/Output
- 3.9 Case Study: Shortest-Path Algorithms
- 3.10 Summary
- Exercises
- Chapter Notes