The term data parallelism refers to the concurrency that is obtained when the same operation is applied to some or all elements of a data ensemble. A data-parallel program is a sequence of such operations. A parallel algorithm is obtained from a data-parallel program by applying domain decomposition techniques to the data structures operated on. Operations are then partitioned, often according to the ``owner computes'' rule, in which the processor that ``owns'' a value is responsible for updating that value. Typically, the programmer is responsible for specifying the domain decomposition, but the compiler partitions the computation automatically.
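As a minimal illustration of this idea, consider the following Fortran 90 fragment (the array names are hypothetical); a single array assignment applies the same operation to every element, and under the owner-computes rule each processor would update just the elements of A that it owns:

```fortran
real, dimension(100) :: A, B, C
! One data-parallel operation: 100 independent additions,
! which a compiler may execute concurrently.
A = B + C
```
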
In this chapter, we introduce the key concepts of data-parallel programming and show how designs developed using the techniques discussed in Part I can be adapted for data-parallel execution. We base our presentation on the languages Fortran 90 (F90) and High Performance Fortran (HPF). Many of the ideas also apply to other data-parallel languages, such as C* and pC++. F90 provides constructs for specifying concurrent execution but not domain decomposition. HPF augments F90 with additional parallel constructs and data placement directives, which allow many HPF programs to be compiled with reasonable efficiency for a range of parallel computers.
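To give a flavor of these data placement directives, the following sketch (array and processor names are hypothetical) uses HPF's PROCESSORS and DISTRIBUTE directives to specify a domain decomposition; the compiler then partitions the computation over the resulting data layout:

```fortran
real, dimension(1000) :: X
!HPF$ PROCESSORS P(4)
! Partition X into four contiguous blocks of 250 elements,
! one block per processor.
!HPF$ DISTRIBUTE X(BLOCK) ONTO P
! Each processor performs the scaling on its own block.
X = X * 2.0
```
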
After studying this chapter, you should be able to write simple data-parallel programs using HPF. You should also understand how the design principles developed in Part I relate to data-parallel programs, and you should be able to evaluate the impact of HPF's data placement directives on performance. Finally, you should be able to determine when algorithms are suitable for data-parallel implementation.
© Copyright 1995 by Ian Foster