Next: Chapter Notes
Up: 7 High Performance Fortran
Previous: 7.9 Summary

Write an HPF program to multiply two matrices A and B of size
N × N. (Do not use the MATMUL intrinsic!) Estimate the communication
costs associated with this program if A and B are distributed
blockwise in a single dimension or blockwise in two dimensions.
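
As a starting point for the cost estimate, the following Python sketch compares the two distributions under a deliberately simple model. It is not a solution to the exercise. It assumes that sending a message of L words costs t_s + t_w * L, that P is a perfect square in the two-dimensional case, and that each processor receives whole blocks of N*N/P words from its peers. The function names and the parameters t_s and t_w are illustrative choices, not anything defined by HPF.

```python
import math


def comm_cost_1d(n, p, t_s, t_w):
    """Estimated per-processor communication time, 1-D block distribution.

    Assumption: each processor owns N*N/P words of each matrix and must
    receive the blocks of one operand from the other P-1 processors,
    one message per block.
    """
    block_words = n * n / p
    return (p - 1) * (t_s + t_w * block_words)


def comm_cost_2d(n, p, t_s, t_w):
    """Estimated per-processor communication time, 2-D block distribution.

    Assumption: processors form a sqrt(P) x sqrt(P) grid; each one receives
    blocks of A from the other sqrt(P)-1 processors in its row and blocks
    of B from the other sqrt(P)-1 processors in its column.
    """
    q = math.isqrt(p)
    if q * q != p:
        raise ValueError("this sketch assumes P is a perfect square")
    block_words = n * n / p
    return 2 * (q - 1) * (t_s + t_w * block_words)
```

Under these assumptions the one-dimensional cost grows with P - 1 messages per processor and the two-dimensional cost with 2(sqrt(P) - 1), which gives one hypothesis to test against measurements. A real HPF compiler may use different communication patterns, so treat the model as a first guess.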

Compare the performance of your matrix multiplication program with
that of the MATMUL intrinsic. Explain any differences.

Complete Program 7.2 and study its performance as a function of N
and P on one or more networked or parallel computers. Modify the
program to use a two-dimensional data decomposition, and repeat these
performance experiments. Use performance models to interpret your
results.

Compare the performance of the programs developed in Exercise 3 with
equivalent CC++, FM, or MPI programs. Account for any differences.

Complete Program 7.3 and study its performance on one or more
parallel computers as a function of problem size N and number of
processors P. Compare with the performance obtained by a CC++, FM, or
MPI implementation of this algorithm, as described in Section 1.4.2.
Explain any performance differences.

Develop an HPF implementation of the symmetric pairwise interactions
algorithm of Section 1.4.2. Compare its performance with an
equivalent CC++, Fortran M, or MPI program. Explain any differences.

Learn about the data-parallel languages Dataparallel C and pC++, and
use one of these languages to implement the finite-difference and
pairwise interactions programs presented in this chapter.

Develop a performance model for the HPF Gaussian elimination program
of Section 7.8, assuming a one-dimensional cyclic
decomposition of the array A. Compare your model with observed
execution times on a parallel computer. Account for any differences
that you see.

Develop a performance model for the HPF Gaussian elimination program
of Section 7.8, assuming a two-dimensional cyclic
decomposition of the array A. Is it more efficient to maintain
one or multiple copies of the one-dimensional arrays Row and
X? Explain.

Study the performance of the HPF global operations for different data
sizes and numbers of processors. What can you infer from your results
about the algorithms used to implement these operations?

Develop an HPF implementation of the convolution algorithm described
in Section 4.4.
© Copyright 1995 by Ian Foster