MPI includes variants of each of the reduce operations where the result is scattered to all processes in the group on return.
MPI_REDUCE_SCATTER(sendbuf, recvbuf, recvcounts, datatype, op, comm)

IN   sendbuf      starting address of send buffer (choice)
OUT  recvbuf      starting address of receive buffer (choice)
IN   recvcounts   integer array specifying the number of elements of the result distributed to each process; must be identical on all calling processes
IN   datatype     data type of elements of input buffer (handle)
IN   op           operation (handle)
IN   comm         communicator (handle)
int MPI_Reduce_scatter(void* sendbuf, void* recvbuf, int *recvcounts, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
MPI_REDUCE_SCATTER(SENDBUF, RECVBUF, RECVCOUNTS, DATATYPE, OP, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER RECVCOUNTS(*), DATATYPE, OP, COMM, IERROR
MPI_REDUCE_SCATTER first performs an element-wise reduction on the vector of count elements in the send buffer defined by sendbuf, count and datatype, where count is the sum of the entries of recvcounts. Next, the resulting vector is split into n disjoint segments, where n is the number of members in the group. Segment i contains recvcounts[i] elements. The ith segment is sent to process i and stored in the receive buffer defined by recvbuf, recvcounts[i] and datatype.
Advice to implementors. The MPI_REDUCE_SCATTER routine is functionally equivalent to an MPI_REDUCE operation with count equal to the sum of recvcounts[i], followed by MPI_SCATTERV with sendcounts equal to recvcounts. However, a direct implementation may run faster. (End of advice to implementors.)