MPI Derived Datatypes Processing on Noncontiguous GPU-Resident Data
|Publication Type||Conference Paper|
|Year of Publication||2013|
|Authors||Jenkins, J, Dinan, J, Balaji, P, Peterka, T, Samatova, NF, Thakur, R|
Driven by the goals of efficient and generic communication of noncontiguous data layouts in GPU memory, for which solutions do not currently exist, we present a parallel, noncontiguous data-processing methodology through the MPI datatypes specification. Our processing algorithm utilizes a kernel on the GPU to pack arbitrary noncontiguous GPU data by enriching the datatypes encoding to expose a fine-grained, data-point level of parallelism. Additionally, the typically tree-based datatype encoding is preprocessed to enable efficient, cached access across GPU threads.
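The data-point level of parallelism described above can be illustrated with a small sketch. This is not the paper's encoding or kernel: it assumes the simplest case, a strided layout like the one MPI_Type_vector describes, and the names vec_layout, src_offset, and pack_vector are hypothetical. The point is that each packed element's source location can be computed from its index alone, so on a GPU one thread could handle each element independently. A serial loop stands in for the kernel here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flattened description of a strided layout, similar to
 * what MPI_Type_vector expresses: `count` blocks of `blocklen` elements,
 * with consecutive block starts `stride` elements apart.
 * This is an illustrative assumption, not the paper's encoding. */
typedef struct {
    size_t count;
    size_t blocklen;
    size_t stride;
} vec_layout;

/* Map the i-th element of the packed buffer to its element offset in
 * the noncontiguous source. The result depends only on i and the
 * layout, so every index can be resolved independently. */
static size_t src_offset(const vec_layout *t, size_t i)
{
    assert(t->blocklen > 0);
    size_t block  = i / t->blocklen;  /* which block the element is in */
    size_t within = i % t->blocklen;  /* position inside that block */
    return block * t->stride + within;
}

/* Serial stand-in for a pack kernel: the loop body is the work a
 * single GPU thread would do for packed index i. */
static void pack_vector(const vec_layout *t, const double *src, double *dst)
{
    size_t n = t->count * t->blocklen;
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[src_offset(t, i)];
}
```

With a layout of 3 blocks of 2 elements at stride 4, the packed buffer would take source elements 0, 1, 4, 5, 8, and 9. Nested or irregular datatypes need a richer encoding than this single struct, which is the case the preprocessing step in the abstract addresses.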