Using Massively Parallel Simulation for MPI Collective Communication Modeling in Extreme-Scale Networks

Title: Using Massively Parallel Simulation for MPI Collective Communication Modeling in Extreme-Scale Networks
Publication Type: Conference Paper
Year of Publication: 2014
Authors: Mubarak, M, Carothers, CD, Ross, RB, Carns, PH
Conference Name: WSC'14 Proceedings of the 2014 Winter Simulation Conference
Date Published: 12/2014
Publisher: IEEE Press
Conference Location: Savannah, Georgia
Other Numbers: ANL/MCS-P5158-0714
Abstract: MPI collective operations are a critical and frequently used part of most MPI-based large-scale scientific applications. In previous work, we enabled the Rensselaer Optimistic Simulation System (ROSS) to predict the performance of MPI point-to-point messaging on high-fidelity million-node network simulations of torus and dragonfly interconnects. The main contribution of this work is an extension of these torus and dragonfly network models to support MPI collective communication operations using the optimistic event scheduling capability of ROSS. We demonstrate that both small- and large-scale ROSS collective communication models can execute efficiently on massively parallel architectures. We validate the results of our collective communication model against measurements from IBM Blue Gene/Q and Cray XC30 platforms using a data-driven approach on our network simulations. We also perform experiments to explore the impact of tree degree on the performance of collective communication operations in large-scale network models.
PDF: http://www.mcs.anl.gov/papers/P5158-0714.pdf