Argonne National Laboratory

Evaluation of Topology-Aware Broadcast Algorithms for Dragonfly Networks

Title: Evaluation of Topology-Aware Broadcast Algorithms for Dragonfly Networks
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Dorier, M; Mubarak, M; Ross, R; Li, JK; Carothers, CD; Ma, K-L
Conference Name: 2016 IEEE International Conference on Cluster Computing
Date Published: 12/2016
Conference Location: Taipei, Taiwan
Abstract: Two-tiered direct network topologies such as Dragonflies have been proposed for future post-petascale and exascale machines, since they provide a high-radix, low-diameter, fast interconnection network. Such topologies call for redesigning MPI collective communication algorithms in order to attain the best performance. Yet as increasingly more applications share a machine, it is not clear how these topology-aware algorithms will react to interference with concurrent jobs accessing the same network. In this paper, we study three topology-aware broadcast algorithms, including one designed by ourselves. We evaluate their performance through event-driven simulation for small- and large-sized broadcasts (both in terms of data size and number of processes). We study the effect of different routing mechanisms on the topology-aware collective algorithms, as well as their sensitivity to network contention with other jobs. Our results show that while topology-aware algorithms drastically reduce link utilization, their advantage in terms of latency is more limited.