N. Liu, C. Carothers, J. Cope, P. Carns, R. Ross, "Model and Simulation of Exascale Communication Networks," Journal of Simulation, 2011. Also Preprint ANL/MCS-P1937-0911, September 2011. [pdf]
Exascale supercomputers will have millions or even hundreds of millions of processing cores and the potential for nearly billion-way parallelism. Exascale compute and data storage architectures will be critically dependent on the interconnection network. The most popular interconnection network for current and future supercomputer systems is the torus (e.g., k-ary n-cube). This paper focuses on the modeling and simulation of ultra-large-scale torus networks using Rensselaer's Optimistic Simulation System (ROSS). We compare the communication delays predicted by our model against those measured on the actual Blue Gene/L torus network using 2,048 processors. Our performance experiments demonstrate the ability to simulate million-node to billion-node torus networks. The torus network model for a 16-million-node configuration shows a high degree of strong scaling when going from 1,024 cores to 32,768 cores on Blue Gene/L, with a peak event rate of nearly 5 billion events per second. We also demonstrate the performance of our torus network model configured with 1 billion nodes on both Blue Gene/L and Blue Gene/P systems. The best observed event rate is 12.36 billion events per second, at 128K cores on Blue Gene/P.
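For readers unfamiliar with the k-ary n-cube topology named in the abstract, the following is a minimal illustrative sketch of how node addresses, neighbors, and minimal hop counts work in a torus. It is not the ROSS model from the paper; the function names and the row-major node numbering are assumptions made for illustration, and the sketch assumes k > 2 so that the two neighbors in each dimension are distinct.

```python
from typing import List, Tuple


def node_to_coords(node_id: int, k: int, n: int) -> Tuple[int, ...]:
    """Convert a flat node id into n coordinates, each in the range [0, k)."""
    coords = []
    for _ in range(n):
        coords.append(node_id % k)
        node_id //= k
    return tuple(coords)


def coords_to_node(coords, k: int) -> int:
    """Inverse of node_to_coords (hypothetical numbering, lowest dimension first)."""
    node_id = 0
    for c in reversed(coords):
        node_id = node_id * k + c
    return node_id


def torus_neighbors(node_id: int, k: int, n: int) -> List[int]:
    """Return the 2n neighbors of a node: one step in each direction per dimension."""
    coords = node_to_coords(node_id, k, n)
    neighbors = []
    for dim in range(n):
        for step in (-1, 1):
            shifted = list(coords)
            # The modulo implements the wraparound link that makes a mesh a torus.
            shifted[dim] = (shifted[dim] + step) % k
            neighbors.append(coords_to_node(shifted, k))
    return neighbors


def hop_distance(src: int, dst: int, k: int, n: int) -> int:
    """Minimal hop count: per dimension, take the shorter way around the ring."""
    a = node_to_coords(src, k, n)
    b = node_to_coords(dst, k, n)
    return sum(min((x - y) % k, (y - x) % k) for x, y in zip(a, b))
```

In a 4-ary 3-cube (64 nodes), for instance, every node has six neighbors, and the wraparound links keep the worst-case distance at two hops per dimension.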