Title  Reducing Communication in Parallel BreadthFirst Search on Distributed Memory Systems 
Publication Type  Report 
Year of Publication  2014 
Authors  Lu, H, Tan, G, Chen, M, Sung, N 
Other Numbers  ANL/MCSP52261114 
Abstract  Breadthfirst search (BFS) is a key operation in dataintensive graph analysis applications. However, for distributed BFS algorithm on large distributed memory systems, data communication often limits the scalability of the algorithm as it costs significantly more than arithmetic computation. In this work, we try to reduce the communication cost in distributed BFS by sieving and compressing the messages. First, we propose a novel distributed directory to sieve the redundant data in collective communications. Then we leverage a bitmap compression algorithm to further reduce the size of messages in communication. Experiments on a 6,144core Intel Westmere based cluster show our algorithm achieve a BFS performance rate of 12.1 billion edge visits per second on an undirected graph of 8 billion vertices and 128 billion edges with scalefree distribution. Compared to the “replicatedcsr” version BFS in Graph500, our algorithm reduces communication cost by 79.0% and gets a speedup of 2.2×.

PDF  http://www.mcs.anl.gov/papers/P52261114.pdf 