Argonne National Laboratory

Pavan Balaji wins Award for Excellence in Scalable Computing

November 3, 2015

Pavan Balaji, a computer scientist in Argonne’s Mathematics and Computer Science Division, has been named a winner of the 2015 IEEE TCSC Award for Excellence in Scalable Computing (Middle Career Researcher). The award, presented by the IEEE Technical Committee on Scalable Computing (TCSC), recognizes up to three individuals for their outstanding, influential, and ongoing contributions in the field of scalable computing within 5 to 10 years of receiving their Ph.D. degree.

Balaji is internationally recognized for his work on fine-grained communication, threading and tasking runtime systems, and memory architectures. He leads the MPICH project, which is widely regarded as the gold standard for open-source MPI implementations. In the past five years, as lead of the Programming Models and Runtime Systems group at Argonne, Balaji headed the design and development of MPICH-3, the first implementation of the MPI-3 standard. MPICH-3 was designed to enable fine-grained, dynamic, and irregular communication – capabilities essential for current and next-generation multi-petascale and exascale systems. Today, nine of the top 10 supercomputers directly use MPICH-3 or one of its derivatives, making it the most prominent MPI implementation in the world. Moreover, Balaji’s leadership in the MPI Forum, particularly as head of the hybrid programming working group, continues to help evolve the MPI standard to meet the challenges of extreme-scale systems. In addition to MPICH-3, Balaji leads efforts on lightweight threading and tasking models for modern operating systems and on heterogeneous memory architectures for the largest supercomputers in the world.

Balaji also organized a new initiative on runtime compatibility for MPI implementations. The project, which involves the producers of several notable MPICH-derived message-passing implementations including IBM, Intel, and Cray, seeks to ensure compatibility between evolving MPI libraries, while allowing the flexibility that individual vendors want. Moreover, Balaji has been collaborating on the MPI-ACC framework, which combines MPI communication capabilities with GPUs. MPI-ACC has redefined how accelerator memory is used in the context of data communication, and both IBM and Cray have adopted this technology for their supercomputers.

In addition, Balaji has applied his innovative solutions to real applications in a way that has had significant impact. His early work on programming models focused on using sockets for high-speed networks, including 10/40 Gigabit Ethernet and InfiniBand. In the mid-2000s, Balaji and a colleague collaborated on groundbreaking research on the ParaMEDIC framework to alleviate the I/O bottlenecks in genome sequencing; ParaMEDIC won the “Storage Challenge Award” at SC 2006 and has been used successfully for enormous sequence search problems. Most recently, Balaji and his colleagues devised a method that accurately selects pharmaceutical drug candidates from a large distributed system. The method, which scales the analysis of selected datasets 400X higher than ever before, won the SCALE 2015 challenge. His other awards include the Crain’s Chicago Business annual 40 Under 40 award (2012) and the TEDxMidwest Emerging Leader Award (2013).

Balaji’s impact is also felt through his numerous leadership roles. For example, he was chair of the IEEE Technical Committee on Scalable Computing in 2012–2013, and he is general chair of IEEE ScalCom and general co-chair of IEEE Cluster 2015 and IEEE/ACM CCGrid. He also serves on the steering committees of several international conferences and on the editorial board of the IEEE Transactions on Cloud Computing.

Balaji’s selection as a recipient of the prestigious DOE Early Career Award, in 2012, is the most recent testament to his innovation and impact on the scalable computing field. Balaji is investigating data movement strategies for exascale systems with deep memory hierarchies. Of particular note is his introduction of new functionality exploring two types of memory objects: statically and dynamically allocated. This research has already appeared in more than two dozen refereed conference papers.

Balaji said he is deeply honored to receive this award. “So many aspects of high-performance computing – from parallel programming models and runtime systems to next-generation system architecture – present new challenges daily,” Balaji said. “I look forward to continuing collaborations with vendors and researchers in tackling these challenges and helping solve scientific problems at the exascale.”

For more information about the award, see the IEEE TCSC website.