August 24, 2012
"Helping applications utilize the high concurrency of multi-petaflop systems"
Efficiently utilizing the rapidly increasing concurrency of multi-petaflop computing systems is a significant challenge. To address this challenge, researchers in the Mathematics and Computer Science Division at Argonne National Laboratory and the Computer Science Department at the University of Chicago have developed Turbine, a highly scalable and distributed software engine for managing the execution of large numbers of computational tasks.
Many-task computing refers to high-performance applications composed of tasks implemented in multiple computational patterns, featuring tightly or loosely coupled communication, static or dynamic workflows, and varying resource requirements in terms of computation, memory, and access to storage. Historically, many-task applications have been programmed in one of two ways. In the first, the logic is integrated into a single program, and the tasks communicate through MPI messaging or function calls; this approach, while using familiar technology, involves considerable programming effort. In the second, a script is written that invokes the tasks, in sequence or in parallel, with each task reading and writing files on a shared (disk-resident) file system; this approach, while convenient for the user, often results in poor performance because it cannot sustain the high task rate needed to efficiently utilize modern, million-core systems.
The Turbine model combines the features of both approaches. It uses the Asynchronous Dynamic Load Balancing Library (ADLB), an MPI-based library, as its scalable load-balancing component. It is based on the natural semantics of the parallel scripting language Swift but eliminates the bottleneck of Swift’s data dependency model by making script variables accessible from any node, without going through the file system. Turbine’s success lies in its use of massive distributed memory and in decentralizing the distribution of tasks, which enables function and expression evaluation to take place on any node of an extreme-scale computer.
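The data-driven idea described above can be illustrated with a small, single-process sketch. This is not Turbine or ADLB code: the names `DataStore` and `run_dataflow` are hypothetical, threads stand in for compute nodes, and one shared queue stands in for the load balancer (which in the real system is distributed, not centralized). The sketch only shows the general pattern: a task is released for execution once all of its input variables have been set in memory, and any idle worker may run it.

```python
import queue
import threading


class DataStore:
    """Write-once variables held in memory (stand-in for a distributed store)."""

    def __init__(self):
        self._values = {}
        self._waiters = {}  # variable name -> callbacks to run once it is set
        self._lock = threading.Lock()

    def subscribe(self, name, callback):
        """Run callback once `name` has a value (immediately if already set)."""
        with self._lock:
            if name not in self._values:
                self._waiters.setdefault(name, []).append(callback)
                return
        callback()

    def set(self, name, value):
        with self._lock:
            if name in self._values:
                raise ValueError(f"variable {name!r} is already set")
            self._values[name] = value
            callbacks = self._waiters.pop(name, [])
        for callback in callbacks:  # notify subscribers outside the lock
            callback()

    def get(self, name):
        with self._lock:
            return self._values[name]

    def snapshot(self):
        with self._lock:
            return dict(self._values)


def run_dataflow(tasks, initial, num_workers=4):
    """Run tasks given as (output_name, function, input_names) tuples."""
    store = DataStore()
    ready = queue.Queue()  # tasks whose inputs are all available

    def register(task):
        inputs = task[2]
        remaining = [len(inputs)]
        lock = threading.Lock()

        def on_input_set():
            with lock:
                remaining[0] -= 1
                fire = remaining[0] == 0
            if fire:
                ready.put(task)

        if not inputs:
            ready.put(task)
        for name in inputs:
            store.subscribe(name, on_input_set)

    def worker():
        while True:
            task = ready.get()
            if task is None:  # sentinel: no more work
                return
            output, func, inputs = task
            try:
                # Setting the output may enqueue downstream tasks.
                store.set(output, func(*(store.get(n) for n in inputs)))
            finally:
                ready.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for task in tasks:
        register(task)
    for name, value in initial.items():
        store.set(name, value)
    ready.join()  # returns once every released task has finished
    for _ in threads:
        ready.put(None)
    for t in threads:
        t.join()
    return store.snapshot()


# Hypothetical example: d depends on b and c, which both depend on a.
results = run_dataflow(
    [
        ("b", lambda a: a + 1, ["a"]),
        ("c", lambda a: a * 2, ["a"]),
        ("d", lambda b, c: b + c, ["b", "c"]),
    ],
    {"a": 3},
)
print(results["d"])  # 10
```

In this sketch, `b` and `c` may run concurrently on different workers, and `d` is released only after both have been set, with no files involved. Error handling and scaling across machines are deliberately left out.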
Turbine is designed to be a core component of a future-generation exascale application development toolkit. Tests of core Turbine functionality on multiple supercomputers (including Blue Gene, Cray, SiCortex, and Intel MIC systems) show that the system can achieve the performance required for such extreme-scale cases. The system has been used to solve benchmark problems from related systems. Turbine is in use by the University of Chicago SciColSim scientific collaboration simulator, a graph analysis computation based on mining the text of scientific publications. Current development focuses on integration with two Argonne numerical applications: PIPS, for power grid design, and MINOTAUR, a numerical library. Multiple other applications will be investigated.
Contact: Gail Pieper, pieper@mcs.anl.gov
