Argonne National Laboratory

AME: An Anyscale Many-Task Computing Engine

Title: AME: An Anyscale Many-Task Computing Engine
Publication Type: Conference Paper
Year of Publication: 2011
Authors: Zhang, Z, Katz, DS, Ripeanu, M, Wilde, M, Foster, IT
Conference Name: Proceedings of the 6th Workshop on Workflows in Support of Large-Scale Science
Date Published: 11/2011
Conference Location: Seattle, Washington
Other Numbers: ANL/MCS-P1947-0911

Many-Task Computing (MTC) is an emerging programming model whose relevance on supercomputers is increasing, driven by applications in biology, economics, and statistics, and by paradigms such as data-intensive computation and uncertainty quantification. However, its high inter-task parallelism and data-intensive processing pose new challenges to existing supercomputer hardware-software stacks. These challenges include resource provisioning; task dispatching, dependency resolution, and load balancing; data management; and resilience. This paper examines the characteristics of MTC applications that create these challenges and identifies related gaps in MTC middleware for extreme-scale systems. Based on this analysis, we propose AME, an Anyscale MTC Engine, which addresses the scalability aspects of these gaps. We describe the AME framework and present performance results for both synthetic benchmarks and real applications. Our results show that AME's dispatching performance scales linearly up to 14,120 tasks/second on 16,384 cores with high efficiency. The overhead of the intermediate data management scheme does not increase significantly up to 16,384 cores. AME eliminates 73% of the data transfer between compute nodes and the global filesystem for the Montage astronomy application on 2,048 cores. Our results indicate that AME scales well on today's petascale machines and is a strong candidate for exascale machines.