Turbine: A Distributed-Memory Dataflow Engine for Extreme-Scale Many-Task Applications
|Title||Turbine: A Distributed-Memory Dataflow Engine for Extreme-Scale Many-Task Applications|
|Publication Type||Conference Paper|
|Year of Publication||2012|
|Authors||Wozniak, JM, Armstrong, TG, Maheshwari, K, Lusk, EL, Katz, DS, Wilde, M, Foster, IT|
|Conference Name||Proceedings SWEET 2012|
|Conference Location||Scottsdale, AZ|
Efficiently utilizing the rapidly increasing concurrency of multi-petaflop computing systems is a significant programming challenge. One approach is to structure applications with an upper layer of many loosely coupled coarse-grained tasks, each comprising a tightly coupled parallel function or program. "Many-task" programming models such as functional parallel dataflow may be used at the upper layer to generate massive numbers of tasks, each of which generates significant tightly coupled parallelism at the lower level via multithreading, message passing, and/or partitioned global address spaces. At large scales, however, the management of task distribution, data dependencies, and inter-task data movement is a significant performance challenge. In this work, we describe Turbine, a new highly scalable and distributed many-task dataflow engine. Turbine executes a generalized many-task intermediate representation with automated self-distribution, and is scalable to multi-petaflop infrastructures. We present here the architecture of Turbine and its performance on highly concurrent systems.
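The dataflow idea in the abstract, where a task becomes eligible to run once the data it depends on is available, can be illustrated with a minimal single-process sketch. This is not Turbine's API or its intermediate representation: the function `run_dataflow`, the task table layout, and the task names are hypothetical and chosen only for illustration, and the sketch ignores distribution, load balancing, and the tightly coupled parallelism inside each task.

```python
from collections import deque


def run_dataflow(tasks):
    """Run tasks in dependency order.

    tasks maps a task name to (function, list of input task names).
    Hypothetical layout for illustration; every input name is assumed
    to be a key of tasks.
    Returns (results by task name, execution order).
    """
    results = {}
    # Number of inputs each task is still waiting for.
    pending = {name: len(inputs) for name, (_, inputs) in tasks.items()}
    # For each task, the tasks that consume its output.
    consumers = {name: [] for name in tasks}
    for name, (_, inputs) in tasks.items():
        for dep in inputs:
            consumers[dep].append(name)

    # Tasks with no inputs are ready immediately.
    ready = deque(name for name, count in pending.items() if count == 0)
    order = []
    while ready:
        name = ready.popleft()
        func, inputs = tasks[name]
        results[name] = func(*(results[dep] for dep in inputs))
        order.append(name)
        # Producing this output may release downstream tasks.
        for consumer in consumers[name]:
            pending[consumer] -= 1
            if pending[consumer] == 0:
                ready.append(consumer)

    if len(results) != len(tasks):
        raise ValueError("dependency cycle: some tasks never became ready")
    return results, order


# Four coarse-grained "tasks": two independent producers and two consumers.
example_tasks = {
    "a": (lambda: 2, []),
    "b": (lambda: 3, []),
    "sum": (lambda x, y: x + y, ["a", "b"]),
    "square": (lambda s: s * s, ["sum"]),
}
results, order = run_dataflow(example_tasks)
print(results["square"], order)
```

In this sketch "a" and "b" have no dependencies and could run concurrently; a distributed engine would hand such ready tasks to different workers instead of running them from a single queue.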