Publications
J. M. Wozniak, T. G. Armstrong, M. Wilde, D. S. Katz, E. Lusk, I. T. Foster, "Scalable Data Flow Programming for Many-Task Applications," Preprint ANL/MCS-P4007-1212, December 2012. [pdf]
Many important application classes that are driving the requirements for extreme-scale systems (branch and bound, stochastic programming, materials by design, uncertainty quantification) can be productively expressed as many-task data flow programs. The data flow programming model of the Swift parallel scripting language [6] can elegantly express, through implicit parallelism, the massive concurrency demanded by these applications while retaining the productivity benefits of a high-level language.
However, the centralized single-node evaluation model of the previously developed Swift implementation limits scalability. Overcoming this important limitation is difficult, as evidenced by the absence of any massively scalable data flow languages in current use. The primary challenge is the efficient integration of highly distributed task load balancing with global access to shared data.
