|Title||Evaluating Streaming Strategies for Event Processing across Infrastructure Clouds |
|Publication Type||Conference Paper |
|Year of Publication||2014 |
|Authors||Tudoran, R, Keahey, K, Riteau, P, Panitkin, S, Antoniu, G |
|Conference Name||CCGrid 2014 |
|Conference Location||Chicago, IL |
|Other Numbers||ANL/MCS-P5108-0314 |
|Abstract||Infrastructure clouds revolutionized the way in which we approach resource procurement by providing an easy way to lease compute and storage resources on short notice, for a short amount of time, and on a pay-as-you-go basis. This new opportunity, however, introduces new performance trade-offs. Making the right choices in leveraging different types of storage available in the cloud is particularly important for applications that depend on managing large amounts of data within and across clouds. An increasing number of such applications conform to a pattern in which data processing relies on streaming the data to a compute platform where a set of similar operations is repeatedly applied to independent chunks of data. This pattern is evident in virtual observatories such as the Ocean Observatory Initiative, in cases when new data is evaluated against existing features in geospatial computations or when experimental data is processed as a series of time events. In this paper, we propose two strategies for efficiently implementing such streaming in the cloud and evaluate them in the context of an ATLAS application processing experimental data. Our results show that choosing the right cloud configuration can improve overall application performance by as much as three times.