Invited Talk

Speaker: Andrew Chien

Title: Creating and Preserving Value in Volatile Resources


Volatile resources are surplus cloud resources not consumed by high-priority foreground (reserved/on-demand) load. Today, cloud operators provide no statistical characterization of volatile resources. We consider how such statistics could improve user value by studying Amazon’s 608 EC2 Spot Instance types. Results show that as few as two parameters, such as (average, 90th percentile), can increase user value by 30%. These results are robust over nearly four-fifths (475 of 608) of instance types.
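
As a rough illustration of the two-parameter summary mentioned above (an average and a 90th percentile), the sketch below computes both from a list of samples. The abstract does not say which quantity is summarized or how the percentile is defined, so the name summarize, the sample data (hours until revocation), and the nearest-rank percentile rule are all assumptions made for this example, not the speaker's method.

```python
def summarize(samples):
    """Return (average, 90th percentile) for a non-empty list of numbers.

    Hypothetical helper: the percentile uses the nearest-rank rule,
    chosen here only because it is simple and unambiguous.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    average = sum(ordered) / len(ordered)
    # Nearest rank: ceil(0.9 * n), computed in integer arithmetic
    # to avoid floating-point rounding at the boundary.
    rank = -(-9 * len(ordered) // 10)
    p90 = ordered[rank - 1]
    return average, p90


# Made-up data: hours until a volatile instance was revoked (assumption).
hours_until_revocation = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
average, p90 = summarize(hours_until_revocation)
print(average, p90)
```

A user holding such a pair for each instance type could, for example, prefer types whose 90th percentile comfortably exceeds the expected job length, although how the talk's results turn the statistics into user value is not specified here.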

Beyond competitive concerns, cloud operators are reluctant to share volatile resource statistics because published statistics might be construed as a service-level agreement (SLA) and thus constrain the operators' ability to serve foreground load. We show that clever resource management can allay such concerns. We study two plausible classes of foreground load changes, showing one class where such a concern is indeed valid and another where it is not. We design two online resource management algorithms that detect foreground load variation and adapt to maintain a statistical SLA. The algorithms improve not only the ability to maintain guarantees and user value but also user experience, reducing job failures by 50%. These results apply to the Stable and Transition classes of instance types, which together account for nearly all instance types (577 of 608).