Argonne National Laboratory

GloudSim: Google Trace Based Cloud Simulator with Virtual Machines

Title: GloudSim: Google Trace Based Cloud Simulator with Virtual Machines
Publication Type: Report
Year of Publication: 2014
Authors: Di, S, Cappello, F
Other Numbers: ANL/MCS-P5225-1114
Abstract: In 2011, Google released a one-month production trace with hundreds of thousands of jobs running across over 12,000 heterogeneous hosts. To perform in-depth research based on the trace, it is necessary to construct a close-to-practice simulation system. In this paper, we devise a distributed cloud simulator (or toolkit) based on virtual machines, with three important features. (1) The dynamically changing resource amounts (such as CPU rate and memory size) consumed by the reproduced jobs can be emulated as closely as possible to the real values in the trace. (2) Various types of events (e.g., kill/evict events) can be emulated precisely based on the trace. (3) Our simulation toolkit is able to emulate more complex and useful cases beyond the original trace to adapt to various research demands. We evaluate the system on a real cluster environment with 16×8=128 cores and 112 virtual machines (VMs) built with the Xen hypervisor. To the best of our knowledge, this is the first work to reproduce the Google cloud environment with a real experimental system setting and a real-world, large-scale production trace. Experiments show that our simulation system can effectively reproduce the real checkpointing/restart events in the Google trace by leveraging the Berkeley Lab Checkpoint/Restart (BLCR) tool. It can simultaneously process up to 1200 emulated Google jobs over the 112 VMs. The simulation toolkit has been released as free software under the GNU GPL v3, and it has been successfully applied to fundamental research on the optimization of checkpoint intervals for Google tasks. Copyright © 2013 John Wiley & Sons, Ltd.