K. Yoshii, K. Iskra, R. Gupta, P. Beckman, V. Vishwanath, C. Yu, S. Coghlan, "Evaluating Power Monitoring Capabilities on IBM Blue Gene/P and Blue Gene/Q," Preprint ANL/MCS-P2095-0512, May 2012. [pdf]
As we continue our quest toward exascale computing, power consumption is becoming a critical factor, along with resiliency and concurrency. Although power requirements of individual system components (e.g., processor, memory) are taken into consideration by vendors during the design phase, actual power consumption of a complete system is an insufficiently studied research area. Estimating the power consumption of a large-scale system is a nontrivial task because of the number of components involved and also because power requirements are affected by the (unpredictable) workloads. What is needed is a power monitoring infrastructure that can provide timely and accurate feedback to system developers and application writers so they can optimize the use of this precious resource.
In this paper, we first summarize our prior power-related experiences and results on Blue Gene/P. Then we outline the new power measurement capabilities of IBM Blue Gene/Q, currently the most energy-efficient platform on the Green500 list. We describe the important characteristics of these capabilities and the challenges they present. We explain how we successfully implemented our power-profiling code and demonstrated it on Argonne's early-access Blue Gene/Q system. Using the profiling code, we characterized the power consumption of primitive operations. In preparation for profiling the power consumption of real-world applications, we evaluated the accuracy of the power measurement capabilities for short-duration activities.