Leading-edge data analytics and visualization enable breakthrough science on Argonne's Blue Gene/P

April 10, 2009

Most science applications that run on large-scale systems like the IBM Blue Gene/P Intrepid at the Argonne Leadership Computing Facility (ALCF) generate huge volumes of data that represent the results of the calculations. To understand those results after the run has completed, it is essential to rapidly explore the output data and convert it to a visual representation.

Data analytics and visualization at this scale are made possible through one of the world's largest installations of NVIDIA Quadro Plex S4 external graphics processing units (GPUs). Nicknamed Eureka, this installation allows researchers to explore and visualize the data they produce with Intrepid at the ALCF. The installation provides more than 111 teraflops and more than 3.2 terabytes of RAM.

"Using the NVIDIA Quadro Plex S4 visual computing system as the base graphics building block, Eureka delivers a quantum leap in visual compute density, enabling breakthrough levels of productivity and capability in visualization and data analysis," notes Craig Dunwoody, CEO of GraphStream, Inc. (Belmont, Calif.), the supplier of scalable computer systems that provided Eureka.

The cost-effective approach places four very high-end graphics cards in a 1U "pizza box." This very dense configuration handles all the power and cooling issues associated with the graphics cards. An alternative configuration using 4U servers with two cards each would take 10.5 racks to hold the number of graphics cards that the Eureka approach provides in just four racks.

The base server building block is the SuperMicro 6015-UR. The S4 attaches to a server on either side of it, forming a "sandwich." To each server, it appears as if two Quadro FX5600 graphics cards are installed inside it. While there are small system disks in the servers, all of the data comes from the large storage system over the network.
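The rack-space comparison above can be checked with a rough back-of-the-envelope calculation. The sketch below assumes that each server in the "sandwich" occupies 1U, which the article does not state, and it ignores switches and other rack overhead. The function name is illustrative only.

```python
# Rack-space comparison for the two GPU packaging options described above.
# Assumption (not stated in the article): each sandwich server is 1U tall.

def rack_units_per_card(chassis_units: float, cards: int) -> float:
    """Rack units consumed per graphics card for one building block."""
    return chassis_units / cards

# Eureka-style sandwich: one 1U S4 (four cards) plus two 1U servers (assumed).
sandwich_u_per_card = rack_units_per_card(chassis_units=1 + 2 * 1, cards=4)

# Alternative from the article: one 4U server holding two cards.
server_4u_u_per_card = rack_units_per_card(chassis_units=4, cards=2)

# How much more rack space the 4U option needs for the same number of cards.
density_ratio = server_4u_u_per_card / sandwich_u_per_card

# Ratio of the rack counts quoted in the article (10.5 racks versus 4 racks).
quoted_ratio = 10.5 / 4

print(f"sandwich:  {sandwich_u_per_card:.2f} U per card")
print(f"4U server: {server_4u_u_per_card:.2f} U per card")
print(f"estimated space ratio {density_ratio:.2f}, quoted ratio {quoted_ratio:.3f}")
```

Under these assumptions the estimate lands close to the quoted 10.5-to-4 ratio, which suggests that most of the savings come from packing four cards into a single rack unit.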

Economical, low-latency modular switches represent the heart of the data-management system. The nine-switch complex supports up to 2,048 connections, each of which can simultaneously exchange data at roughly 1 billion bytes per second. The storage system offers a bank of more than 10,000 disk drives that will send and receive data from the Blue Gene/P's more than 100,000 processors. Altogether, this system can deliver nearly 80 billion bytes per second to and from disk, the equivalent of transferring the content of 100 full CDs every second!
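The figures in this paragraph can be related to one another with simple arithmetic. The sketch below uses only the numbers quoted above plus one assumption: a nominal CD capacity of 700 million bytes, which the article does not specify.

```python
# Sanity check of the data-management figures quoted above.
SWITCH_PORTS = 2_048            # connections supported by the switch complex
BYTES_PER_PORT_PER_S = 1e9      # roughly 1 billion bytes per second per connection
STORAGE_BYTES_PER_S = 80e9      # "nearly 80 billion bytes per second" to and from disk
CD_BYTES = 700e6                # assumption: nominal 700 MB CD, not from the article

# Upper bound on traffic if every connection ran at full rate at once.
switch_aggregate = SWITCH_PORTS * BYTES_PER_PORT_PER_S

# Number of CDs' worth of data the storage system could move each second.
cds_per_second = STORAGE_BYTES_PER_S / CD_BYTES

# Number of connections needed to carry the full storage bandwidth.
ports_to_saturate_storage = STORAGE_BYTES_PER_S / BYTES_PER_PORT_PER_S

print(f"aggregate switch capacity: {switch_aggregate / 1e12:.3f} trillion bytes/s")
print(f"storage rate in CDs per second: {cds_per_second:.0f}")
print(f"connections needed to carry storage traffic: {ports_to_saturate_storage:.0f}")
```

With a 700 MB CD, the result is on the order of the "100 full CDs every second" cited above. The same numbers show that the switch complex has far more aggregate capacity than the storage system alone can fill.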

Providing Visualization for DOE's INCITE Projects

The ALCF's Intrepid provides resources for the U.S. Department of Energy's (DOE) Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program, which supports computationally intensive projects from industry, scientific researchers, and research organizations.

Using software developed both at Argonne and externally, visualization experts have used Eureka to render data for DOE INCITE projects focusing on turbulent thermal transport in sodium-cooled nuclear reactor cores, cardiac rhythm disorders, and Type Ia supernovae, which are among the brightest and most powerful exploding stars in the universe.

Ensuring Safe, Clean Nuclear Energy

Researchers are carrying out large-scale numerical simulations of turbulent thermal transport in sodium-cooled nuclear reactor cores on Intrepid. These simulations will enable researchers to gain an understanding of the fundamental thermal mixing phenomena within advanced recycling reactor cores, which can lead to improved safety and economy of these pivotal designs.

The computations are based on the Nek5000 code, which simulates fluid flow, convective heat and species transport, and magnetohydrodynamics in general 2-D and 3-D domains. A singular feature is the code's ability to scale to the large processor counts that characterize petascale computing platforms. Nek5000 was recognized with the 1999 Gordon Bell Prize for algorithmic quality and sustained high performance on 4,096 processors of ASCI Red.

Researchers have simulated wire-wrapped fuel rods with 7-, 19-, and 37-pin bundles. Current computations for 217-pin bundles are the largest to date with Nek5000, involving several million spectral elements and nearly a billion gridpoints in an unstructured domain. The scale of these computations has necessitated development of a new parallel strategy for solving the coarse-grid problem that is central to the efficiency of Nek5000's multigrid solvers. The new solver employs algebraic multigrid, using Nek5000's existing communication kernels, and sustains parallel efficiencies of roughly 60% on P = 65,536 processors with only 3,700 points per processor.
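For readers unfamiliar with the term, parallel efficiency is commonly measured against a reference run on fewer processors. The sketch below uses the standard strong-scaling definition. The article does not state which baseline or timings underlie the 60% figure, so the reference processor count and both run times here are hypothetical values chosen only for illustration.

```python
def parallel_efficiency(t_ref: float, p_ref: int, t_p: float, p: int) -> float:
    """Strong-scaling efficiency of a run on p processors relative to a
    reference run on p_ref processors (1.0 means perfect scaling)."""
    return (t_ref * p_ref) / (t_p * p)

# Hypothetical timings, not from the article: a reference run on 4,096
# processors and a run of the same problem on 65,536 processors.
t_ref, p_ref = 100.0, 4_096    # seconds per step, processors (assumed)
t_p, p = 10.4, 65_536          # seconds per step (assumed), processors (quoted)

efficiency = parallel_efficiency(t_ref, p_ref, t_p, p)
ideal_time = t_ref * p_ref / p  # time per step if scaling were perfect

print(f"ideal time on {p} processors: {ideal_time:.2f} s")
print(f"parallel efficiency: {efficiency:.1%}")
```

Holding efficiency near 60% is notable when each processor holds only a few thousand points, because communication and coarse-grid costs then make up a large share of each step.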

"Eureka provides a vital link between simulation and analysis by allowing scientists to probe and interrogate their data in an interactive manner," explains Paul Fischer, an Argonne computational scientist who conducts this research. Because Eureka and Intrepid share a disk, there is no need to move data between machines. One of the primary tools used with this data is the open-source project VisIt, a production visualization and analysis tool designed to handle massive data sets such as these. VisIt is directly supported by both DOE's Advanced Fuel Cycle Initiative (AFCI) and its Scientific Discovery through Advanced Computing (SciDAC) program, among others. It has a unique contract-based system that allows it to adaptively apply optimizations and also scale on large numbers of processors. The primary visualization technique being used is volume rendering, which uses a combination of color and transparency to allow the entire three-dimensional volume to be viewed.
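The idea behind volume rendering can be illustrated with a minimal sketch of front-to-back compositing along a single viewing ray. This is a generic textbook formulation and not VisIt's implementation. The transfer function, which maps low values to faint blue and high values to more opaque red, is a hypothetical choice made for illustration.

```python
from typing import List, Tuple

RGBA = Tuple[float, float, float, float]

def transfer_function(value: float) -> RGBA:
    """Map a scalar in [0, 1] to color and opacity (hypothetical mapping).

    Low values are blue and nearly transparent; high values are red and
    more opaque.
    """
    v = min(max(value, 0.0), 1.0)
    return (v, 0.0, 1.0 - v, 0.05 + 0.45 * v)

def composite_ray(samples: List[float]) -> RGBA:
    """Front-to-back alpha compositing of scalar samples along one ray."""
    r = g = b = 0.0
    alpha = 0.0
    for s in samples:
        sr, sg, sb, sa = transfer_function(s)
        # Each sample contributes only through the transparency that remains.
        weight = (1.0 - alpha) * sa
        r += weight * sr
        g += weight * sg
        b += weight * sb
        alpha += weight
        if alpha >= 0.99:  # early termination once the pixel is nearly opaque
            break
    return (r, g, b, alpha)

# Samples ordered from the viewer into the volume.
pixel = composite_ray([0.1, 0.2, 0.9, 0.8, 0.3])
print(pixel)
```

Because every sample keeps some transparency, features deep inside the volume still contribute to the final pixel, which is what lets the whole three-dimensional field be seen at once. A full renderer repeats this for every pixel, which is the kind of work that graphics hardware can accelerate.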

Preventing Cardiac Rhythm Disorders

Catastrophic rhythm disturbances of the heart are a leading cause of death in the United States. Treatment and prevention of cardiac rhythm disorders remain difficult because the electrical signal that controls the heart's rhythm is determined by complex, multiscale biological processes. However, recent advances in experimental technologies have allowed for more detailed characterizations of normal and abnormal cardiac electrical activity.

Researchers are using DOE INCITE allocations on the ALCF's Blue Gene/P to rapidly test hypotheses for the initiation and maintenance of rhythm disorders. These large-scale computer simulations represent a promising tool to help identify the underlying electrical mechanisms for dangerous arrhythmias and determine the effects of interventions, such as drugs, that may prevent or exacerbate these arrhythmias. The results of these simulations may help elucidate mechanisms of heart rhythm disorders that pose a significant health risk to the general public. An improved understanding of these disorders will help lead to safer and better treatments for patients.

Several different visualization applications have been used to investigate this data, including one based on vl3, a parallel volume rendering library developed at Argonne and The University of Chicago. The vl3 library leverages the power of advanced graphics cards, such as those in Eureka, to accelerate the rendering process. These three-dimensional renderings are key to enabling researchers to explore and gain insight from this data.

Illuminating Scientists' Knowledge of the Universe

Researchers are studying critical aspects of Type Ia supernovae, among the brightest and most powerful exploding stars in the universe. Type Ia supernovae create many of the elements from which we are made and are important for measuring distances in the universe.

Using the FLASH code on Intrepid, they are seeking definitive answers to two questions: Is buoyancy-driven turbulent nuclear burning due primarily to large-scale or small-scale features of the flame surface? And at what physical conditions does turbulence tear apart the flame? In carrying out these two studies, they will build on their success in conducting the largest homogeneous, isotropic, weakly compressible turbulence simulation done to date.

The results of the two studies have the potential to produce a major paradigm shift in the Type Ia field. The results of the first study will eliminate one of the largest uncertainties in simulating Type Ia supernovae. The results of the second study will determine whether the transition from the flamelet burning regime to the distributed burning regime can take place in Type Ia supernovae, a question with profound implications for two of the four current Type Ia models. The results of both studies will be used to treat buoyancy-driven turbulent nuclear burning more accurately in the Flash Center's whole-star, three-dimensional simulations of Type Ia supernovae.

Eureka has been used to help analyze this data. For the first time, the group has been able to manipulate a 46-GB timestep in real time, using the VisIt tool. Before Eureka arrived at Argonne, this was not possible, given existing computational analysis resources. Being able to do this will greatly change the way scientists interact with their data. Eureka has also greatly sped up the overall analysis process when it comes to generating scientific animations of the results. A process that used to take a week of computing can now be done in just a few days.

The image and video accompanying the article represent the turbulent flow of coolant into a mock-up of the upper plenum of an advanced recycling nuclear reactor. The colors indicate the speed of the fluid, with red representing regions of high velocity and blue representing regions of low velocity.