Petaflops Application Project


Project Mission
The Petaflops Applications Group is an offshoot of the larger Argonne Petaflops project, designed to focus specifically on gauging the feasibility of porting existing large-scale scientific applications to candidate next-generation hardware/software scenarios. We seek to answer questions such as: What are the main performance bottlenecks for a given application code? How can we build and test simple models to predict performance on new architectures? How can application codes be classified in terms of their performance needs? What tools can be leveraged, and what still needs to be built? In answering these questions over a broad application space, we seek both to develop a broad understanding of the needs of real scientific applications and to establish a methodology for performing similar analyses in the future. This work complements a number of current projects, most notably the PERC project (see link below).

Meetings
Nov. 7: Introduction/Analysis of Flash: Presentation
Nov. 24: Analysis of nek5000: Presentation
Dec. 12: Analysis of Pieper code: Presentation1 (Overview) Presentation2 (Pieper analysis)
Jan. 19: Analysis of Pneo: Presentation1 (Overview) Presentation2 (Pneo analysis)
Jan. 30: Conclusion of Pneo analysis: Presentation
Feb. 6: Analysis of Columbus: Presentation Presentation

Current Project Overview
Our initial approach is to assign a significant application code to each core group member for analysis. As a first step, we are looking to characterize the codes using a variety of general metrics.

BG/L Application Performance Tests
Last updated 04/26/2005
Application Name | Domain | Equation/technique | Contact | Status
Nek5000 | General CFD | Navier-Stokes using spectral element method | Paul Fischer | Very good scaling to full rack. Single proc performance of mat-mat product is not as good as e.g. power3. Double-hummer helps little. Lots of work trying to optimize mxm. Results here
POP | Oceanography | Primitive equations on sphere -- hydrostatic, Boussinesq | Ray Loy | Some large scaling tests vs. X1 with promising results. Little single proc analysis. Results here
QMC | Nuclear Physics | Nuclear binding energy using Monte Carlo | Steve Pieper | Significant science results. Extensive performance tests. Good scalability as expected for typical problems. Larger nucleus runs now possible (using -qnosave). Single proc performance adequate modulo double-hummer (not being used). Results here
Columbus | Quantum Chemistry | Ab-initio electronic structure (many techniques) | Ron Shepard | Not yet ported. Requires Global Arrays (GA) Library
pNeo | Neuroscience | Hodgkin-Huxley Model | Mark Hereld | Many tests run using full machine. Still many optimizations to try. No clear story on performance as of yet. Results here
Flash | Astrophysics | AMR Incompressible hydro, nuclear reactions, EOS, MHD, etc. | Katherine Riley | Perfect weak scaling for pure hydro. Multigrid tests pending. Results here
mpiBLAST | Molecular Biology | Database search and match | Mike Dvorak, Carlos Sosa | Serial part ported, parallel part still pending. Carlos Sosa from IBM has taken the lead on this
QCD | Sub-nuclear physics | Lattice QCD (conjugate gradient to invert Dirac) | Don Sinclair | Very poor single proc performance reported (no use of double hummer, not clear what else). PI has put further testing on hold. Discussion here
Nimrod | Fusion | Non-ideal MHD with rotation (finite element) | Dinesh Kaushik | Full machine tests run, awaiting analysis. Results here
Gyro | Fusion | Eulerian gyrokinetic-Maxwell solver | Boyana Norris | Ported, small tests run, awaiting analysis
DL_POLY | Nano-chemistry | Molecular dynamics package with provisions for periodic boundary conditions suitable for slabs and solids | Peter Zapol | Super-linear (very good) scaling up to 32 procs. Results here
ASH | Solar Physics | Spherical harmonic anelastic solver | Juri Toomre | Port still pending
QGMG | Oceanography | Forced quasi-geostrophic turbulence in bounded domain using multigrid | Andrew Siegel | Port being prepared (pvm->mpi)
Petsc FUN3d | General CFD | Unstructured N-S solver (compressible/incompressible) | Dinesh Kaushik | Full machine tests. Results.
Amber | Computational Biology (Molecular Dynamics) | Molecular mechanical force fields | Pratul Agarwal | Compiled and ran on small proc configurations. vn mode looks promising
GTC | Plasma Physics | Gyrokinetic toroidal, particle-in-cell | Stephane Ethier | Scaled to full machine with good success. Submitted full report after workshop
NEMO_3D | Nanoelectronic Modelling | | | Still trying to build
LSMS | LSMS Electronic Structure | Interactions between electrons and atoms in magnetic materials | | Compiled and ran easily. Perfect weak scaling to 1024 procs.
MDCASK | Molecular Dynamics | Eqn of Motion with 4th-order Pred-Cor | | Compiled and ran very easily. Spent quite a lot of time exploring env settings, processor mapping, compile flags and so on.
BOB | Climate | Shallow Water Equations | | Had to hack namelist real(5,*) - processor performance is not so great, but working on it! Communications work
BGC5.0 | Biogeochemistry | | | Ported, ran, verified solution. Larger runs pending
QCD (MIT) | Particle Physics | | | Requires lower-level messaging than MPI. Came to collect info. Were able to compile most samples and interfaces
PPanel | Fluid Dynamics | Boundary Element Method | | Requires PESSL/Scalapack. Had to serialize this portion at workshop. Got working and scaled linearly
HOMME | Climate | High-order methods | Richard Loft | Compiled, not running correctly yet.
FDTD | Nanophotonics Nanoscience | Finite Difference Time Domain | | Ran on full machine. Very good scaling
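
The status notes above use terms such as "perfect weak scaling", "scaled linearly", and "super-linear scaling" without defining them. As a rough illustration only, the sketch below shows one common way such labels could be derived from wall-clock timings. The function names, the 5% tolerance, and the sample timings are assumptions made for this example; they are not measurements or tools from the BG/L tests.

```python
def strong_scaling(t_base, p_base, t_p, p):
    """Fixed total problem size: return (speedup, efficiency) of a run on
    p processors relative to a baseline run on p_base processors."""
    speedup = t_base / t_p          # how much faster than the baseline
    ideal = p / p_base              # speedup expected from perfect scaling
    return speedup, speedup / ideal


def weak_scaling_efficiency(t_base, t_p):
    """Fixed work per processor: run time should stay constant as processors
    are added, so efficiency is simply baseline time over measured time."""
    return t_base / t_p


def classify(efficiency, tol=0.05):
    """Label an efficiency value; tol is an arbitrary illustrative tolerance."""
    if efficiency > 1.0 + tol:
        return "super-linear"
    if efficiency >= 1.0 - tol:
        return "linear"
    return "sub-linear"


if __name__ == "__main__":
    # Hypothetical timings in seconds, chosen only to exercise the formulas.
    speedup, eff = strong_scaling(t_base=100.0, p_base=1, t_p=2.5, p=32)
    print(speedup, eff, classify(eff))
    print(classify(weak_scaling_efficiency(t_base=10.0, t_p=10.0)))
```

Super-linear strong scaling (efficiency above 1) is usually attributed to the per-processor working set shrinking until it fits in cache, so it does not by itself indicate an error in the measurement.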

Chronology of misc information



Links