Hierarchical and Nested Krylov Methods for Extreme-Scale Computing
|Title||Hierarchical and Nested Krylov Methods for Extreme-Scale Computing|
|Publication Type||Journal Article|
|Year of Publication||2012|
|Authors||McInnes, L. Curfman; Smith, B. F.; Zhang, H.; Mills, R. T.|
The solution of large, sparse linear systems is often a dominant phase of computation for simulations based on partial differential equations, which are ubiquitous in scientific and engineering applications. Preconditioned Krylov methods are widely used and offer many advantages for solving sparse linear systems that do not have highly convergent, geometric multigrid solvers or specialized fast solvers. However, Krylov methods encounter well-known scaling difficulties beyond 10,000 processor cores because each iteration requires at least one vector inner product, which in turn requires a global synchronization that scales poorly because of internode latency. To help overcome these difficulties, we have developed hierarchical and nested Krylov methods in the PETSc library that reduce the number of global inner products required across the entire system (where they are expensive). These methods still freely allow vector inner products across small subsets of the entire system (where they are inexpensive), or they use inner iterations that do not invoke vector inner products at all. We introduce the hierarchical FGMRES method, or h-FGMRES, and we demonstrate the impact of two-level h-FGMRES with a nonlinear preconditioner on the PFLOTRAN subsurface flow application. We also demonstrate the impact of nested BiCGStab with a linear Chebyshev preconditioner. These hierarchical and nested Krylov methods significantly reduced overall PFLOTRAN simulation time on the Cray XK6 when using 10,000 through 224,000 cores, through the combined effects of reduced global synchronization due to fewer global inner products and stronger inner hierarchical or nested preconditioners.
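The idea of trading global inner products for inner-product-free inner iterations can be illustrated with a small serial sketch. The NumPy code below is not the PETSc implementation and does not reproduce the paper's h-FGMRES or nested BiCGStab configurations: it swaps in a plain flexible GMRES outer solver with a fixed number of Chebyshev inner iterations, and counts how many outer inner products are taken. The test matrix (a 1D Laplacian), the eigenvalue interval passed to Chebyshev, and the function names `chebyshev` and `fgmres` are all assumptions made for illustration. In a distributed run, each counted inner product would correspond to one global reduction.

```python
import numpy as np

def chebyshev(A, b, lmin, lmax, iters):
    """Fixed number of Chebyshev iterations for A x = b, starting from x = 0.
    Uses only matrix-vector products and vector updates: no inner products."""
    theta = 0.5 * (lmax + lmin)   # center of the target eigenvalue interval
    delta = 0.5 * (lmax - lmin)   # half-width of the interval
    sigma = theta / delta
    rho = 1.0 / sigma
    x = np.zeros_like(b)
    r = b.copy()
    d = r / theta
    for _ in range(iters):
        x = x + d
        r = r - A @ d
        rho_new = 1.0 / (2.0 * sigma - rho)
        d = rho_new * rho * d + (2.0 * rho_new / delta) * r
        rho = rho_new
    return x

def fgmres(A, b, precond, tol=1e-8):
    """Unrestarted flexible GMRES from x0 = 0 with modified Gram-Schmidt.
    Returns (x, iterations, number of length-n inner products)."""
    n = b.size
    dots = 0
    beta = np.sqrt(b @ b); dots += 1          # would be a global reduction
    V = np.zeros((n + 1, n))                  # orthonormal basis vectors
    Z = np.zeros((n, n))                      # preconditioned directions
    H = np.zeros((n + 1, n))                  # small Hessenberg matrix
    V[0] = b / beta
    for j in range(n):
        Z[j] = precond(V[j])                  # inner solve (may vary per step)
        w = A @ Z[j]
        for i in range(j + 1):
            H[i, j] = w @ V[i]; dots += 1     # would be a global reduction
            w = w - H[i, j] * V[i]
        H[j + 1, j] = np.sqrt(w @ w); dots += 1
        # Small least-squares problem; local work, not a global reduction.
        e1 = np.zeros(j + 2); e1[0] = beta
        Hj = H[:j + 2, :j + 1]
        y = np.linalg.lstsq(Hj, e1, rcond=None)[0]
        if np.linalg.norm(Hj @ y - e1) <= tol * beta or H[j + 1, j] == 0.0:
            break
        V[j + 1] = w / H[j + 1, j]
    x = Z[:j + 1].T @ y
    return x, j + 1, dots

# Assumed test problem: 1D Laplacian, eigenvalues in (0, 4).
n = 200
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.random.default_rng(0).standard_normal(n)

# Outer solver alone (identity preconditioner).
x_u, its_u, dots_u = fgmres(A, b, lambda v: v)

# Outer solver with 8 inner Chebyshev iterations; the interval
# [0.4, 4.4] is an illustrative choice covering the upper spectrum.
inner = lambda v: chebyshev(A, v, 0.4, 4.4, 8)
x_p, its_p, dots_p = fgmres(A, b, inner)

print("no inner solve : iterations", its_u, "inner products", dots_u)
print("Chebyshev inner: iterations", its_p, "inner products", dots_p)
```

The inner Chebyshev sweeps cost extra matrix-vector products, which need only neighbor communication in a parallel setting, while the outer solver takes fewer iterations and therefore fewer inner products. Because the inner iteration count is fixed here, the preconditioner is a fixed polynomial in `A`; the flexible outer method is used only so that the inner solver could be swapped for one that varies from step to step.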