PETSc algebraic solvers run on NVIDIA GPUs using CUDA, and on AMD and Intel GPUs using OpenCL via ViennaCL. In some cases this provides an alternative high-performance, low-cost solution technique.
Quick overview of GPU usage and roadmap in PETSc
- PETSc code will include full implementations of vector and matrix operations (as well as other select operations) using each of
  - CUDA with cuBLAS and cuSPARSE - supported
  - HIP with ROCm - in development
  - SYCL with MKL - in development
  - OpenCL with ViennaCL - supported
- User code may be written in
  - CUDA - supported
  - HIP - coming soon
  - SYCL - coming soon
  - OpenCL - supported, but no examples
  - Kokkos - supported
You must use PETSc main (git branch) for GPUs; do not install the current release.
WARNING: Using GPUs effectively is difficult! You must be dedicated and willing to get into the guts of GPU usage if you are serious about using GPUs.
- Installing PETSc to use NVIDIA GPUs (CUDA)
- Installing PETSc to use GPUs independent of the vendor (OpenCL)
- Very out-dated document on how the GPU solvers are implemented in PETSc
- Example that uses CUDA directly in the user function evaluation
- Presentation on some aspects of GPU usage from PETSc
Quick summary of usage with CUDA:
- VECCUDA may be used with VecSetType() or -vec_type seqcuda, mpicuda, or cuda when
- MATAIJCUSPARSE may be used with MatSetType() or -mat_type seqaijcusparse, mpiaijcusparse, or aijcusparse when
- If you are creating the vectors and matrices with a DM, you can use -dm_vec_type cuda and -dm_mat_type aijcusparse
- The VecType
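As a rough illustration of the command-line route summarized above, the CUDA options can be given directly on the command line or through the PETSC_OPTIONS environment variable, which PETSc reads during PetscInitialize(). The sketch below assumes a hypothetical executable ./app (not part of PETSc) that takes its vector and matrix types from the options database, for example by calling VecSetFromOptions() and MatSetFromOptions().

```shell
# Options taken from the CUDA summary above, supplied through the
# PETSC_OPTIONS environment variable.
export PETSC_OPTIONS="-vec_type cuda -mat_type aijcusparse"

# ./app is a hypothetical executable; uncomment to run it with these options.
# ./app

# Equivalent direct form, again with the hypothetical ./app:
# ./app -vec_type cuda -mat_type aijcusparse
```

When the vectors and matrices come from a DM, the same pattern would use -dm_vec_type cuda and -dm_mat_type aijcusparse instead.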
Quick summary of usage with OpenCL (provided by the ViennaCL library):
- VECVIENNACL may be used with VecSetType() or -vec_type seqviennacl, mpiviennacl, or viennacl when
- MATAIJVIENNACL may be used with MatSetType() or -mat_type seqaijviennacl, mpiaijviennacl, or aijviennacl when
- If you are creating the vectors and matrices with a DM, you can use -dm_vec_type viennacl and -dm_mat_type aijviennacl
- The VecType
- Develop your code with the default vectors, then switch to the GPU for production runs using the command line options, since debugging on GPUs is difficult.
- All of the Krylov methods except KSPIBCGS run on the GPU.
- Parts of most preconditioners run directly on the GPU. After setup, PCGAMG runs fully on GPUs, without any memory copies between the CPU and GPU.
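The advice to develop with the default types and switch to the GPU only for production runs might look like the following sketch. It assumes a hypothetical executable ./app (not part of PETSc) that builds its vectors and matrices from a DM and honors the options database; the ViennaCL option names are the ones listed above.

```shell
# Development run: no type options are set, so PETSc uses its default
# (CPU) vector and matrix types, which are easier to debug.
unset PETSC_OPTIONS
# ./app            # hypothetical executable

# Production run: same executable, GPU types selected at run time
# through the options database (ViennaCL/OpenCL back end).
export PETSC_OPTIONS="-dm_vec_type viennacl -dm_mat_type aijviennacl"
# ./app            # hypothetical executable
```

Because the types are chosen at run time, the source code does not need to change between the two runs.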
Some GPU systems (for example many laptops) only run with single precision; thus, PETSc must be built with the ./configure option --with-precision=single.
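A single-precision build could be configured along these lines. This is only a sketch: --with-precision=single comes from the text above, while --with-cuda is an assumption about the target machine and should be replaced by whatever GPU options apply to your system. The command is printed rather than executed so the snippet is safe to try outside a PETSc source tree.

```shell
# --with-precision=single is the option named above;
# --with-cuda is an assumed, machine-specific addition.
CONFIGURE_OPTS="--with-cuda --with-precision=single"

# Dry run: print the command. Drop the echo and run it from the top of
# the PETSc source tree to actually configure.
echo "./configure $CONFIGURE_OPTS"
```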