The Tenth International Workshop on
Accelerators and Hybrid Exascale Systems (AsHES)
To be held in conjunction with the 34th IEEE International Parallel and Distributed Processing Symposium in New Orleans, Louisiana, USA (May 18th, 2020)
AsHES 2020 Virtual Presentation
Keynote (5:30 pm CDT)
Multi-Hetero Accelerated Supercomputing: System, Programming and Applications
Taisuke Boku, Center for Computational Sciences, University of Tsukuba
Abstract:
In the exascale era, one of the most important and difficult problems is
how to improve sustained performance within a limited power
budget. Traditional multi- or many-core general-purpose CPUs remain popular
because they make programming and porting general applications easy. However,
they are approaching their limits in semiconductor technology,
memory capacity per core, network bandwidth, etc.
The GPU represents the attached-accelerator solution in heterogeneous
computing thanks to its high ratio of peak performance to power
consumption and, moreover, to recent progress in AI applications, for
example on TensorFlow-ready NVIDIA GPUs. However, the GPU's extremely high
performance comes from wide data-parallel computation
at both the instruction level and the core/thread level, which requires
thousands of SIMD operations to execute simultaneously. Many success
stories of GPU acceleration depend on simple parallel execution
and a quite low rate of exception handling (if statements). Another
problem is the interconnection network, which relies on a high-performance
network attached to the CPU, such as InfiniBand.
Our research team has been focusing on FPGA computation, which
is one of the hot topics among new types of accelerators for HPC. However,
it is quite difficult to achieve performance comparable to that of a GPU,
especially for SIMD-style applications. We therefore envision a new
generation of accelerated computing supported by a
multi-heterogeneous accelerator platform that combines several types of
accelerators on a computation node. The first target is a combination of GPU
and FPGA to provide a 360-degree solution with SIMD and pipelined
parallelism, depending on the characteristics of each computation part
of a large application. In this talk, I will introduce the current
status of our Multi-Hetero Accelerated System running at the University of
Tsukuba, its hardware and software development, and a real application
with a preliminary performance evaluation.
Bio:
Taisuke Boku received his Master's and PhD degrees from the Department
of Electrical Engineering at Keio University. After serving as an
assistant professor in the Department of Physics at Keio University, he
joined the Center for Computational Sciences (formerly the Center for
Computational Physics) at the University of Tsukuba, where he is currently
the director and the HPC division leader.
He has been working there for more than 25
years on HPC system architecture, system software, and performance
evaluation of various scientific applications. Over these years, he has
played a central role in the system development of CP-PACS (ranked
number one in the TOP500 in 1996), FIRST, PACS-CS and HA-PACS,
representative supercomputers in Japan. He is currently the Director
of the Center for Computational Sciences, University of Tsukuba, and the
Vice Director of JCAHPC (Joint Center for Advanced HPC), a
joint organization of the University of Tsukuba and the University of
Tokyo that operates the largest KNL-based cluster in Japan, Oakforest-PACS
(25 PFLOPS peak performance). He is also the Chair of the HPCI Resource
Management and Service Committee for all supercomputer resource
utilization under the MEXT HPCI program. He is a member of the system
architecture working group for Post-K Computer development. He received
the ACM Gordon Bell Prize in 2011.
Opening Statement
1:25 pm - 1:35 pm CDT
Min Si, Argonne National Laboratory
Session One: GPU computing
1:35 pm - 3:15 pm CDT
Session Chair: Simon Garcia de Gonzalo, Barcelona Supercomputing Center
- Towards automated kernel selection in machine learning systems: A SYCL case study
  John Lawson
- Unified data movement for offloading Charm++ applications
  Matthias Diener, Laxmikant Kale
- Population Count on Intel CPU, GPU, and FPGA
  Zheming Jin, Hal Finkel
- SPHYNX: Spectral Partitioning for HYbrid aNd aXelerator-enabled systems
  Seher Acer, Erik G. Boman, Sivasankaran Rajamanickam
Break
3:15 pm - 3:50 pm CDT
Session Two: FPGAs
3:50 pm - 5:30 pm CDT
Session Chair: Lena Oden, FernUniversität in Hagen
- Understanding the Performance of Elementary Numerical Linear Algebra Kernels in FPGAs
  Federico Favaro, Juan Oliver, Ernesto Dufrechou, Pablo Ezzatti
- Scalability of Sparse Matrix Dense Vector Multiply (SpMV) on a Migrating Thread Architecture
  Brian A. Page, Peter M. Kogge
- In-depth Optimization with the OpenACC-to-FPGA Framework on an Arria X FPGA
  Jacob Lambert, Seyong Lee, Jeffrey Vetter, Allen Malony
- Performance Evaluation of Pipelined Communication Combined with Computation in OpenCL Programming on FPGA
  Norihisa Fujita, Ryohei Kobayashi, Yoshiki Yamaguchi, Tomohiro Ueno, Kentaro Sano, Taisuke Boku
Copyright © ASHES. All rights reserved.