TY - JOUR
T1 - GHOST: Building Blocks for High Performance Sparse Linear Algebra on Heterogeneous Systems
AU - Kreutzer, Moritz
AU - Thies, Jonas
AU - Röhrig-Zöllner, Melven
AU - Pieper, Andreas
AU - Shahzad, Faisal
AU - Galgon, Martin
AU - Basermann, Achim
AU - Fehske, Holger
AU - Hager, Georg
AU - Wellein, Gerhard
PY - 2017/10/1
Y1 - 2017/10/1
N2 - While many of the architectural details of future exascale-class high performance computer systems are still a matter of intense research, there appears to be a general consensus that they will be strongly heterogeneous, featuring “standard” as well as “accelerated” resources. Today, such resources are available as multicore processors, graphics processing units (GPUs), and other accelerators such as the Intel Xeon Phi. Any software infrastructure that claims usefulness for such environments must be able to meet their inherent challenges: massive multi-level parallelism, topology, asynchronicity, and abstraction. The “General, Hybrid, and Optimized Sparse Toolkit” (GHOST) is a collection of building blocks that targets algorithms dealing with sparse matrix representations on current and future large-scale systems. It implements the “MPI+X” paradigm, has a pure C interface, and provides hybrid-parallel numerical kernels, intelligent resource management, and truly heterogeneous parallelism for multicore CPUs, Nvidia GPUs, and the Intel Xeon Phi. We describe the details of its design with respect to the challenges posed by modern heterogeneous supercomputers and recent algorithmic developments. Implementation details which are indispensable for achieving high efficiency are pointed out and their necessity is justified by performance measurements or predictions based on performance models. We also provide instructions on how to make use of GHOST in existing software packages, together with a case study which demonstrates the applicability and performance of GHOST as a component within a larger software stack. The library code and several applications are available as open source.
KW - Data parallelism
KW - Heterogeneous computing
KW - Large scale computing
KW - Software library
KW - Sparse linear algebra
KW - Task parallelism
UR - http://www.scopus.com/inward/record.url?scp=84989170260&partnerID=8YFLogxK
U2 - 10.1007/s10766-016-0464-z
DO - 10.1007/s10766-016-0464-z
M3 - Article
AN - SCOPUS:84989170260
SN - 0885-7458
VL - 45
SP - 1046
EP - 1072
JO - International Journal of Parallel Programming
JF - International Journal of Parallel Programming
IS - 5
ER -