Increasing the performance of the Jacobi-Davidson method by blocking

Melven Röhrig-Zöllner; Jonas Thies; Moritz Kreutzer; Andreas Alvermann; Andreas Pieper; Achim Basermann; Georg Hager; Gerhard Wellein; Holger Fehske

doi:10.1137/140976017

Research output: Contribution to journal › Article › Scientific › peer-review

23 Citations (Scopus)

Abstract

Block variants of the Jacobi-Davidson method for computing a few eigenpairs of a large sparse matrix are known to improve the robustness of the standard algorithm when it comes to computing multiple or clustered eigenvalues. In practice, however, they are typically avoided because the total number of matrix-vector operations increases. In this paper we present the implementation of a block Jacobi-Davidson solver. By detailed performance engineering and numerical experiments we demonstrate that the increase in operations is typically more than compensated by performance gains through better cache usage on modern CPUs, resulting in a method that is both more efficient and robust than its single vector counterpart. The steps to be taken to achieve a block speedup involve both kernel optimizations for sparse matrix and block vector operations, and algorithmic choices to allow using blocked operations in most parts of the computation. We discuss the aspect of avoiding synchronization in the algorithm and show by numerical experiments with our hybrid parallel implementation that a significant speedup through blocking can be achieved for a variety of matrices on up to 5 120 CPU cores as long as at least about 20 eigenpairs are sought.

Original language	English
Pages (from-to)	C697-C722
Journal	SIAM Journal on Scientific Computing
Volume	37
Issue number	6
DOIs	https://doi.org/10.1137/140976017
Publication status	Published - 2015
Externally published	Yes

Keywords

Block methods
High performance computing
Hybrid parallel implementation
Jacobi-Davidson
Multicore processors
Performance engineering
Sparse eigenvalue problems

Cite this

@article{4d453a1a1376424ca3032c75cef2c52e,
  title = "Increasing the performance of the Jacobi-Davidson method by blocking",
  abstract = "Block variants of the Jacobi-Davidson method for computing a few eigenpairs of a large sparse matrix are known to improve the robustness of the standard algorithm when it comes to computing multiple or clustered eigenvalues. In practice, however, they are typically avoided because the total number of matrix-vector operations increases. In this paper we present the implementation of a block Jacobi-Davidson solver. By detailed performance engineering and numerical experiments we demonstrate that the increase in operations is typically more than compensated by performance gains through better cache usage on modern CPUs, resulting in a method that is both more efficient and robust than its single vector counterpart. The steps to be taken to achieve a block speedup involve both kernel optimizations for sparse matrix and block vector operations, and algorithmic choices to allow using blocked operations in most parts of the computation. We discuss the aspect of avoiding synchronization in the algorithm and show by numerical experiments with our hybrid parallel implementation that a significant speedup through blocking can be achieved for a variety of matrices on up to 5 120 CPU cores as long as at least about 20 eigenpairs are sought.",
  keywords = "Block methods, High performance computing, Hybrid parallel implementation, Jacobi-Davidson, Multicore processors, Performance engineering, Sparse eigenvalue problems",
  author = "Melven R{\"o}hrig-Z{\"o}llner and Jonas Thies and Moritz Kreutzer and Andreas Alvermann and Andreas Pieper and Achim Basermann and Georg Hager and Gerhard Wellein and Holger Fehske",
  year = "2015",
  doi = "10.1137/140976017",
  language = "English",
  volume = "37",
  pages = "C697--C722",
  journal = "SIAM Journal on Scientific Computing",
  issn = "1064-8275",
  publisher = "Society for Industrial and Applied Mathematics",
  number = "6",
}

TY - JOUR

T1 - Increasing the performance of the Jacobi-Davidson method by blocking

AU - Röhrig-Zöllner, Melven

AU - Thies, Jonas

AU - Kreutzer, Moritz

AU - Alvermann, Andreas

AU - Pieper, Andreas

AU - Basermann, Achim

AU - Hager, Georg

AU - Wellein, Gerhard

AU - Fehske, Holger

PY - 2015

Y1 - 2015

N2 - Block variants of the Jacobi-Davidson method for computing a few eigenpairs of a large sparse matrix are known to improve the robustness of the standard algorithm when it comes to computing multiple or clustered eigenvalues. In practice, however, they are typically avoided because the total number of matrix-vector operations increases. In this paper we present the implementation of a block Jacobi-Davidson solver. By detailed performance engineering and numerical experiments we demonstrate that the increase in operations is typically more than compensated by performance gains through better cache usage on modern CPUs, resulting in a method that is both more efficient and robust than its single vector counterpart. The steps to be taken to achieve a block speedup involve both kernel optimizations for sparse matrix and block vector operations, and algorithmic choices to allow using blocked operations in most parts of the computation. We discuss the aspect of avoiding synchronization in the algorithm and show by numerical experiments with our hybrid parallel implementation that a significant speedup through blocking can be achieved for a variety of matrices on up to 5 120 CPU cores as long as at least about 20 eigenpairs are sought.

AB - Block variants of the Jacobi-Davidson method for computing a few eigenpairs of a large sparse matrix are known to improve the robustness of the standard algorithm when it comes to computing multiple or clustered eigenvalues. In practice, however, they are typically avoided because the total number of matrix-vector operations increases. In this paper we present the implementation of a block Jacobi-Davidson solver. By detailed performance engineering and numerical experiments we demonstrate that the increase in operations is typically more than compensated by performance gains through better cache usage on modern CPUs, resulting in a method that is both more efficient and robust than its single vector counterpart. The steps to be taken to achieve a block speedup involve both kernel optimizations for sparse matrix and block vector operations, and algorithmic choices to allow using blocked operations in most parts of the computation. We discuss the aspect of avoiding synchronization in the algorithm and show by numerical experiments with our hybrid parallel implementation that a significant speedup through blocking can be achieved for a variety of matrices on up to 5 120 CPU cores as long as at least about 20 eigenpairs are sought.

KW - Block methods

KW - High performance computing

KW - Hybrid parallel implementation

KW - Jacobi-Davidson

KW - Multicore processors

KW - Performance engineering

KW - Sparse eigenvalue problems

UR - http://www.scopus.com/inward/record.url?scp=84953304291&partnerID=8YFLogxK

U2 - 10.1137/140976017

DO - 10.1137/140976017

M3 - Article

AN - SCOPUS:84953304291

SN - 1064-8275

VL - 37

SP - C697

EP - C722

JO - SIAM Journal on Scientific Computing

JF - SIAM Journal on Scientific Computing

IS - 6

ER -
