Abstract
A Computational Fluid Dynamics (CFD) code for steady simulations solves a set of non-linear partial differential equations using an iterative time-stepping process, which can follow either an explicit or an implicit scheme. On the CPU, the differences between the two time-stepping methods with respect to stability and performance have been well covered in the literature. However, this comparison has not been extended to modern high-performance computing systems such as Graphics Processing Units (GPUs). In this work, we first present an implementation of the two time-stepping methods on the GPU, highlighting the different challenges each poses for the programming approach. Then we introduce a classification of basic CFD operations, based on the degree of parallelism they expose, and study the potential of GPU acceleration for every class. The classification provides local speedups of basic operations, which are finally used to compare the performance of both methods on the GPU. The aim of this work is to enable an informed decision on the most efficient combination of hardware and method when facing a new application. Our findings show that the choice between explicit and implicit time integration depends mainly on the convergence of explicit solvers and the efficiency of preconditioners on the GPU.
Original language | English |
---|---|
Pages (from-to) | 201-217 |
Number of pages | 7 |
Journal | Computers & Mathematics with Applications |
Volume | 74 |
DOIs | |
Publication status | Published - 17 Mar 2017 |
Bibliographical note
Accepted Author Manuscript
Keywords
- Time integration
- GPU
- CFD
- GMRES
- Preconditioning
- ILU