Abstract
This paper presents the GPU port of the Dutch Atmospheric Large-Eddy Simulation (DALES) application, a high-resolution atmospheric model, using OpenACC directives. The code is written in Fortran 90 and supports distributed parallel execution through spatial domain decomposition. We assess the performance of the GPU offloading by comparing the time-to-solution on regular and accelerated HPC nodes. We conduct a weak scaling analysis and discuss portability across NVIDIA A100 and H100 hardware. Finally, we show how targeted kernels can benefit from further optimization with Kernel Tuner, a GPU kernel auto-tuning package.
| Original language | English |
|---|---|
| Title of host publication | International Symposium on Parallel and Distributed Processing (IPDPS) |
| Publisher | IEEE |
| Pages | 678-688 |
| Number of pages | 11 |
| ISBN (Electronic) | 979-8-3315-3237-6 |
| ISBN (Print) | 979-8-3315-3238-3 |
| Publication status | Published - 2025 |
| Event | 39th IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2025, Politecnico di Milano, Milan, Italy. Duration: 3 Jun 2025 → 7 Jun 2025. https://www.ipdps.org/ipdps2025/index.html |
Conference
| Conference | 39th IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2025 |
|---|---|
| Abbreviated title | IPDPS 2025 |
| Country/Territory | Italy |
| City | Milan |
| Period | 3 Jun 2025 → 7 Jun 2025 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository as part of the Taverne amendment. More information about this copyright law amendment can be found at https://www.openaccess.nl. Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.