TY - GEN
T1 - Oikonomos
T2 - 34th IEEE International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2023
AU - Betting, Jan Harm
AU - Liakopoulos, Dimitrios
AU - Engelen, Max
AU - Strydis, Christos
PY - 2023
Y1 - 2023
N2 - The cloud has become a powerful environment for deploying High-Performance Computing (HPC) applications. However, the size and heterogeneity of cloud-hardware offerings pose a challenge in selecting the optimal cloud instance type. Users often lack the knowledge or time necessary to make an optimal choice. In this work, we propose Oikonomos, a data-driven, opportunistic, resource-recommendation system for HPC applications in the cloud. Oikonomos trains a Multi-layer Perceptron (MLP) to predict the performance of a given HPC application for different input parameters and instance types. It then calculates the cost of executing the application on different instance types and proposes the one that best fits the user's needs. We deployed Oikonomos on a diverse mix of HPC workloads and found that it approached an optimal policy for all applications. The optimal instance type was chosen in 90% of the cases for seven out of eight applications, with a Mean Absolute Percentage Error (MAPE) consistently below 20%. This demonstrated that Oikonomos can provide a practical, general-purpose, resource-recommendation system for cloud HPC.
AB - The cloud has become a powerful environment for deploying High-Performance Computing (HPC) applications. However, the size and heterogeneity of cloud-hardware offerings pose a challenge in selecting the optimal cloud instance type. Users often lack the knowledge or time necessary to make an optimal choice. In this work, we propose Oikonomos, a data-driven, opportunistic, resource-recommendation system for HPC applications in the cloud. Oikonomos trains a Multi-layer Perceptron (MLP) to predict the performance of a given HPC application for different input parameters and instance types. It then calculates the cost of executing the application on different instance types and proposes the one that best fits the user's needs. We deployed Oikonomos on a diverse mix of HPC workloads and found that it approached an optimal policy for all applications. The optimal instance type was chosen in 90% of the cases for seven out of eight applications, with a Mean Absolute Percentage Error (MAPE) consistently below 20%. This demonstrated that Oikonomos can provide a practical, general-purpose, resource-recommendation system for cloud HPC.
KW - cloud computing
KW - deep learning
KW - heterogeneity
KW - high-performance computing
KW - resource recommendation
UR - http://www.scopus.com/inward/record.url?scp=85174824706&partnerID=8YFLogxK
U2 - 10.1109/ASAP57973.2023.00039
DO - 10.1109/ASAP57973.2023.00039
M3 - Conference contribution
AN - SCOPUS:85174824706
T3 - Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors
SP - 188
EP - 196
BT - Proceedings - 2023 IEEE 34th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2023
PB - IEEE
Y2 - 19 July 2023 through 21 July 2023
ER -