Abstract
This research investigates the relative performance of five machine learning algorithms (artificial neural network, support vector machine, Gaussian process regression, random forest, and XGBoost) for predicting undrained shear strength from cone penetration test data. The aim is to assess how machine learning could help reduce the need for laboratory test data. The training dataset comprises 526 data points from 12 regions, and the testing dataset consists of 20 data points from a polder close to Leiden in the Netherlands. Both k-fold and group k-fold cross-validation strategies are applied to validate the models. The poor performance of the models under group k-fold cross-validation suggests that, while machine learning techniques can perform well when site-specific data are included during training, they struggle to generalize without such data. This highlights the difficulty of capturing soil heterogeneity and suggests that either machine learning models should be trained for specific sites for which some data are already available, or much larger training datasets are needed.
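The difference between the two validation strategies can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`k_fold`, `group_k_fold`, `shared_groups`), the fold count of 4, and the toy data of 12 sites with 5 samples each are assumptions made for illustration only. Plain k-fold shuffles individual samples, so a site usually contributes to both training and test folds; group k-fold holds out whole sites, which mimics prediction at a site with no local data.

```python
import random
from collections import defaultdict


def k_fold(n_samples, k, seed=0):
    """Yield (train_idx, test_idx); samples are shuffled individually."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
        yield train, test


def group_k_fold(groups, k):
    """Yield (train_idx, test_idx) so that no group is split across the two."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    # Greedy balancing: place the largest groups first into the smallest fold.
    folds = [[] for _ in range(k)]
    for g in sorted(by_group, key=lambda g: (-len(by_group[g]), g)):
        min(folds, key=len).extend(by_group[g])
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
        yield train, test


def shared_groups(groups, train, test):
    """Return the groups (sites) present in both the train and test indices."""
    return {groups[i] for i in train} & {groups[i] for i in test}


# Hypothetical toy data: 12 sites with 5 samples each (not the paper's dataset).
groups = [f"site_{s}" for s in range(12) for _ in range(5)]

for name, splits in [
    ("k-fold", k_fold(len(groups), 4)),
    ("group k-fold", group_k_fold(groups, 4)),
]:
    leaks = [len(shared_groups(groups, tr, te)) for tr, te in splits]
    print(name, "- sites shared between train and test, per fold:", leaks)
```

Under group k-fold the shared-site count is zero for every fold by construction, so the validation score reflects generalization to unseen sites rather than interpolation within known ones.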
Original language | English |
---|---|
Number of pages | 8 |
Publication status | Published - 2023 |
Event | 14th International Conference on Applications of Statistics and Probability in Civil Engineering 2023, Trinity College Dublin, Dublin, Ireland; Duration: 9 Jul 2023 → 13 Jul 2023; https://icasp14.com/ |
Conference
Conference | 14th International Conference on Applications of Statistics and Probability in Civil Engineering 2023 |
---|---|
Abbreviated title | ICASP14 |
Country/Territory | Ireland |
City | Dublin |
Period | 9/07/23 → 13/07/23 |