TY - JOUR
T1 - Hybrid XGboost model with various Bayesian hyperparameter optimization algorithms for flood hazard susceptibility modeling
AU - Janizadeh, Saeid
AU - Vafakhah, Mehdi
AU - Kapelan, Zoran
AU - Mobarghaee Dinan, Naghmeh
PY - 2021
Y1 - 2021
N2 - The purpose of this investigation is to develop an optimal model for flood susceptibility mapping in the Kan watershed, Tehran, Iran. To this end, three Bayesian hyperparameter optimization algorithms, namely upper confidence bound (UCB), probability of improvement (POI) and expected improvement (EI), were used to optimize the Extreme Gradient Boosting (XGB) machine learning model, and an extremely randomized trees (ERT) model was also used for modeling flood hazard. For flood susceptibility mapping, 118 historic flood locations were identified and analyzed using 17 geo-environmental explanatory variables. The flood location data were divided into 70% for training and 30% for testing the developed models. Receiver operating characteristic (ROC) curve parameters were used to evaluate model performance. Based on the area under the curve (AUC) in the testing stage, the ERT and XGB models achieved efficiencies of 91.37% and 91.95%, respectively. The Bayesian hyperparameter optimization methods increased the efficiency of the XGB model: the AUC values of the EI-XGB, POI-XGB and UCB-XGB models in the testing stage were 95.89%, 96.87% and 96.38%, respectively. The relative importance results of the five models show that elevation and distance from the river are more significant than the other variables in predicting flood hazard in the Kan watershed.
AB - The purpose of this investigation is to develop an optimal model for flood susceptibility mapping in the Kan watershed, Tehran, Iran. To this end, three Bayesian hyperparameter optimization algorithms, namely upper confidence bound (UCB), probability of improvement (POI) and expected improvement (EI), were used to optimize the Extreme Gradient Boosting (XGB) machine learning model, and an extremely randomized trees (ERT) model was also used for modeling flood hazard. For flood susceptibility mapping, 118 historic flood locations were identified and analyzed using 17 geo-environmental explanatory variables. The flood location data were divided into 70% for training and 30% for testing the developed models. Receiver operating characteristic (ROC) curve parameters were used to evaluate model performance. Based on the area under the curve (AUC) in the testing stage, the ERT and XGB models achieved efficiencies of 91.37% and 91.95%, respectively. The Bayesian hyperparameter optimization methods increased the efficiency of the XGB model: the AUC values of the EI-XGB, POI-XGB and UCB-XGB models in the testing stage were 95.89%, 96.87% and 96.38%, respectively. The relative importance results of the five models show that elevation and distance from the river are more significant than the other variables in predicting flood hazard in the Kan watershed.
KW - Bayesian hyperparameter algorithms
KW - Extreme Gradient Boosting
KW - Kan watershed
KW - Flood hazard
UR - http://www.scopus.com/inward/record.url?scp=85118337520&partnerID=8YFLogxK
U2 - 10.1080/10106049.2021.1996641
DO - 10.1080/10106049.2021.1996641
M3 - Article
AN - SCOPUS:85118337520
SN - 1010-6049
VL - 37 (2022)
SP - 8273
EP - 8292
JO - Geocarto International
JF - Geocarto International
IS - 25
ER -