Bayesian Networks and Data Driven Models for Estimating Extreme River Discharges Case Study: Magdalena-Cauca Basin, Colombia

Ahmed Nasr; Leonardo Alfonso; Oswaldo Morales Napoles

Abstract

Floods are among the most threatening natural hazards and the leading cause of deaths due to natural disasters worldwide. The quantification of hazards related to rising water levels and the consequent flood extent requires an adequate estimation of extreme discharges. Traditionally, extreme river discharges are obtained by univariate frequency analysis of peak discharges taken either from discharge observations or from hydrological models. However, both options require a substantial amount of historical data. The Magdalena-Cauca river basin in Colombia, like many basins in the developing world, has insufficient records of discharge observations at several locations. Therefore, the quantification of hazards becomes a challenging problem, particularly in completely ungauged areas. To overcome this problem, we propose and use different computationally inexpensive methods to estimate extreme discharges indirectly from different, easy-to-obtain data sources. One of them, a stochastic model based on Bayesian Networks (BNs) that was developed for data-rich basins and previously applied in Europe and the contiguous USA, is used in this study. A Bayesian Network is a directed acyclic graph that consists of nodes representing random variables and arcs representing the dependence structure between these variables. To validate the performance of the Bayesian Network model in the Magdalena-Cauca basin, two deterministic data-driven models that had not previously been explored for this purpose, based on lazy learning (k-nearest neighbours) and eager learning (M5 Model Trees), are proposed and used. In addition, the BN model performance is evaluated against regional frequency analysis. All the models use the same input data, related to topography, land-use characteristics and local climate.
In particular, the seven inputs are: 1) the steepness of the catchment; 2) the catchment area; 3) the annual maximum of daily precipitation and snowmelt; 4) the maximum runoff coefficient of the catchment; and the percentage of the catchment area covered by 5) marshes, 6) lakes and 7) built-up areas. All the models establish relations between these inputs and the annual maximum of daily river discharge, so that the latter can be predicted for different inputs. Results show that all the models generally perform well in estimating annual peak discharges; they perform well in large sub-catchments and tend to underperform in small ones. Univariate frequency analysis applied to the peak discharges obtained by all the models shows that the BN model outperforms the other models in estimating extreme discharges with given exceedance probabilities. In this context, the BN model is used to evaluate the effect of climate change projection scenarios (RCP4.5 and RCP8.5) on extreme river discharges. Interestingly, the results contradict a recent study based on a distributed hydrological model built with free global products.
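
As a rough illustration of the univariate frequency analysis step mentioned in the abstract, the sketch below fits a distribution to a series of annual peak discharges and reads off the discharge for a given annual exceedance probability. The abstract does not say which distribution or fitting method the authors used; the Gumbel distribution, the method-of-moments fit, the function names and the sample values are all assumptions made for illustration only.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(annual_maxima):
    """Estimate Gumbel (location, scale) by the method of moments.

    Assumption: a Gumbel distribution is used here only as a common
    textbook choice; it is not stated in the abstract.
    """
    scale = stdev(annual_maxima) * math.sqrt(6) / math.pi
    location = mean(annual_maxima) - EULER_GAMMA * scale
    return location, scale


def discharge_for_exceedance(location, scale, exceedance_probability):
    """Discharge exceeded in any given year with the stated probability."""
    if not 0.0 < exceedance_probability < 1.0:
        raise ValueError("exceedance probability must lie strictly between 0 and 1")
    non_exceedance = 1.0 - exceedance_probability
    # Invert the Gumbel CDF F(x) = exp(-exp(-(x - location) / scale)).
    return location - scale * math.log(-math.log(non_exceedance))


if __name__ == "__main__":
    # Hypothetical annual maximum daily discharges in m^3/s (not real data).
    peaks = [820.0, 950.0, 1100.0, 780.0, 1320.0, 990.0, 1210.0, 870.0]
    location, scale = fit_gumbel_moments(peaks)
    for p in (0.1, 0.01):
        q = discharge_for_exceedance(location, scale, p)
        print(f"annual exceedance probability {p}: {q:.0f} m^3/s")
```

In the setting described above, the input series would be the annual peak discharges produced by each model (BN, k-nearest neighbours, M5 Model Trees), so that the resulting extreme discharges can be compared model by model.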

Original language	English
Article number	EGU2018-19813-2
Number of pages	1
Journal	Geophysical Research Abstracts (online)
Volume	20
Publication status	Published - 2018
Event	EGU General Assembly 2018, Vienna, Austria (8 Apr 2018 → 13 Apr 2018), https://www.egu2018.eu/

Cite this

@article{8b453e5c09324fb286d8884ccd8d5b7a,

title = "Bayesian Networks and Data Driven Models for Estimating Extreme River Discharges Case Study: Magdalena-Cauca Basin, Colombia",

author = "Ahmed Nasr and Leonardo Alfonso and {Morales Napoles}, Oswaldo",

year = "2018",

language = "English",

volume = "20",

journal = "Geophysical Research Abstracts (online)",

issn = "1607-7962",

note = "EGU General Assembly 2018, EGU 2018 ; Conference date: 08-04-2018 Through 13-04-2018",

url = "https://www.egu2018.eu/",

}

TY - JOUR

T1 - Bayesian Networks and Data Driven Models for Estimating Extreme River Discharges Case Study: Magdalena-Cauca Basin, Colombia

AU - Nasr, Ahmed

AU - Alfonso, Leonardo

AU - Morales Napoles, Oswaldo

PY - 2018

Y1 - 2018

M3 - Meeting Abstract

SN - 1607-7962

VL - 20

JO - Geophysical Research Abstracts (online)

JF - Geophysical Research Abstracts (online)

M1 - EGU2018-19813-2

T2 - EGU General Assembly 2018

Y2 - 8 April 2018 through 13 April 2018

ER -
