A new Bayesian approach for managing bathing water quality at river bathing locations vulnerable to short-term pollution

Wolfgang Seis*, Marie Claire Ten Veldhuis, Pascale Rouault, David Steffelbauer, Gertjan Medema

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Short-term fecal pollution events are a major challenge for managing microbial safety at recreational waters. Long turnaround times of current laboratory methods for analyzing fecal indicator bacteria (FIB) delay water quality assessments. Data-driven models have been shown to be valuable for enabling fast water quality assessments. However, a major barrier to the wider use of such models is the prevalent data scarcity at existing bathing waters, which calls into question the representativeness, and thus the usefulness, of such datasets for model training. The present study explores the ability of five data-driven modelling approaches to predict short-term fecal pollution episodes at recreational bathing locations under data-scarce conditions and with imbalanced datasets. The study explicitly focuses on the potential benefits of adopting an innovative modelling and risk-based assessment approach, based on state/cluster-based Bayesian updating of FIB distributions in relation to different hydrological states. The models are benchmarked against commonly applied supervised learning approaches, in particular linear regression and random forests, as well as against a zero-model that closely resembles the current way of classifying bathing water quality in the European Union. For model-based clustering we apply a non-parametric Bayesian approach based on a Dirichlet Process Mixture Model. The study tests and demonstrates the proposed approaches at three river bathing locations in Germany that are known to be influenced by short-term pollution events. At each river, two modelling experiments (“longest dry period”, “sequential model training”) are performed to explore how the different modelling approaches react and adapt to scarce and uninformative training data, i.e., datasets that do not include event pollution information in terms of elevated FIB concentrations.
We demonstrate that the proposed Bayesian approaches in particular are able to raise correct warnings in such situations (> 90 % true positive rate). The zero-model and the random forest are shown to be unable to predict contamination episodes if pollution episodes are not present in the training data. Our research shows that the investigated Bayesian approaches reduce the risk of missed pollution events, thereby improving bathing water safety management. Additionally, the approaches provide a transparent solution for setting minimum data quality requirements under various conditions. The proposed approaches open the way for developing data-driven models for bathing water quality prediction, given that data scarcity is a common problem at existing and prospective bathing waters.
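
The idea of state-based Bayesian updating can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces the Dirichlet Process Mixture Model with two fixed, hand-labelled hydrological states ("dry" and "wet"), assumes log10 FIB concentrations are normally distributed with a known observation variance, and uses invented prior values, observations, and threshold. All names (`NormalBelief`, `update`, `exceedance_probability`) are hypothetical. The sketch shows why a state with no training data can still raise a warning: its prior is left unchanged rather than being overwritten by dry-weather observations.

```python
import math
from dataclasses import dataclass


@dataclass
class NormalBelief:
    """Belief about the mean log10 FIB concentration in one hydrological state."""
    mean: float
    var: float


def update(prior: NormalBelief, observations: list[float], obs_var: float) -> NormalBelief:
    """Conjugate normal-normal update with known observation variance (assumption)."""
    n = len(observations)
    if n == 0:
        return prior  # no data for this state: the prior is kept as is
    precision = 1.0 / prior.var + n / obs_var
    mean = (prior.mean / prior.var + sum(observations) / obs_var) / precision
    return NormalBelief(mean=mean, var=1.0 / precision)


def exceedance_probability(belief: NormalBelief, obs_var: float, log10_threshold: float) -> float:
    """Posterior predictive probability that a new sample exceeds the threshold."""
    predictive_sd = math.sqrt(belief.var + obs_var)
    z = (log10_threshold - belief.mean) / predictive_sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical values, chosen only for illustration.
OBS_VAR = 0.25                      # assumed variance of log10 concentrations
LOG10_THRESHOLD = math.log10(1000)  # illustrative threshold, not a regulatory value

priors = {
    "dry": NormalBelief(mean=2.0, var=1.0),
    "wet": NormalBelief(mean=3.2, var=1.0),  # prior expectation of elevated FIB after rain
}

# Training data from a dry period only: no pollution events were observed.
training = {"dry": [1.8, 2.1, 1.9, 2.0], "wet": []}

posteriors = {state: update(priors[state], training[state], OBS_VAR) for state in priors}
risk = {
    state: exceedance_probability(posteriors[state], OBS_VAR, LOG10_THRESHOLD)
    for state in posteriors
}

for state, p in risk.items():
    print(f"{state}: P(exceedance) = {p:.3f}")
```

With these invented numbers, the dry state ends up with a low exceedance probability, while the wet state keeps a high one even though the training data contain no wet-weather samples. A model fitted only to the pooled dry-period data would have no basis for such a warning.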

Original language: English
Article number: 121186
Number of pages: 11
Journal: Water Research
Volume: 252
DOIs
Publication status: Published - 2024

Keywords

  • Dirichlet Process Mixture Model
  • Probabilistic modelling
  • Recreational waters
