Gradient boosting for extreme quantile regression

Jasper Velthoen; Clément Dombry; Juan Juan Cai; Sebastian Engelke

doi:10.1007/s10687-023-00473-x

Corresponding author: Sebastian Engelke

Statistics

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Extreme quantile regression provides estimates of conditional quantiles outside the range of the data. Classical quantile regression performs poorly in such cases since data in the tail region are too scarce. Extreme value theory is used for extrapolation beyond the range of observed values and estimation of conditional extreme quantiles. Based on the peaks-over-threshold approach, the conditional distribution above a high threshold is approximated by a generalized Pareto distribution with covariate dependent parameters. We propose a gradient boosting procedure to estimate a conditional generalized Pareto distribution by minimizing its deviance. Cross-validation is used for the choice of tuning parameters such as the number of trees and the tree depths. We discuss diagnostic plots such as variable importance and partial dependence plots, which help to interpret the fitted models. In simulation studies we show that our gradient boosting procedure outperforms classical methods from quantile regression and extreme value theory, especially for high-dimensional predictor spaces and complex parameter response surfaces. An application to statistical post-processing of weather forecasts with precipitation data in the Netherlands is proposed.
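
The two ingredients named in the abstract, the generalized Pareto deviance and the extrapolation beyond a high threshold, can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not include the boosting procedure itself. The function names `gpd_neg_log_likelihood` and `extreme_quantile` are hypothetical, the deviance is taken here to be the standard generalized Pareto negative log-likelihood, and the quantile formula is the usual peaks-over-threshold extrapolation with the threshold `u` assumed to be the conditional quantile at an intermediate level `tau0`.

```python
import math


def gpd_neg_log_likelihood(z, sigma, xi):
    """Negative log-likelihood of one exceedance z >= 0 under a
    generalized Pareto distribution with scale sigma and shape xi.
    (Assumed form of the loss; names are illustrative.)"""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if abs(xi) < 1e-12:
        # Limit xi -> 0: exponential distribution with mean sigma.
        return math.log(sigma) + z / sigma
    arg = 1.0 + xi * z / sigma
    if arg <= 0:
        # z lies outside the support for these parameters.
        return math.inf
    return math.log(sigma) + (1.0 + 1.0 / xi) * math.log(arg)


def extreme_quantile(u, sigma, xi, tau0, tau):
    """Extrapolate the quantile at level tau from the threshold u,
    assumed to be the quantile at the intermediate level tau0 <= tau."""
    if not 0.0 < tau0 <= tau < 1.0:
        raise ValueError("need 0 < tau0 <= tau < 1")
    ratio = (1.0 - tau0) / (1.0 - tau)
    if abs(xi) < 1e-12:
        # Limit xi -> 0 of the formula below.
        return u + sigma * math.log(ratio)
    return u + sigma * (ratio ** xi - 1.0) / xi


# Example with made-up parameter values for a single covariate point.
q_95 = extreme_quantile(u=10.0, sigma=2.0, xi=0.5, tau0=0.8, tau=0.95)
q_99 = extreme_quantile(u=10.0, sigma=2.0, xi=0.5, tau0=0.8, tau=0.99)
```

In a covariate-dependent model, `sigma` and `xi` would be functions of the predictors, and a boosting procedure would update them along the gradient of the summed loss over the exceedances.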

Original language	English
Pages (from-to)	639-667
Number of pages	29
Journal	Extremes
Volume	26
Issue number	4
DOIs	https://doi.org/10.1007/s10687-023-00473-x
Publication status	Published - 2023

Keywords

60G70
62G08
Extreme quantile regression
Extreme value theory
Generalized Pareto distribution
Gradient boosting
Tree-based methods

Access to Document

10.1007/s10687-023-00473-x

s10687-023-00473-x: Final published version, 4.93 MB, Licence: CC BY

Cite this

@article{5834a75db3ab41208153b31188b7fd34,
title = "Gradient boosting for extreme quantile regression",
abstract = "Extreme quantile regression provides estimates of conditional quantiles outside the range of the data. Classical quantile regression performs poorly in such cases since data in the tail region are too scarce. Extreme value theory is used for extrapolation beyond the range of observed values and estimation of conditional extreme quantiles. Based on the peaks-over-threshold approach, the conditional distribution above a high threshold is approximated by a generalized Pareto distribution with covariate dependent parameters. We propose a gradient boosting procedure to estimate a conditional generalized Pareto distribution by minimizing its deviance. Cross-validation is used for the choice of tuning parameters such as the number of trees and the tree depths. We discuss diagnostic plots such as variable importance and partial dependence plots, which help to interpret the fitted models. In simulation studies we show that our gradient boosting procedure outperforms classical methods from quantile regression and extreme value theory, especially for high-dimensional predictor spaces and complex parameter response surfaces. An application to statistical post-processing of weather forecasts with precipitation data in the Netherlands is proposed.",
keywords = "60G70, 62G08, Extreme quantile regression, Extreme value theory, Generalized Pareto distribution, Gradient boosting, Tree-based methods",
author = "Jasper Velthoen and Cl{\'e}ment Dombry and Cai, {Juan Juan} and Sebastian Engelke",
year = "2023",
doi = "10.1007/s10687-023-00473-x",
language = "English",
volume = "26",
pages = "639--667",
journal = "Extremes",
issn = "1386-1999",
publisher = "Springer",
number = "4",
}

TY  - JOUR
T1  - Gradient boosting for extreme quantile regression
AU  - Velthoen, Jasper
AU  - Dombry, Clément
AU  - Cai, Juan Juan
AU  - Engelke, Sebastian
PY  - 2023
Y1  - 2023
AB  - Extreme quantile regression provides estimates of conditional quantiles outside the range of the data. Classical quantile regression performs poorly in such cases since data in the tail region are too scarce. Extreme value theory is used for extrapolation beyond the range of observed values and estimation of conditional extreme quantiles. Based on the peaks-over-threshold approach, the conditional distribution above a high threshold is approximated by a generalized Pareto distribution with covariate dependent parameters. We propose a gradient boosting procedure to estimate a conditional generalized Pareto distribution by minimizing its deviance. Cross-validation is used for the choice of tuning parameters such as the number of trees and the tree depths. We discuss diagnostic plots such as variable importance and partial dependence plots, which help to interpret the fitted models. In simulation studies we show that our gradient boosting procedure outperforms classical methods from quantile regression and extreme value theory, especially for high-dimensional predictor spaces and complex parameter response surfaces. An application to statistical post-processing of weather forecasts with precipitation data in the Netherlands is proposed.
KW  - 60G70
KW  - 62G08
KW  - Extreme quantile regression
KW  - Extreme value theory
KW  - Generalized Pareto distribution
KW  - Gradient boosting
KW  - Tree-based methods
UR  - http://www.scopus.com/inward/record.url?scp=85165257099&partnerID=8YFLogxK
U2  - 10.1007/s10687-023-00473-x
DO  - 10.1007/s10687-023-00473-x
M3  - Article
AN  - SCOPUS:85165257099
SN  - 1386-1999
VL  - 26
SP  - 639
EP  - 667
JO  - Extremes
JF  - Extremes
IS  - 4
ER  - 
