A Survey on Scenario Theory, Complexity, and Compression-Based Learning and Generalization

Roberto Rocchetta; Alexander Mey; Frans Oliehoek

doi:10.1109/TNNLS.2023.3308828

Interactive Intelligence

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

This work investigates formal generalization error bounds that apply to support vector machines (SVMs) in realizable and agnostic learning problems. We focus on recently observed parallels between probably approximately correct (PAC)-learning bounds, such as compression and complexity-based bounds, and novel error guarantees derived within scenario theory. Scenario theory provides nonasymptotic and distributional-free error bounds for models trained by solving data-driven decision-making problems. Relevant theorems and assumptions are reviewed and discussed. We propose a numerical comparison of the tightness and effectiveness of theoretical error bounds for support vector classifiers trained on several randomized experiments from 13 real-life problems. This analysis allows for a fair comparison of different approaches from both conceptual and experimental standpoints. Based on the numerical results, we argue that the error guarantees derived from scenario theory are often tighter for realizable problems and always yield informative results, i.e., probability bounds tighter than a vacuous $[0,1]$ interval. This work promotes scenario theory as an alternative tool for model selection, structural-risk minimization, and generalization error analysis of SVMs. In this way, we hope to bring the communities of scenario and statistical learning theory closer, so that they can benefit from each other’s insights.

Original language	English
Pages (from-to)	1-15
Number of pages	15
Journal	IEEE Transactions on Neural Networks and Learning Systems
DOIs	https://doi.org/10.1109/TNNLS.2023.3308828
Publication status	E-pub ahead of print - 2023

Keywords

Agnostic learning
compression
generalization theory
probably approximately correct (PAC)
scenario optimization
support vector classifiers

Access to Document

10.1109/TNNLS.2023.3308828

A_Survey_on_Scenario_Theory_Complexity_and_Compression-Based_Learning_and_Generalization (Accepted author manuscript, 2.18 MB)

Cite this

@article{9ef9bc98f6fb416e9111e02e2b659943,

title = "A Survey on Scenario Theory, Complexity, and Compression-Based Learning and Generalization",

abstract = "This work investigates formal generalization error bounds that apply to support vector machines (SVMs) in realizable and agnostic learning problems. We focus on recently observed parallels between probably approximately correct (PAC)-learning bounds, such as compression and complexity-based bounds, and novel error guarantees derived within scenario theory. Scenario theory provides nonasymptotic and distributional-free error bounds for models trained by solving data-driven decision-making problems. Relevant theorems and assumptions are reviewed and discussed. We propose a numerical comparison of the tightness and effectiveness of theoretical error bounds for support vector classifiers trained on several randomized experiments from 13 real-life problems. This analysis allows for a fair comparison of different approaches from both conceptual and experimental standpoints. Based on the numerical results, we argue that the error guarantees derived from scenario theory are often tighter for realizable problems and always yield informative results, i.e., probability bounds tighter than a vacuous $[0,1]$ interval. This work promotes scenario theory as an alternative tool for model selection, structural-risk minimization, and generalization error analysis of SVMs. In this way, we hope to bring the communities of scenario and statistical learning theory closer, so that they can benefit from each other{\textquoteright}s insights.",

keywords = "Agnostic learning, compression, generalization theory, probably approximately correct (PAC), scenario optimization, support vector classifiers",

author = "Roberto Rocchetta and Alexander Mey and Frans Oliehoek",

year = "2023",

doi = "10.1109/TNNLS.2023.3308828",

language = "English",

pages = "1--15",

journal = "IEEE Transactions on Neural Networks and Learning Systems",

issn = "2162-2388",

publisher = "IEEE Computational Intelligence Society",

}

TY - JOUR

T1 - A Survey on Scenario Theory, Complexity, and Compression-Based Learning and Generalization

AU - Rocchetta, Roberto

AU - Mey, Alexander

AU - Oliehoek, Frans

PY - 2023

Y1 - 2023

N2 - This work investigates formal generalization error bounds that apply to support vector machines (SVMs) in realizable and agnostic learning problems. We focus on recently observed parallels between probably approximately correct (PAC)-learning bounds, such as compression and complexity-based bounds, and novel error guarantees derived within scenario theory. Scenario theory provides nonasymptotic and distributional-free error bounds for models trained by solving data-driven decision-making problems. Relevant theorems and assumptions are reviewed and discussed. We propose a numerical comparison of the tightness and effectiveness of theoretical error bounds for support vector classifiers trained on several randomized experiments from 13 real-life problems. This analysis allows for a fair comparison of different approaches from both conceptual and experimental standpoints. Based on the numerical results, we argue that the error guarantees derived from scenario theory are often tighter for realizable problems and always yield informative results, i.e., probability bounds tighter than a vacuous $[0,1]$ interval. This work promotes scenario theory as an alternative tool for model selection, structural-risk minimization, and generalization error analysis of SVMs. In this way, we hope to bring the communities of scenario and statistical learning theory closer, so that they can benefit from each other’s insights.

AB - This work investigates formal generalization error bounds that apply to support vector machines (SVMs) in realizable and agnostic learning problems. We focus on recently observed parallels between probably approximately correct (PAC)-learning bounds, such as compression and complexity-based bounds, and novel error guarantees derived within scenario theory. Scenario theory provides nonasymptotic and distributional-free error bounds for models trained by solving data-driven decision-making problems. Relevant theorems and assumptions are reviewed and discussed. We propose a numerical comparison of the tightness and effectiveness of theoretical error bounds for support vector classifiers trained on several randomized experiments from 13 real-life problems. This analysis allows for a fair comparison of different approaches from both conceptual and experimental standpoints. Based on the numerical results, we argue that the error guarantees derived from scenario theory are often tighter for realizable problems and always yield informative results, i.e., probability bounds tighter than a vacuous $[0,1]$ interval. This work promotes scenario theory as an alternative tool for model selection, structural-risk minimization, and generalization error analysis of SVMs. In this way, we hope to bring the communities of scenario and statistical learning theory closer, so that they can benefit from each other’s insights.

KW - Agnostic learning

KW - compression

KW - generalization theory

KW - probably approximately correct (PAC)

KW - scenario optimization

KW - support vector classifiers

UR - http://www.scopus.com/inward/record.url?scp=85171757180&partnerID=8YFLogxK

U2 - 10.1109/TNNLS.2023.3308828

DO - 10.1109/TNNLS.2023.3308828

M3 - Article

SN - 2162-2388

SP - 1

EP - 15

JO - IEEE Transactions on Neural Networks and Learning Systems

JF - IEEE Transactions on Neural Networks and Learning Systems

ER -
