Epistemic Bellman Operators

Research output: Contribution to journalConference articleScientificpeer-review

Abstract

Uncertainty quantification remains a difficult challenge in reinforcement learning. Several algorithms exist that successfully quantify uncertainty in a practical setting. However it is unclear whether these algorithms are theoretically sound and can be expected to converge. Furthermore, they seem to treat the uncertainty in the target parameters in different ways. In this work, we unify several practical algorithms into one theoretical framework by defining a new Bellman operator on distributions, and show that this Bellman operator is a contraction. We highlight use cases of our framework by analyzing an existing Bayesian Q-learning algorithm, and also introduce a novel uncertainty-aware variant of PPO that adaptively sets its clipping hyperparameter.
Original languageEnglish
Pages (from-to)20973-20981
Number of pages9
JournalProceedings of the AAAI Conference on Artificial Intelligence
Volume39
Issue number20
DOIs
Publication statusPublished - 2025
Event39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025 - Philadelphia, United States
Duration: 25 Feb 20254 Mar 2025
https://aaai.org/conference/aaai/aaai-25/

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Fingerprint

Dive into the research topics of 'Epistemic Bellman Operators'. Together they form a unique fingerprint.

Cite this