Bayesian Model-Free Deep Reinforcement Learning

Research output: Chapter in Book / Conference proceedings / Edited volume › Conference contribution › Scientific › peer-review


Abstract

Exploration in reinforcement learning remains a difficult challenge. To drive exploration, ensembles with randomized prior functions have recently been popularized as a way to quantify uncertainty in the value model. However, these ensembles have no theoretical guarantee of resembling the true Bayesian posterior, which is known to yield strong performance under certain theoretical conditions. In this thesis work, we view ensemble training through the lens of Sequential Monte Carlo, a Monte Carlo method that approximates a sequence of distributions with a set of particles, and propose an algorithm that combines the practical flexibility of ensembles with the theory of the Bayesian paradigm. We incorporate this method into a standard DQN agent and show experimentally that it yields qualitatively good uncertainty quantification and improved exploration over a regular ensemble. In future work, we will investigate the impact of likelihood and prior choices in Bayesian model-free reinforcement learning methods.
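The Sequential Monte Carlo view described above treats each ensemble member as a particle approximating a posterior over value-model parameters. The sketch below is a minimal, hypothetical illustration of that idea, not the paper's algorithm: it assumes linear Q-weights, a Gaussian TD-error likelihood, and multinomial resampling when the effective sample size degenerates; all names and constants are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: K value-model "particles" (linear Q-weights over
# D features), each with an importance weight. This mirrors the SMC view
# of an ensemble sketched in the abstract, at toy scale.
K, D = 8, 4
particles = rng.normal(size=(K, D))   # one parameter vector per member
weights = np.full(K, 1.0 / K)

def smc_step(particles, weights, phi, target, sigma=1.0, step=0.1):
    """One SMC update: move, reweight by likelihood, resample if degenerate."""
    # 1. Move: a gradient step on the squared TD error plus jitter
    #    (this plays the role of the SMC proposal).
    preds = particles @ phi
    grads = (preds - target)[:, None] * phi[None, :]
    particles = particles - step * grads + 0.01 * rng.normal(size=particles.shape)
    # 2. Reweight: Gaussian likelihood of the TD target under each particle.
    preds = particles @ phi
    loglik = -0.5 * ((target - preds) / sigma) ** 2
    weights = weights * np.exp(loglik - loglik.max())
    weights = weights / weights.sum()
    # 3. Resample when the effective sample size drops below K / 2.
    ess = 1.0 / np.sum(weights ** 2)
    if ess < K / 2:
        idx = rng.choice(K, size=K, p=weights)
        particles, weights = particles[idx], np.full(K, 1.0 / K)
    return particles, weights

phi = rng.normal(size=D)   # features of a single transition (toy example)
target = 1.0               # bootstrapped TD target for that transition
for _ in range(50):
    particles, weights = smc_step(particles, weights, phi, target)
```

After enough steps the particle predictions concentrate around the TD target while the jitter and resampling keep the ensemble diverse, which is the kind of uncertainty quantification the abstract reports.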
Original language: English
Title of host publication: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems
Editors: Natasha Alechina, Virginia Dignum
Publisher: International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Pages: 2782-2784
ISBN (Electronic): 979-8-4007-0486-4
Publication status: Published - 2024
Event: 23rd International Conference on Autonomous Agents and Multiagent Systems - Auckland, New Zealand
Duration: 6 May 2024 - 10 May 2024
Conference number: 23

Conference

Conference: 23rd International Conference on Autonomous Agents and Multiagent Systems
Abbreviated title: AAMAS '24
Country/Territory: New Zealand
City: Auckland
Period: 6/05/24 - 10/05/24

