Abstract
Exploration in reinforcement learning remains a difficult challenge. To drive exploration, ensembles with randomized prior functions have recently been popularized as a way to quantify uncertainty in the value model. However, there is no theoretical guarantee that these ensembles resemble the actual posterior. In this work, we view ensemble training from the perspective of Sequential Monte Carlo, a Monte Carlo method that approximates a sequence of distributions with a set of particles. In particular, we propose an algorithm that exploits both the practical flexibility of ensembles and the theoretical grounding of the Bayesian paradigm. We incorporate this method into a standard Deep Q-Network (DQN) agent and experimentally show qualitatively good uncertainty quantification and improved exploration capabilities over a regular ensemble.
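As a rough illustration of the two ideas the abstract combines, the sketch below pairs an ensemble of value models with fixed randomized priors (in the style of Osband et al.) with a Sequential Monte Carlo reweight-and-resample step over the ensemble members. Everything here is an illustrative assumption rather than the paper's method: the linear Q-functions, the Gaussian likelihood on TD targets, and the names `smc_step`, `BETA`, and `theta` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, N_ACTIONS, N_PARTICLES = 4, 2, 10
BETA = 3.0  # prior scale (hypothetical value)

# Each particle k is a linear Q-function Q_k(s) = s @ theta[k] + BETA * s @ prior_w[k],
# where prior_w[k] is a fixed randomized prior that is never trained.
prior_w = rng.normal(size=(N_PARTICLES, STATE_DIM, N_ACTIONS))
theta = rng.normal(scale=0.1, size=(N_PARTICLES, STATE_DIM, N_ACTIONS))
log_w = np.zeros(N_PARTICLES)  # SMC log-weights, uniform at initialization


def q_values(k, states):
    """Q-values of particle k: trainable part plus fixed randomized prior."""
    return states @ theta[k] + BETA * (states @ prior_w[k])


def smc_step(states, actions, targets, sigma=1.0):
    """Reweight each particle by a Gaussian likelihood of the TD targets,
    then resample once the effective sample size collapses."""
    global theta, prior_w
    for k in range(N_PARTICLES):
        pred = q_values(k, states)[np.arange(len(actions)), actions]
        log_w[k] += -0.5 * np.sum((targets - pred) ** 2) / sigma**2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    if 1.0 / np.sum(w**2) < N_PARTICLES / 2:  # ESS below half: resample
        idx = rng.choice(N_PARTICLES, size=N_PARTICLES, p=w)
        theta, prior_w = theta[idx].copy(), prior_w[idx].copy()
        log_w[:] = 0.0


# Synthetic batch just to exercise the step (shapes are illustrative).
states = rng.normal(size=(32, STATE_DIM))
actions = rng.integers(N_ACTIONS, size=32)
targets = rng.normal(size=32)
smc_step(states, actions, targets)

# Thompson-style acting: sample one particle per episode, act greedily under it.
k = int(rng.integers(N_PARTICLES))
action = int(np.argmax(q_values(k, states[:1])))
```

In a full agent, each particle would presumably be a Q-network trained by gradient descent between SMC steps (a rejuvenation or "move" step), with exploration arising from sampling a particle per episode and acting greedily under it.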
Original language | English |
---|---|
Title of host publication | Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems |
Editors | Natasha Alechina, Virginia Dignum |
Publisher | International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS) |
Pages | 2528-2530 |
ISBN (Electronic) | 9798400704864 |
Publication status | Published - 2024 |
Event | 23rd International Conference on Autonomous Agents and Multiagent Systems, Auckland, New Zealand, 6 May 2024 → 10 May 2024 (Conference number: 23) |
Conference
Conference | 23rd International Conference on Autonomous Agents and Multiagent Systems |
---|---|
Abbreviated title | AAMAS '24 |
Country/Territory | New Zealand |
City | Auckland |
Period | 6/05/24 → 10/05/24 |