Abstract
Reinforcement learning (RL), like any online learning method, inevitably faces the exploration-exploitation dilemma. A learning algorithm that requires as few data samples as possible is called sample efficient, and the design of sample-efficient algorithms is an important area of research. Interestingly, all currently known provably efficient model-free RL algorithms rely on the same well-known principle of optimism in the face of uncertainty. We unite these existing algorithms into a single general model-free optimistic RL framework. We show how this facilitates the design of new optimistic model-free RL algorithms by simplifying the analysis of their efficiency. Finally, we propose one such new algorithm and demonstrate its performance in an experimental study.
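To illustrate the principle the abstract refers to, here is a minimal sketch of optimism in the face of uncertainty in a model-free setting: tabular Q-learning with optimistic initialization and a count-based exploration bonus. The toy chain MDP, the bonus constant, and all hyperparameters are hypothetical illustrations, not the algorithm or experiments from the paper.

```python
import math

# Toy chain MDP (hypothetical example, not from the paper):
# states 0..4; action 0 moves left, action 1 moves right.
# Reward 1.0 is earned only upon entering the rightmost state.
N_STATES, N_ACTIONS, HORIZON, EPISODES = 5, 2, 10, 300
GAMMA, ALPHA, BONUS_C = 0.9, 0.5, 1.0

def env_step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

# Optimistic initialization: every Q-value starts at the maximum
# possible discounted return, 1 / (1 - gamma).
Q = [[1.0 / (1 - GAMMA)] * N_ACTIONS for _ in range(N_STATES)]
counts = [[0] * N_ACTIONS for _ in range(N_STATES)]

for _ in range(EPISODES):
    s = 0
    for _ in range(HORIZON):
        # Act greedily w.r.t. the optimistic Q-values; optimism itself
        # drives exploration, so no epsilon-greedy noise is needed.
        a = max(range(N_ACTIONS), key=lambda act: Q[s][act])
        s2, r = env_step(s, a)
        counts[s][a] += 1
        # Count-based uncertainty bonus that shrinks with experience.
        bonus = BONUS_C / math.sqrt(counts[s][a])
        target = r + bonus + GAMMA * max(Q[s2])
        Q[s][a] += ALPHA * (target - Q[s][a])
        s = s2

greedy = [max(range(N_ACTIONS), key=lambda act: Q[s][act])
          for s in range(N_STATES)]
```

After training, the greedy policy moves right along the chain toward the rewarding state: under-explored actions keep inflated values until they are tried, which is the mechanism the surveyed provably efficient algorithms exploit with carefully chosen bonuses.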
Original language | English |
---|---|
Title of host publication | Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2020 |
Editors | Bo An, Amal El Fallah Seghrouchni, Gita Sukthankar |
Publisher | International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS) |
Pages | 913-921 |
Number of pages | 9 |
ISBN (Electronic) | 978-1-4503-7518-4 |
Publication status | Published - May 2020 |
Event | AAMAS 2020: The 19th International Conference on Autonomous Agents and Multi-Agent Systems - Auckland, New Zealand. Duration: 9 May 2020 → 13 May 2020. Conference number: 19th. https://aamas2020.conference.auckland.ac.nz |
Publication series
Name | Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS |
---|---|
Volume | 2020-May |
ISSN (Print) | 1548-8403 |
ISSN (Electronic) | 1558-2914 |
Conference
Conference | AAMAS 2020 |
---|---|
Country/Territory | New Zealand |
City | Auckland |
Period | 9/05/20 → 13/05/20 |
Other | Virtual/online event due to COVID-19 |
Internet address | https://aamas2020.conference.auckland.ac.nz |
Bibliographical note
Virtual/online event due to COVID-19

Keywords
- Model-free learning
- Reinforcement learning
- Sample efficiency