Description
Reinforcement learning (RL), like any online learning method, inevitably faces the exploration-exploitation dilemma. A learning algorithm that requires as few data samples as possible is called sample efficient, and the design of sample-efficient algorithms is an important area of research. Interestingly, all currently known provably efficient model-free RL algorithms rely on the same well-known principle of optimism in the face of uncertainty. We unite these existing algorithms into a single general model-free optimistic RL framework and show how it facilitates the design of new optimistic model-free RL algorithms by simplifying the analysis of their efficiency. Finally, we propose one such new algorithm and demonstrate its performance in an experimental study.

Period | 11 May 2020 |
---|---|
Event title | AAMAS 2020: The 19th International Conference on Autonomous Agents and Multi-Agent Systems |
Event type | Conference |
Conference number | 19th |
Location | Auckland, New Zealand |
Degree of Recognition | International |
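The abstract above only names the optimism-in-the-face-of-uncertainty principle; the paper's actual framework and algorithm are not reproduced here. As a loose illustration of the general idea behind optimistic model-free methods (not the authors' algorithm), the sketch below shows tabular Q-learning with optimistic initialization and a UCB-style visit-count bonus on a toy deterministic chain environment. All function names, the bonus form, and the environment are illustrative assumptions.

```python
import math
from collections import defaultdict

def optimistic_q_learning(env_step, n_actions, horizon, episodes=500, c=1.0):
    """Tabular Q-learning with a UCB-style exploration bonus.

    Sketch of the optimism principle: rarely tried (step, state, action)
    triples keep an inflated value estimate until they have been visited
    often, which drives the greedy policy to explore them.
    """
    Q = defaultdict(lambda: float(horizon))  # optimistic initialization
    N = defaultdict(int)                     # visit counts per (step, state, action)
    for _ in range(episodes):
        s = 0
        for h in range(horizon):
            # Greedy in the optimistic value estimates; no explicit epsilon.
            a = max(range(n_actions), key=lambda act: Q[(h, s, act)])
            s2, r, done = env_step(s, a)
            N[(h, s, a)] += 1
            n = N[(h, s, a)]
            lr = (horizon + 1) / (horizon + n)  # decaying step size
            bonus = c * math.sqrt(math.log(episodes * horizon + 1) / n)
            v_next = 0.0 if done or h == horizon - 1 else max(
                Q[(h + 1, s2, b)] for b in range(n_actions))
            Q[(h, s, a)] = (1 - lr) * Q[(h, s, a)] + lr * (r + bonus + v_next)
            s = s2
            if done:
                break
    return Q, N

def chain_step(s, a):
    """Toy deterministic chain: action 1 moves right; reward on reaching state 2."""
    s2 = min(s + 1, 2) if a == 1 else s
    return s2, (1.0 if s2 == 2 else 0.0), s2 == 2
```

Because under-visited actions carry a large bonus, the agent tries both actions early on; as counts grow the bonuses shrink and the greedy policy settles on the rewarding path, with no separate exploration schedule needed.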