Abstract
Recent years have seen a growing interest in the use of deep neural networks as
function approximators in reinforcement learning. This paper investigates the potential of the Deep Deterministic Policy Gradient method for a robot control problem both in simulation and in a real setup. The importance of the size and composition of the experience replay database is investigated and some requirements on the distribution over the state-action space of the experiences in the database are identified. Of particular interest is the importance of negative experiences that are not close to an optimal policy. It is shown how training with samples that are insufficiently spread over the state-action space can cause the method to fail, and how maintaining the distribution over the state-action space of the samples in the experience database can greatly benefit learning.
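The abstract centers on how the contents of the experience replay database shape learning. As a point of reference, a minimal FIFO replay buffer of the kind the paper analyzes can be sketched as follows — the class name, structure, and tuple layout are illustrative assumptions, not the authors' code:

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal FIFO experience replay buffer (illustrative sketch only).

    With a bounded deque, the oldest experiences are silently discarded
    once capacity is reached. As the paper argues, this FIFO overwriting
    can let the buffer's distribution over the state-action space collapse
    around the current policy, losing the negative experiences far from
    the optimum that learning still needs.
    """

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        # Store one transition tuple.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling over whatever the buffer currently holds;
        # the composition of the buffer, not just its size, determines
        # what the learner sees.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

Alternatives to plain FIFO retention (e.g., deciding which experiences to overwrite so the stored distribution stays spread over the state-action space) are the kind of database-maintenance strategy the paper investigates.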
| Original language | English |
|---|---|
| Title of host publication | Deep Reinforcement Learning Workshop, NIPS 2015 |
| Number of pages | 9 |
| Publication status | Published - 2015 |
| Event | NIPS 2015: 29th Conference on Neural Information Processing Systems - Montreal, Canada; duration: 7 Dec 2015 → 12 Dec 2015 |
Conference
| Conference | NIPS 2015 : 29th Conference on Neural Information Processing Systems |
|---|---|
| Country/Territory | Canada |
| City | Montreal |
| Period | 7/12/15 → 12/12/15 |
Bibliographical note
Deep Reinforcement Learning Workshop (on Friday, December 11th).
Research output
- 1 Dissertation (TU Delft)
- Sample efficient deep reinforcement learning for control
  de Bruin, T., 2020, 167 p. Research output: Thesis › Dissertation (TU Delft). Open Access.