Improved deep reinforcement learning for robotics through distribution-based experience retention

Tim de Bruin, Jens Kober, Karl Tuyls, Robert Babuska

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

14 Citations (Scopus)
43 Downloads (Pure)

Abstract

Recent years have seen a growing interest in the use of deep neural networks as function approximators in reinforcement learning. In this paper, an experience replay method is proposed that ensures that the distribution of the experiences used for training is between that of the policy and a uniform distribution. Through experiments on a magnetic manipulation task, it is shown that the method reduces the need for sustained exhaustive exploration during learning. This makes it attractive in scenarios where sustained exploration is infeasible or undesirable, such as for physical systems like robots and for lifelong learning. The method is also shown to improve the generalization performance of the trained policy, which can make it attractive for transfer learning. Finally, for small experience databases, the method performs favorably when compared to the recently proposed alternative of using the temporal difference error to determine the experience sample distribution, which makes it an attractive option for robots with limited memory capacity.
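
The abstract does not spell out the retention rule, so the following is only a minimal sketch of one way a replay buffer could keep its training distribution between that of the policy and a uniform one. Everything named here is an assumption made for illustration and not the method of the paper: the class `MixedReplay`, the `mix` parameter, the nearest-neighbour crowding rule used to decide what to overwrite, and the choice to store bare state-action tuples instead of full transitions.

```python
import random
from collections import deque


class MixedReplay:
    """Illustrative only: two buffers with different retention rules."""

    def __init__(self, capacity, mix=0.5, seed=0):
        self.capacity = capacity
        self.mix = mix  # assumed knob: share of each batch taken from `spread`
        self.rng = random.Random(seed)
        self.recent = deque(maxlen=capacity)  # FIFO: follows the current policy
        self.spread = []  # retained so that it covers the visited space evenly

    @staticmethod
    def _dist(a, b):
        # Squared Euclidean distance between two state-action tuples.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def _most_crowded(self):
        # Index of the stored point whose nearest neighbour is closest.
        best_i, best_d = 0, float("inf")
        for i, p in enumerate(self.spread):
            d = min(self._dist(p, q)
                    for j, q in enumerate(self.spread) if j != i)
            if d < best_d:
                best_i, best_d = i, d
        return best_i

    def add(self, state_action):
        point = tuple(state_action)
        self.recent.append(point)  # the deque drops the oldest entry itself
        self.spread.append(point)
        if len(self.spread) > self.capacity:
            # Overwrite by crowding instead of by age.
            self.spread.pop(self._most_crowded())

    def sample(self, batch_size):
        n_spread = round(self.mix * batch_size)
        batch = [self.rng.choice(self.spread) for _ in range(n_spread)]
        batch += [self.rng.choice(self.recent)
                  for _ in range(batch_size - n_spread)]
        return batch


# Hypothetical usage: brief early exploration, then a policy that settles.
buf = MixedReplay(capacity=20, mix=0.5)
for i in range(20):
    buf.add((i / 19.0,))  # early experiences spread over [0, 1]
for _ in range(200):
    buf.add((0.5,))       # later experiences all sit at 0.5
batch = buf.sample(10)
```

In this toy run the FIFO buffer ends up holding only the point the policy keeps visiting, while the crowding-based buffer still spans the whole explored range. Setting `mix` to 0 reproduces plain FIFO replay and setting it to 1 samples only from the spread-out buffer.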
Original language: English
Title of host publication: Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Subtitle of host publication: IROS 2016
Editors: Dong-Soo Kwon, Chul-Goo Kang, Il Hong Suh
Place of Publication: Piscataway, NJ, USA
Publisher: IEEE
Pages: 3947-3952
ISBN (Print): 978-1-5090-3762-9
Publication status: Published - 2016
Event: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016 - Daejeon, Korea, Republic of
Duration: 9 Oct 2016 to 14 Oct 2016
http://www.iros2016.org/

Conference

Conference: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016
Abbreviated title: IROS 2016
Country: Korea, Republic of
City: Daejeon
Period: 9/10/16 to 14/10/16

Keywords

  • Databases
  • Neural networks
  • Training
  • Learning (artificial intelligence)
  • Standards
  • Robot control
