Off-policy experience retention for deep actor-critic learning

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

When a limited number of experiences is kept in memory to train a reinforcement learning agent, the criterion that determines which experiences are retained can have a strong impact on learning performance. In this paper, we argue that for actor-critic learning in domains with significant momentum, it is important to retain experiences with off-policy actions when the amount of exploration is reduced over time. This claim is supported by simulation experiments on a pendulum swing-up problem and a magnetic manipulation task. Additionally, we compare our strategy to database overwriting policies based on obtaining experiences spread out over the state-action space, as well as to using the temporal difference error as a proxy for the value of experiences.
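The retention criterion described in the abstract can be illustrated with a short sketch. The buffer below, once full, overwrites the stored experience whose action is closest to what the current policy would choose in that state, so experiences with off-policy actions survive as exploration decays. This is a minimal illustration, assuming a deterministic policy callable that accepts a batch of states and using Euclidean action distance as the off-policyness score; the class name and scoring rule are hypothetical and not the paper's implementation.

```python
import numpy as np


class OffPolicyRetentionBuffer:
    """Fixed-size experience buffer that, when full, overwrites the
    experience whose stored action is most similar to the current
    policy's action, thereby retaining off-policy experiences.

    Illustrative sketch only; the scoring rule is an assumption,
    not the method as published.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []  # entries: (state, action, reward, next_state, done)

    def add(self, experience, policy):
        if len(self.buffer) < self.capacity:
            self.buffer.append(experience)
            return
        states = np.stack([e[0] for e in self.buffer])
        actions = np.stack([e[1] for e in self.buffer])
        # 'policy' is an assumed batched deterministic policy: states -> actions.
        policy_actions = policy(states)
        # Small distance means the stored action is nearly on-policy,
        # hence the experience is the most expendable one.
        off_policyness = np.linalg.norm(actions - policy_actions, axis=-1)
        self.buffer[int(np.argmin(off_policyness))] = experience

    def sample(self, batch_size, rng=np.random):
        idx = rng.choice(len(self.buffer), size=batch_size, replace=False)
        return [self.buffer[i] for i in idx]
```

A usage sketch: with a 1-D action space and a toy policy such as `policy = lambda s: np.tanh(s @ w)`, calling `buffer.add((s, a, r, s2, done), policy)` on a full buffer replaces the transition whose action best matches the current policy, leaving exploratory transitions in memory.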
Original language: English
Title of host publication: Deep Reinforcement Learning Workshop, NIPS 2016 - December 9, 2016
Number of pages: 9
Publication status: Published - 2016
Event: NIPS 2016: 30th Conference on Neural Information Processing Systems - Centre Convencions Internacional Barcelona, Barcelona, Spain
Duration: 5 Dec 2016 - 10 Dec 2016
https://nips.cc/Conferences/2016

Conference

Conference: NIPS 2016: 30th Conference on Neural Information Processing Systems
Abbreviated title: NIPS
Country: Spain
City: Barcelona
Period: 5/12/16 - 10/12/16
Internet address: https://nips.cc/Conferences/2016


  • Research Output

    • 1 Dissertation (TU Delft)

    Sample efficient deep reinforcement learning for control

    de Bruin, T., 2020, 167 p.

    Research output: Thesis › Dissertation (TU Delft)


    Cite this

    de Bruin, T., Kober, J., Tuyls, K. P., & Babuska, R. (2016). Off-policy experience retention for deep actor-critic learning. In Deep Reinforcement Learning Workshop, NIPS 2016 - December 9, 2016.