Deep residual reinforcement learning

Shangtong Zhang*, Wendelin Boehmer, Shimon Whiteson

*Corresponding author for this work

Research output: Chapter in Book/Conference proceedings/Edited volumeConference contributionScientificpeer-review

9 Citations (Scopus)

Abstract

We revisit residual algorithms in both model-free and model-based reinforcement learning settings. We propose the bidirectional target network technique to stabilize residual algorithms, yielding a residual version of DDPG that significantly outperforms vanilla DDPG in the DeepMind Control Suite benchmark. Moreover, we find the residual algorithm an effective approach to the distribution mismatch problem in model-based planning. Compared with the existing TD(k) method, our residual-based method makes weaker assumptions about the model and yields a greater performance boost.

Original languageEnglish
Title of host publicationProceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2020
EditorsBo An, Amal El Fallah Seghrouchni, Gita Sukthankar
PublisherInternational Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Pages1611-1619
Number of pages9
ISBN (Electronic)9781450375184
Publication statusPublished - 2020
Externally publishedYes
Event19th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2020 - Virtual, Auckland, New Zealand
Duration: 19 May 2020 → …

Publication series

NameProceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Volume2020-May
ISSN (Print)1548-8403
ISSN (Electronic)1558-2914

Conference

Conference19th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2020
Country/TerritoryNew Zealand
CityVirtual, Auckland
Period19/05/20 → …

Keywords

  • Reinforcement learning
  • Residual algorithms

Fingerprint

Dive into the research topics of 'Deep residual reinforcement learning'. Together they form a unique fingerprint.

Cite this