Learning state representation for deep actor-critic control

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

15 Citations (Scopus)
193 Downloads (Pure)

Abstract

Deep Neural Networks (DNNs) can be used as function approximators in Reinforcement Learning (RL). One advantage of DNNs is that they can cope with large input dimensions. Instead of relying on feature engineering to lower the input dimension, DNNs can extract the features from raw observations. The drawback of this end-to-end learning is that it usually requires a large amount of data, which is not always available for real-world control applications. In this paper, a new algorithm, Model Learning Deep Deterministic Policy Gradient (ML-DDPG), is proposed that combines RL with state representation learning, i.e., learning a mapping from an input vector to a state before solving the RL task. The ML-DDPG algorithm uses a concept we call predictive priors to learn a model network, which is subsequently used to pre-train the first layer of the actor and critic networks. Simulation results show that ML-DDPG can learn reasonable continuous control policies from high-dimensional observations that also contain task-irrelevant information. Furthermore, in some cases, this approach significantly improves the final performance in comparison to end-to-end learning.
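
The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the tanh activations, the squared-error form of the predictive loss, and all names (`init_layer`, `predictive_loss`, `transfer_state_layer`) are assumptions made for illustration. The sketch only shows a model network whose first layer maps an observation to a state, a loss that asks the model to predict the next state and the reward, and the copy of that first layer into the actor and critic. The gradient-based training of the model network and the subsequent DDPG updates are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (assumptions, not taken from the paper).
OBS_DIM, ACT_DIM, STATE_DIM = 20, 2, 4


def init_layer(n_in, n_out):
    """Small random weights and zero biases for one dense layer."""
    return {"W": rng.normal(0.0, 0.1, (n_out, n_in)), "b": np.zeros(n_out)}


def identity(z):
    return z


def dense(layer, x, activation=np.tanh):
    """Apply one dense layer to a batch x of shape (batch, n_in)."""
    return activation(x @ layer["W"].T + layer["b"])


# Model network: observation -> state, then (state, action) -> next state, reward.
model = {
    "state": init_layer(OBS_DIM, STATE_DIM),
    "next_state": init_layer(STATE_DIM + ACT_DIM, STATE_DIM),
    "reward": init_layer(STATE_DIM + ACT_DIM, 1),
}


def predictive_loss(model, obs, act, reward, next_obs):
    """Assumed loss: squared error on the predicted next state and reward."""
    s = dense(model["state"], obs)            # learned state for o_t
    s_next = dense(model["state"], next_obs)  # learned state for o_{t+1}
    sa = np.concatenate([s, act], axis=1)
    s_next_hat = dense(model["next_state"], sa)
    r_hat = dense(model["reward"], sa, identity)[:, 0]
    return np.mean((s_next_hat - s_next) ** 2) + np.mean((r_hat - reward) ** 2)


# Actor and critic share the same first-layer shape as the model network.
actor = {"state": init_layer(OBS_DIM, STATE_DIM),
         "out": init_layer(STATE_DIM, ACT_DIM)}
critic = {"state": init_layer(OBS_DIM, STATE_DIM),
          "out": init_layer(STATE_DIM + ACT_DIM, 1)}


def transfer_state_layer(model, net):
    """Initialise a network's first layer with a copy of the model's state layer."""
    net["state"] = {k: v.copy() for k, v in model["state"].items()}


def actor_forward(actor, obs):
    return dense(actor["out"], dense(actor["state"], obs))


def critic_forward(critic, obs, act):
    s = dense(critic["state"], obs)
    return dense(critic["out"], np.concatenate([s, act], axis=1), identity)[:, 0]


# Random placeholder batch standing in for real transitions (o, a, r, o').
batch = 8
obs = rng.normal(size=(batch, OBS_DIM))
next_obs = rng.normal(size=(batch, OBS_DIM))
act = rng.uniform(-1.0, 1.0, size=(batch, ACT_DIM))
reward = rng.normal(size=batch)

loss = predictive_loss(model, obs, act, reward, next_obs)

# After the model network has been trained (omitted here), pre-train the
# first layer of the actor and critic by copying the learned state layer.
transfer_state_layer(model, actor)
transfer_state_layer(model, critic)

actions = actor_forward(actor, obs)
q_values = critic_forward(critic, obs, actions)
```

Copying the weights rather than sharing them leaves the actor and critic free to adjust or freeze the state layer independently during RL training; which of these the paper uses is not stated in the abstract.
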
Original language: English
Title of host publication: Proceedings 2016 IEEE 55th Conference on Decision and Control (CDC)
Editors: Francesco Bullo, Christophe Prieur, Alessandro Giua
Place of Publication: Piscataway, NJ, USA
Publisher: IEEE
Pages: 4667-4673
ISBN (Print): 978-1-5090-1837-6
DOIs
Publication status: Published - 2016
Event: 55th IEEE Conference on Decision and Control, CDC 2016 - Las Vegas, United States
Duration: 12 Dec 2016 to 14 Dec 2016

Conference

Conference: 55th IEEE Conference on Decision and Control, CDC 2016
Abbreviated title: CDC 2016
Country: United States
City: Las Vegas
Period: 12/12/16 to 14/12/16

Keywords

  • Approximation algorithms
  • Robot sensing systems
  • Algorithm design and analysis
  • Prediction algorithms
  • Learning (artificial intelligence)
  • Feature extraction
