Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks

Rodrigo Perez Dattari, Carlos Celemin, Javier Ruiz-del-Solar, Jens Kober

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

2 Citations (Scopus)
17 Downloads (Pure)

Abstract

Deep Reinforcement Learning (DRL) has become a powerful strategy for solving complex decision-making problems with Deep Neural Networks (DNNs). However, it is highly data demanding, which makes it infeasible for most applications on physical systems. In this work, we present an alternative Interactive Machine Learning (IML) strategy for training DNN policies from human corrective feedback, a method called Deep COACH (D-COACH). This approach takes advantage of both the knowledge and insights of human teachers and the power of DNNs, and it needs no reward function (which sometimes requires external perception to compute rewards). We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent’s actions during execution. The D-COACH framework has the potential to solve complex problems without requiring much data or time. Experimental results validated the efficiency of the framework on three different problems (two simulated, one with a real robot) with low- and high-dimensional state spaces, showing that it can learn policies for continuous action spaces, as in the Car Racing and Cart-Pole problems, faster than DRL.
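The abstract does not state the update rule, so the following is only a minimal sketch of how corrective feedback might shape a policy, not the authors' implementation. It assumes a COACH-style rule in which the teacher signals a direction h in {-1, 0, +1} and the corrected label is the executed action plus a fixed magnitude e times h. A one-dimensional linear policy stands in for the DNN, and the simulated teacher, the magnitude e, the learning rate lr, and all function names are hypothetical.

```python
import random


def corrective_update(w, b, state, h, e=0.1, lr=0.5):
    """One corrective-feedback step for a linear policy a = w*state + b.

    h is the teacher's signal: +1 (increase the action), -1 (decrease it),
    0 (no feedback). e is an assumed fixed correction magnitude and lr an
    assumed learning rate; neither value comes from the paper.
    """
    if h == 0:
        return w, b  # no feedback, no update
    action = w * state + b
    target = action + e * h        # corrected action used as a label
    error = action - target        # equals -e * h
    # Gradient step on 0.5 * error**2 with respect to w and b.
    w -= lr * error * state
    b -= lr * error
    return w, b


def simulated_teacher(action, state, tolerance=0.05):
    """Hypothetical teacher that prefers the action 2*state - 1."""
    diff = (2.0 * state - 1.0) - action
    if abs(diff) <= tolerance:
        return 0
    return 1 if diff > 0 else -1


def train(steps=2000, seed=0):
    """Run the policy, collect directional corrections, update online."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        state = rng.uniform(-1.0, 1.0)
        action = w * state + b
        h = simulated_teacher(action, state)
        w, b = corrective_update(w, b, state, h)
    return w, b


def mean_abs_error(w, b, n=101):
    """Mean distance to the teacher's preferred action on a state grid."""
    states = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return sum(abs((w * s + b) - (2.0 * s - 1.0)) for s in states) / n


if __name__ == "__main__":
    w, b = train()
    print("error before:", mean_abs_error(0.0, 0.0))
    print("error after: ", mean_abs_error(w, b))
```

Note that no reward is computed anywhere in the loop: the only learning signal is the sign of the teacher's correction, which matches the abstract's point that the approach needs no reward function.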
Original language: English
Title of host publication: Proceedings of the 2018 International Symposium on Experimental Robotics
Editors: Jing Xiao, Torsten Kröger, Oussama Khatib
Place of Publication: Cham, Switzerland
Publisher: Springer
Chapter: Robot Learning II
Pages: 353-363
ISBN (Electronic): 978-3-030-33950-0
ISBN (Print): 978-3-030-33949-4
DOIs
Publication status: Published - 2020
Event: ISER 2018: International Symposium on Experimental Robotics - Buenos Aires, Argentina
Duration: 5 Nov 2018 to 8 Nov 2018

Publication series

Name: Springer Proceedings in Advanced Robotics
Publisher: Springer
Volume: 11

Conference

Conference: ISER 2018: International Symposium on Experimental Robotics
Country/Territory: Argentina
City: Buenos Aires
Period: 5/11/18 to 8/11/18

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

  • Reinforcement Learning
  • Deep Learning
  • Interactive Machine Learning
  • Learning from Demonstration
