Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks

Rodrigo Perez Dattari; Carlos Celemin; Javier Ruiz-del-Solar; Jens Kober

doi:10.1007/978-3-030-33950-0_31

Learning & Autonomous Control

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

2 Citations (Scopus)

20 Downloads (Pure)

Abstract

Deep Reinforcement Learning (DRL) has become a powerful strategy for solving complex decision-making problems with Deep Neural Networks (DNNs). However, it is highly data-demanding, which makes it infeasible for most applications on physical systems. In this work, we propose an alternative Interactive Machine Learning (IML) strategy for training DNN policies from human corrective feedback, a method called Deep COACH (D-COACH). This approach takes advantage of both the knowledge and insights of human teachers and the power of DNNs, and it needs no reward function (computing rewards sometimes requires external perception). We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent’s actions during execution. The D-COACH framework has the potential to solve complex problems without requiring much data or time. Experimental results validated the efficiency of the framework in three different problems (two simulated, one with a real robot), with low- and high-dimensional state spaces, showing that it can learn policies for continuous action spaces, as in the Car Racing and Cart-Pole problems, faster than DRL.
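The abstract does not state the update rule, so the following is only a minimal sketch of the general idea of shaping a policy from corrective feedback without a reward function. It assumes a binary correction signal (-1, 0, +1), a fixed correction size, and a one-weight linear policy standing in for a DNN. All names (`policy`, `corrective_update`, `simulated_teacher`, `error_magnitude`) are hypothetical and are not taken from the paper.

```python
def policy(weights, state):
    """Linear policy used as a stand-in for a DNN: action = w . s."""
    return sum(w * s for w, s in zip(weights, state))


def corrective_update(weights, state, feedback,
                      error_magnitude=0.5, learning_rate=0.1):
    """One supervised step toward the human-corrected action.

    feedback: -1 (decrease the action), 0 (no correction), +1 (increase it).
    The corrected target "action + error_magnitude * feedback" is an
    assumption of this sketch, not a rule quoted from the source.
    """
    if feedback == 0:
        return list(weights)  # no correction given: policy stays unchanged
    action = policy(weights, state)
    target = action + error_magnitude * feedback
    error = target - action
    # Gradient step on 0.5 * (target - w . s)^2 with respect to w.
    return [w + learning_rate * error * s for w, s in zip(weights, state)]


def simulated_teacher(state, action, desired_gain=2.0, tolerance=0.05):
    """Hypothetical teacher who wants action = desired_gain * state[0]."""
    desired = desired_gain * state[0]
    if action < desired - tolerance:
        return 1
    if action > desired + tolerance:
        return -1
    return 0


if __name__ == "__main__":
    weights = [0.0]
    for _ in range(200):
        state = [1.0]
        action = policy(weights, state)
        feedback = simulated_teacher(state, action)
        weights = corrective_update(weights, state, feedback)
    print("learned weight:", weights[0])
```

Note that no reward is ever computed: the only learning signal is the direction of the teacher's correction, which is the property the abstract highlights.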

Original language	English
Title of host publication	Proceedings of the 2018 International Symposium on Experimental Robotics
Editors	Jing Xiao, Torsten Kröger, Oussama Khatib
Place of Publication	Cham, Switzerland
Publisher	Springer
Chapter	Robot Learning II
Pages	353-363
ISBN (Electronic)	978-3-030-33950-0
ISBN (Print)	978-3-030-33949-4
DOIs	https://doi.org/10.1007/978-3-030-33950-0_31
Publication status	Published - 2020
Event	ISER 2018: International Symposium on Experimental Robotics, Buenos Aires, Argentina. Duration: 5 Nov 2018 → 8 Nov 2018

Publication series

Name	Springer Proceedings in Advanced Robotics
Publisher	Springer
Volume	11

Conference

Conference	ISER 2018: International Symposium on Experimental Robotics
Country/Territory	Argentina
City	Buenos Aires
Period	5 Nov 2018 → 8 Nov 2018

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

Reinforcement Learning
Deep Learning
Interactive Machine Learning
Learning from Demonstration

Access to Document

10.1007/978-3-030-33950-0_31

Final published version, 972 KB

Cite this

Perez Dattari, R., Celemin, C., Ruiz-del-Solar, J., & Kober, J. (2020). Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks. In J. Xiao, T. Kröger, & O. Khatib (Eds.), Proceedings of the 2018 International Symposium on Experimental Robotics (pp. 353-363). (Springer Proceedings in Advanced Robotics; Vol. 11). Springer. https://doi.org/10.1007/978-3-030-33950-0_31

Perez Dattari, Rodrigo ; Celemin, Carlos ; Ruiz-del-Solar, Javier et al. / Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks. Proceedings of the 2018 International Symposium on Experimental Robotics. editor / Jing Xiao ; Torsten Kröger ; Oussama Khatib. Cham, Switzerland : Springer, 2020. pp. 353-363 (Springer Proceedings in Advanced Robotics).

@inproceedings{281fb83a4f114b57a8af8f930ced28ea,

title = "Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks",

abstract = "Deep Reinforcement Learning (DRL) has become a powerful strategy to solve complex decision making problems based on Deep Neural Networks (DNNs). However, it is highly data demanding, so unfeasible in physical systems for most applications. In this work, we approach an alternative Interactive Machine Learning (IML) strategy for training DNN policies based on human corrective feedback, with a method called Deep COACH (D-COACH). This approach not only takes advantage of the knowledge and insights of human teachers as well as the power of DNNs, but also has no need of a reward function (which sometimes implies the need of external perception for computing rewards). We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent{\textquoteright}s actions during execution. The D-COACH framework has the potential to solve complex problems without much data or time required. Experimental results validated the efficiency of the framework in three different problems (two simulated, one with a real robot), with state spaces of low and high dimensions, showing the capacity to successfully learn policies for continuous action spaces like in the Car Racing and Cart-Pole problems faster than with DRL.",

keywords = "Reinforcement Learning, Deep Learning, Interactive Machine Learning, Learning from Demonstration",

author = "{Perez Dattari}, Rodrigo and Carlos Celemin and Javier Ruiz-del-Solar and Jens Kober",

note = "Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.; ISER 2018: International Symposium on Experimental Robotics ; Conference date: 05-11-2018 Through 08-11-2018",

year = "2020",

doi = "10.1007/978-3-030-33950-0_31",

language = "English",

isbn = "978-3-030-33949-4",

series = "Springer Proceedings in Advanced Robotics",

publisher = "Springer",

pages = "353--363",

editor = "Jing Xiao and Torsten Kr{\"o}ger and Oussama Khatib",

booktitle = "Proceedings of the 2018 International Symposium on Experimental Robotics",

}

Perez Dattari, R, Celemin, C, Ruiz-del-Solar, J & Kober, J 2020, Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks. in J Xiao, T Kröger & O Khatib (eds), Proceedings of the 2018 International Symposium on Experimental Robotics. Springer Proceedings in Advanced Robotics, vol. 11, Springer, Cham, Switzerland, pp. 353-363, ISER 2018: International Symposium on Experimental Robotics, Buenos Aires, Argentina, 5/11/18. https://doi.org/10.1007/978-3-030-33950-0_31

Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks. / Perez Dattari, Rodrigo; Celemin, Carlos; Ruiz-del-Solar, Javier et al.
Proceedings of the 2018 International Symposium on Experimental Robotics. ed. / Jing Xiao; Torsten Kröger; Oussama Khatib. Cham, Switzerland: Springer, 2020. p. 353-363 (Springer Proceedings in Advanced Robotics; Vol. 11).

TY - GEN

T1 - Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks

AU - Perez Dattari, Rodrigo

AU - Celemin, Carlos

AU - Ruiz-del-Solar, Javier

AU - Kober, Jens

N1 - Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

PY - 2020

Y1 - 2020

AB - Deep Reinforcement Learning (DRL) has become a powerful strategy to solve complex decision making problems based on Deep Neural Networks (DNNs). However, it is highly data demanding, so unfeasible in physical systems for most applications. In this work, we approach an alternative Interactive Machine Learning (IML) strategy for training DNN policies based on human corrective feedback, with a method called Deep COACH (D-COACH). This approach not only takes advantage of the knowledge and insights of human teachers as well as the power of DNNs, but also has no need of a reward function (which sometimes implies the need of external perception for computing rewards). We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent’s actions during execution. The D-COACH framework has the potential to solve complex problems without much data or time required. Experimental results validated the efficiency of the framework in three different problems (two simulated, one with a real robot), with state spaces of low and high dimensions, showing the capacity to successfully learn policies for continuous action spaces like in the Car Racing and Cart-Pole problems faster than with DRL.

KW - Reinforcement Learning

KW - Deep Learning

KW - Interactive Machine Learning

KW - Learning from Demonstration

UR - https://arxiv.org/abs/1810.00466

UR - http://www.scopus.com/inward/record.url?scp=85107075873&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-33950-0_31

DO - 10.1007/978-3-030-33950-0_31

M3 - Conference contribution

SN - 978-3-030-33949-4

T3 - Springer Proceedings in Advanced Robotics

SP - 353

EP - 363

BT - Proceedings of the 2018 International Symposium on Experimental Robotics

A2 - Xiao, Jing

A2 - Kröger, Torsten

A2 - Khatib, Oussama

PB - Springer

CY - Cham, Switzerland

T2 - ISER 2018: International Symposium on Experimental Robotics

Y2 - 5 November 2018 through 8 November 2018

ER -

Perez Dattari R, Celemin C, Ruiz-del-Solar J, Kober J. Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks. In Xiao J, Kröger T, Khatib O, editors, Proceedings of the 2018 International Symposium on Experimental Robotics. Cham, Switzerland: Springer. 2020. p. 353-363. (Springer Proceedings in Advanced Robotics). doi: 10.1007/978-3-030-33950-0_31
