Simultaneous learning of objective function and policy from interactive teaching with corrective feedback

Carlos Celemin, Jens Kober

Research output: Chapter in Book/Conference proceedings/Edited volume; Conference contribution; Scientific; peer-review


Abstract

Some imitation learning approaches rely on Inverse Reinforcement Learning (IRL) methods to decode and generalize the implicit goals conveyed by expert demonstrations. IRL research normally assumes that expert demonstrations are available, which is not always the case. There are Machine Learning methods that allow non-expert teachers to guide robots in learning complex policies, which can remove IRL's dependence on experts. This work introduces an approach for simultaneously teaching robot policies and objective functions from vague human corrective feedback. The main goal is to generalize the insights that a non-expert human teacher provides to the robot to unseen conditions, without requiring further human effort in the complementary training process. We present an experimental validation of the introduced approach for transferring knowledge to scenarios not considered while the non-expert was teaching. Experimental results show that, in RL processes, the learned reward functions achieve performance similar to that of the engineered reward functions used as a baseline, in both simulated and real environments.

Original language: English
Title of host publication: Proceedings of the 2019 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, AIM 2019
Place of Publication: Piscataway, NJ, USA
Publisher: IEEE
Pages: 726-732
ISBN (Electronic): 978-1-7281-2493-3
Publication status: Published - 2019
Event: 2019 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, AIM 2019 - Hong Kong, China
Duration: 8 Jul 2019 to 12 Jul 2019

Conference

Conference: 2019 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, AIM 2019
Country/Territory: China
City: Hong Kong
Period: 8/07/19 to 12/07/19

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care

Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
