Human demonstrations for fast and safe exploration in reinforcement learning

Gerben Schonebaum; Jaime Junell; Erik Jan van Kampen

doi:10.2514/6.2017-1069

Control & Simulation

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

Abstract

Reinforcement learning is a promising framework for controlling complex vehicles with a high level of autonomy, since it does not need a dynamic model of the vehicle and can adapt to changing conditions. When learning from scratch, the performance of a reinforcement learning controller may initially be poor and, for real-life applications, unsafe. In this paper, the effects of using human demonstrations on the performance of reinforcement learning are investigated, using a combination of offline and online least squares policy iteration. It is found that using the human as an efficient explorer improves learning time and performance for a benchmark reinforcement learning problem. The benefit of the human demonstration is larger for problems where the human can use their understanding of the problem to explore the state space efficiently. Applied to a simplified quadrotor slung-load drop-off problem, the use of human demonstrations reduces the number of crashes during learning. As such, this paper contributes to safer and faster learning for model-free, adaptive control problems.
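The abstract mentions seeding least squares policy iteration (LSPI) with human demonstrations. As a rough illustration of the offline part only, the sketch below runs textbook LSPI (LSTD-Q policy evaluation followed by greedy improvement) on a batch of transitions. Everything concrete here is an assumption made for illustration and is not taken from the paper: the four-state chain task, the one-hot features, the discount factor, the ridge term, and all function and variable names. The online stage and the quadrotor task are not modelled.

```python
"""Toy offline LSPI seeded with demonstration samples (illustrative only)."""

N_STATES, N_ACTIONS = 4, 2   # hypothetical chain: states 0..3, actions 0=left, 1=right
GAMMA = 0.9                  # assumed discount factor
RIDGE = 1e-6                 # assumed regularizer so the LSTD-Q matrix stays invertible

# Each sample is (state, action, reward, next_state, done).
# DEMO mimics a demonstrator driving straight to the goal state 3.
DEMO = [(0, 1, 0.0, 1, False), (1, 1, 0.0, 2, False), (2, 1, 1.0, 3, True)]
# EXTRA adds a few transitions that move away from the goal, for coverage.
EXTRA = [(0, 0, 0.0, 0, False), (1, 0, 0.0, 0, False), (2, 0, 0.0, 1, False)]


def index(s, a):
    """Position of the one-hot feature for the state-action pair (s, a)."""
    return s * N_ACTIONS + a


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def lstdq(samples, policy):
    """Estimate Q-weights of `policy` from a fixed batch of samples."""
    n = N_STATES * N_ACTIONS
    A = [[RIDGE if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [0.0] * n
    for s, a, r, s2, done in samples:
        i = index(s, a)
        # With one-hot features, phi (phi - gamma * phi')^T touches two entries.
        A[i][i] += 1.0
        if not done:
            A[i][index(s2, policy[s2])] -= GAMMA
        b[i] += r
    return solve(A, b)


def greedy(w, s):
    """Action with the highest estimated Q-value in state s (ties go to action 0)."""
    return max(range(N_ACTIONS), key=lambda a: w[index(s, a)])


def lspi(samples, max_iter=20):
    """Alternate LSTD-Q evaluation and greedy improvement until the policy is stable."""
    policy = [0] * N_STATES  # start from "always left"
    for _ in range(max_iter):
        w = lstdq(samples, policy)
        new_policy = [greedy(w, s) for s in range(N_STATES)]
        if new_policy == policy:
            break
        policy = new_policy
    return policy, w


if __name__ == "__main__":
    policy, weights = lspi(DEMO + EXTRA)
    print("greedy action per state:", policy)
```

In this toy setting the demonstration supplies the only rewarded transition, so the batch already contains what the learner needs to prefer moving right in states 0 to 2. This is meant only to convey the intuition of a demonstrator acting as an efficient explorer, not to reproduce the paper's experiments.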

Original language	English
Title of host publication	AIAA Information Systems-AIAA Infotech at Aerospace, 2017
Publisher	American Institute of Aeronautics and Astronautics Inc. (AIAA)
Number of pages	18
ISBN (Electronic)	9781624104497
DOIs	https://doi.org/10.2514/6.2017-1069
Publication status	Published - 2017
Event	AIAA Information Systems-Infotech At Aerospace Conference, 2017 - Grapevine, United States Duration: 9 Jan 2017 → 13 Jan 2017

Conference

Conference	AIAA Information Systems-Infotech At Aerospace Conference, 2017
Country/Territory	United States
City	Grapevine
Period	9/01/17 → 13/01/17

Access to Document

10.2514/6.2017-1069

paperGerbenSchonebaum (Accepted author manuscript, 3.19 MB)

Cite this

@inproceedings{e860e94cea454793a2f3e84fde614bd1,

title = "Human demonstrations for fast and safe exploration in reinforcement learning",

abstract = "Reinforcement learning is a promising framework for controlling complex vehicles with a high level of autonomy, since it does not need a dynamic model of the vehicle and can adapt to changing conditions. When learning from scratch, the performance of a reinforcement learning controller may initially be poor and, for real-life applications, unsafe. In this paper, the effects of using human demonstrations on the performance of reinforcement learning are investigated, using a combination of offline and online least squares policy iteration. It is found that using the human as an efficient explorer improves learning time and performance for a benchmark reinforcement learning problem. The benefit of the human demonstration is larger for problems where the human can use their understanding of the problem to explore the state space efficiently. Applied to a simplified quadrotor slung-load drop-off problem, the use of human demonstrations reduces the number of crashes during learning. As such, this paper contributes to safer and faster learning for model-free, adaptive control problems.",

author = "Gerben Schonebaum and Jaime Junell and {van Kampen}, {Erik Jan}",

year = "2017",

doi = "10.2514/6.2017-1069",

language = "English",

booktitle = "AIAA Information Systems-AIAA Infotech at Aerospace, 2017",

publisher = "American Institute of Aeronautics and Astronautics Inc. (AIAA)",

address = "United States",

note = "AIAA Information Systems-Infotech At Aerospace Conference, 2017 ; Conference date: 09-01-2017 Through 13-01-2017",

}

Schonebaum, G, Junell, J & van Kampen, EJ 2017, Human demonstrations for fast and safe exploration in reinforcement learning. in AIAA Information Systems-AIAA Infotech at Aerospace, 2017, AIAA 2017-1069, American Institute of Aeronautics and Astronautics Inc. (AIAA), AIAA Information Systems-Infotech At Aerospace Conference, 2017, Grapevine, United States, 9/01/17. https://doi.org/10.2514/6.2017-1069

Human demonstrations for fast and safe exploration in reinforcement learning. / Schonebaum, Gerben; Junell, Jaime; van Kampen, Erik Jan.
AIAA Information Systems-AIAA Infotech at Aerospace, 2017. American Institute of Aeronautics and Astronautics Inc. (AIAA), 2017. AIAA 2017-1069.

TY - GEN

T1 - Human demonstrations for fast and safe exploration in reinforcement learning

AU - Schonebaum, Gerben

AU - Junell, Jaime

AU - van Kampen, Erik Jan

PY - 2017

Y1 - 2017

AB - Reinforcement learning is a promising framework for controlling complex vehicles with a high level of autonomy, since it does not need a dynamic model of the vehicle and can adapt to changing conditions. When learning from scratch, the performance of a reinforcement learning controller may initially be poor and, for real-life applications, unsafe. In this paper, the effects of using human demonstrations on the performance of reinforcement learning are investigated, using a combination of offline and online least squares policy iteration. It is found that using the human as an efficient explorer improves learning time and performance for a benchmark reinforcement learning problem. The benefit of the human demonstration is larger for problems where the human can use their understanding of the problem to explore the state space efficiently. Applied to a simplified quadrotor slung-load drop-off problem, the use of human demonstrations reduces the number of crashes during learning. As such, this paper contributes to safer and faster learning for model-free, adaptive control problems.

UR - http://www.scopus.com/inward/record.url?scp=85017460679&partnerID=8YFLogxK

UR - http://resolver.tudelft.nl/uuid:e860e94c-ea45-4793-a2f3-e84fde614bd1

U2 - 10.2514/6.2017-1069

DO - 10.2514/6.2017-1069

M3 - Conference contribution

BT - AIAA Information Systems-AIAA Infotech at Aerospace, 2017

PB - American Institute of Aeronautics and Astronautics Inc. (AIAA)

T2 - AIAA Information Systems-Infotech At Aerospace Conference, 2017

Y2 - 9 January 2017 through 13 January 2017

ER -
