Abstract
Reinforcement learning (RL) enables robots to learn skills from interactions with the real world. In practice, the unstructured step-based exploration used in Deep RL – often very successful in simulation – leads to jerky motion patterns on real robots. The resulting shaky behavior causes poor exploration and can even damage the robot. We address these issues by adapting state-dependent exploration (SDE) [1] to current Deep RL algorithms. To enable this adaptation, we propose two extensions to the original SDE: using more general features and re-sampling the noise periodically. This leads to a new exploration method, generalized state-dependent exploration (gSDE). We evaluate gSDE both in simulation, on PyBullet continuous control tasks, and directly on three different real robots: a tendon-driven elastic robot, a quadruped, and an RC car. The noise sampling interval of gSDE enables a compromise between performance and smoothness, which allows training directly on the real robots without loss of performance.
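To illustrate the core idea, below is a minimal NumPy sketch of state-dependent exploration with periodic noise re-sampling, roughly following the description in the abstract. The feature function, dimensions, and `sample_freq` value are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def sample_exploration_matrix(feature_dim, action_dim, log_std, rng):
    """Sample one exploration weight matrix theta_eps ~ N(0, sigma^2)."""
    sigma = np.exp(log_std)                          # per-element standard deviation
    return rng.normal(0.0, sigma, size=(action_dim, feature_dim))

# Illustrative setup (dimensions, log_std, and sample_freq are assumptions)
rng = np.random.default_rng(0)
feature_dim, action_dim = 8, 2
log_std = np.full((action_dim, feature_dim), -1.0)
sample_freq = 16                                     # re-sample the noise every 16 steps

theta_eps = sample_exploration_matrix(feature_dim, action_dim, log_std, rng)
for step in range(64):
    if step % sample_freq == 0:                      # periodic re-sampling (gSDE)
        theta_eps = sample_exploration_matrix(feature_dim, action_dim, log_std, rng)
    features = rng.standard_normal(feature_dim)      # placeholder for policy features z(s_t)
    mean_action = np.zeros(action_dim)               # placeholder for the deterministic policy output
    action = mean_action + theta_eps @ features      # state-dependent exploration noise
```

In this sketch, a `sample_freq` of 1 re-samples the exploration matrix at every step (close to unstructured exploration), while a very large `sample_freq` keeps it fixed for the whole episode as in the original SDE; intermediate values give the performance/smoothness trade-off the abstract refers to.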
| Original language | English |
|---|---|
| Pages (from-to) | 1634-1644 |
| Number of pages | 11 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 164 |
| Publication status | Published - 2021 |
| Event | 5th Conference on Robot Learning, CoRL 2021, London, United Kingdom, 8 Nov 2021 → 11 Nov 2021 |