Data-driven construction of symbolic process models for reinforcement learning

Erik Derner, Jiří Kubalík, Robert Babuska

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

6 Citations (Scopus)

Abstract

Reinforcement learning (RL) is a suitable approach for controlling systems with unknown or time-varying dynamics. RL in principle does not require a model of the system, but before it learns an acceptable policy, it needs many unsuccessful trials, which real robots usually cannot withstand. It is well known that RL can be sped up and made safer by using models learned online. In this paper, we propose to use symbolic regression to construct compact, parsimonious models described by analytic equations, which are suitable for real-time robot control. Single node genetic programming (SNGP) is employed as a tool to automatically search for equations fitting the available data. We demonstrate the approach on two benchmark examples: a simulated mobile robot and the pendulum swing-up problem; the latter both in simulations and real-time experiments. The results show that through this approach we can find accurate models even for small batches of training data. Based on the symbolic model found, RL can control the system well.
Original language: English
Title of host publication: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2018)
Editors: Kevin Lynch
Place of Publication: Piscataway, NJ, USA
Publisher: IEEE
Pages: 5105-5112
ISBN (Electronic): 978-1-5386-3081-5
DOIs
Publication status: Published - 2018
Event: ICRA 2018: 2018 IEEE International Conference on Robotics and Automation - Brisbane Convention & Exhibition Centre, Brisbane, Australia
Duration: 21 May 2018 to 25 May 2018

Conference

Conference: ICRA 2018: 2018 IEEE International Conference on Robotics and Automation
Country/Territory: Australia
City: Brisbane
Period: 21/05/18 to 25/05/18

Keywords

  • Model learning for control
  • AI-based methods
  • symbolic regression
  • reinforcement learning
  • optimal control
