Symbolic Regression Methods for Reinforcement Learning

Jiri Kubalik; Erik Derner; Jan Zegklitz; Robert Babuska

doi:10.1109/ACCESS.2021.3119000

Corresponding author: Jiri Kubalik

Learning & Autonomous Control

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Reinforcement learning algorithms can solve dynamic decision-making and optimal control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: They are black-box models offering little insight into the mappings learned, and they require extensive trial and error tuning of their hyper-parameters. In this paper, we propose a new approach to constructing smooth value functions in the form of analytic expressions by using symbolic regression. We introduce three off-line methods for finding value functions based on a state-transition model: Symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation. The methods are illustrated on four nonlinear control problems: Velocity control under friction, one-link and two-link pendulum swing-up, and magnetic manipulation. The results show that the value functions yield well-performing policies and are compact, mathematically tractable, and easy to plug into other algorithms. This makes them potentially suitable for further analysis of the closed-loop system. A comparison with an alternative approach using neural networks shows that our method outperforms the neural network-based one.

Original language	English
Pages (from-to)	139697-139711
Journal	IEEE Access
Volume	9
DOIs	https://doi.org/10.1109/ACCESS.2021.3119000
Publication status	Published - 2021

Keywords

genetic programming
nonlinear optimal control
policy iteration
Reinforcement learning
symbolic regression
value iteration

Access to Document

10.1109/ACCESS.2021.3119000

Symbolic_Regression_Methods_for_Reinforcement_Learning: Final published version, 2.62 MB, Licence: CC BY

Cite this

@article{024116022f704deb8ede29f98ff8f91b,

title = "Symbolic Regression Methods for Reinforcement Learning",

abstract = "Reinforcement learning algorithms can solve dynamic decision-making and optimal control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: They are black-box models offering little insight into the mappings learned, and they require extensive trial and error tuning of their hyper-parameters. In this paper, we propose a new approach to constructing smooth value functions in the form of analytic expressions by using symbolic regression. We introduce three off-line methods for finding value functions based on a state-transition model: Symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation. The methods are illustrated on four nonlinear control problems: Velocity control under friction, one-link and two-link pendulum swing-up, and magnetic manipulation. The results show that the value functions yield well-performing policies and are compact, mathematically tractable, and easy to plug into other algorithms. This makes them potentially suitable for further analysis of the closed-loop system. A comparison with an alternative approach using neural networks shows that our method outperforms the neural network-based one. ",

keywords = "genetic programming, nonlinear optimal control, policy iteration, Reinforcement learning, symbolic regression, value iteration",

author = "Jiri Kubalik and Erik Derner and Jan Zegklitz and Robert Babuska",

year = "2021",

doi = "10.1109/ACCESS.2021.3119000",

language = "English",

volume = "9",

pages = "139697--139711",

journal = "IEEE Access",

issn = "2169-3536",

publisher = "IEEE",

}

TY - JOUR

T1 - Symbolic Regression Methods for Reinforcement Learning

AU - Kubalik, Jiri

AU - Derner, Erik

AU - Zegklitz, Jan

AU - Babuska, Robert

PY - 2021

Y1 - 2021

N2 - Reinforcement learning algorithms can solve dynamic decision-making and optimal control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: They are black-box models offering little insight into the mappings learned, and they require extensive trial and error tuning of their hyper-parameters. In this paper, we propose a new approach to constructing smooth value functions in the form of analytic expressions by using symbolic regression. We introduce three off-line methods for finding value functions based on a state-transition model: Symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation. The methods are illustrated on four nonlinear control problems: Velocity control under friction, one-link and two-link pendulum swing-up, and magnetic manipulation. The results show that the value functions yield well-performing policies and are compact, mathematically tractable, and easy to plug into other algorithms. This makes them potentially suitable for further analysis of the closed-loop system. A comparison with an alternative approach using neural networks shows that our method outperforms the neural network-based one.

AB - Reinforcement learning algorithms can solve dynamic decision-making and optimal control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: They are black-box models offering little insight into the mappings learned, and they require extensive trial and error tuning of their hyper-parameters. In this paper, we propose a new approach to constructing smooth value functions in the form of analytic expressions by using symbolic regression. We introduce three off-line methods for finding value functions based on a state-transition model: Symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation. The methods are illustrated on four nonlinear control problems: Velocity control under friction, one-link and two-link pendulum swing-up, and magnetic manipulation. The results show that the value functions yield well-performing policies and are compact, mathematically tractable, and easy to plug into other algorithms. This makes them potentially suitable for further analysis of the closed-loop system. A comparison with an alternative approach using neural networks shows that our method outperforms the neural network-based one.

KW - genetic programming

KW - nonlinear optimal control

KW - policy iteration

KW - Reinforcement learning

KW - symbolic regression

KW - value iteration

UR - http://www.scopus.com/inward/record.url?scp=85117079506&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2021.3119000

DO - 10.1109/ACCESS.2021.3119000

M3 - Article

AN - SCOPUS:85117079506

SN - 2169-3536

VL - 9

SP - 139697

EP - 139711

JO - IEEE Access

JF - IEEE Access

ER -
