Reinforcement learning control of constrained dynamic systems with uniformly ultimate boundedness stability guarantee

Minghao Han; Yuan Tian; Lixian Zhang; Jun Wang; Wei Pan

doi:10.1016/j.automatica.2021.109689

Corresponding author: Wei Pan

Robot Dynamics

Research output: Contribution to journal › Article › Scientific › peer-review

22 Citations (Scopus)

87 Downloads (Pure)

Abstract

Reinforcement learning (RL) is promising for complicated stochastic nonlinear control problems. Without using a mathematical model, an optimal controller can be learned through trial and error from data evaluated by certain performance criteria. However, data-based learning approaches are notorious for not guaranteeing stability, which is the most fundamental property of any control system. In this paper, the classical Lyapunov method is used to analyze uniformly ultimate boundedness (UUB) stability solely from data, without a mathematical model. It is further shown how RL with a UUB guarantee can be applied to control dynamic systems with safety constraints. Based on the theoretical results, both off-policy and on-policy learning algorithms are proposed. As a result, optimal controllers can be learned that guarantee UUB of the closed-loop system both at convergence and during learning. The proposed algorithms are evaluated on a series of robotic continuous control tasks with safety constraints. Compared with existing RL algorithms, the proposed method achieves superior performance in maintaining safety. As a qualitative evaluation of stability, the method shows resilience even in the presence of external disturbances.
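
As background for the term used above, the block below gives the standard textbook definition of uniformly ultimate boundedness for a deterministic system. It is not taken from the paper: the paper addresses stochastic systems and may state the property differently (for example, in expectation). The symbols x, f, a, b, c, T and t_0 are introduced here purely for illustration.

```latex
% Textbook (deterministic) definition, given as background only;
% it is NOT quoted from the paper.
% The solutions of \dot{x} = f(t, x) are uniformly ultimately bounded
% with ultimate bound b if there exist constants b > 0 and c > 0,
% independent of t_0 \ge 0, such that for every a \in (0, c) there is
% a time T = T(a, b) \ge 0, independent of t_0, with
\[
  \|x(t_0)\| \le a
  \;\Longrightarrow\;
  \|x(t)\| \le b
  \quad \text{for all } t \ge t_0 + T .
\]
```

In this form the property only requires trajectories to enter a bounded set after a finite time and remain there. It does not require convergence to an equilibrium.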

Original language	English
Article number	109689
Journal	Automatica
Volume	129
DOIs	https://doi.org/10.1016/j.automatica.2021.109689
Publication status	Published - 2021

Keywords

Constrained dynamic system
Data-based control
Lyapunov's method
Reinforcement learning
Uniformly ultimate boundedness stability

Access to Document

10.1016/j.automatica.2021.109689

1-s2.0-S0005109821002090-main: Final published version, 2.67 MB, Licence: CC BY

Cite this

@article{d9350bab74754ffab78b945d13a0c965,
  title = "Reinforcement learning control of constrained dynamic systems with uniformly ultimate boundedness stability guarantee",
  abstract = "Reinforcement learning (RL) is promising for complicated stochastic nonlinear control problems. Without using a mathematical model, an optimal controller can be learned from data evaluated by certain performance criteria through trial-and-error. However, the data-based learning approach is notorious for not guaranteeing stability, which is the most fundamental property for any control system. In this paper, the classic Lyapunov's method is explored to analyze the uniformly ultimate boundedness stability (UUB) solely based on data without using a mathematical model. It is further shown how RL with UUB guarantee can be applied to control dynamic systems with safety constraints. Based on the theoretical results, both off-policy and on-policy learning algorithms are proposed respectively. As a result, optimal controllers can be learned to guarantee UUB of the closed-loop system both at convergence and during learning. The proposed algorithms are evaluated on a series of robotic continuous control tasks with safety constraints. In comparison with the existing RL algorithms, the proposed method can achieve superior performance in terms of maintaining safety. As a qualitative evaluation of stability, our method shows impressive resilience even in the presence of external disturbances.",
  keywords = "Constrained dynamic system, Data-based control, Lyapunov's method, Reinforcement learning, Uniformly ultimate boundedness stability",
  author = "Minghao Han and Yuan Tian and Lixian Zhang and Jun Wang and Wei Pan",
  year = "2021",
  doi = "10.1016/j.automatica.2021.109689",
  language = "English",
  volume = "129",
  journal = "Automatica",
  issn = "0005-1098",
  publisher = "Elsevier",
}

TY - JOUR

T1 - Reinforcement learning control of constrained dynamic systems with uniformly ultimate boundedness stability guarantee

AU - Han, Minghao

AU - Tian, Yuan

AU - Zhang, Lixian

AU - Wang, Jun

AU - Pan, Wei

PY - 2021

Y1 - 2021

AB - Reinforcement learning (RL) is promising for complicated stochastic nonlinear control problems. Without using a mathematical model, an optimal controller can be learned from data evaluated by certain performance criteria through trial-and-error. However, the data-based learning approach is notorious for not guaranteeing stability, which is the most fundamental property for any control system. In this paper, the classic Lyapunov's method is explored to analyze the uniformly ultimate boundedness stability (UUB) solely based on data without using a mathematical model. It is further shown how RL with UUB guarantee can be applied to control dynamic systems with safety constraints. Based on the theoretical results, both off-policy and on-policy learning algorithms are proposed respectively. As a result, optimal controllers can be learned to guarantee UUB of the closed-loop system both at convergence and during learning. The proposed algorithms are evaluated on a series of robotic continuous control tasks with safety constraints. In comparison with the existing RL algorithms, the proposed method can achieve superior performance in terms of maintaining safety. As a qualitative evaluation of stability, our method shows impressive resilience even in the presence of external disturbances.

KW - Constrained dynamic system

KW - Data-based control

KW - Lyapunov's method

KW - Reinforcement learning

KW - Uniformly ultimate boundedness stability

UR - http://www.scopus.com/inward/record.url?scp=85107396615&partnerID=8YFLogxK

U2 - 10.1016/j.automatica.2021.109689

DO - 10.1016/j.automatica.2021.109689

M3 - Article

AN - SCOPUS:85107396615

SN - 0005-1098

VL - 129

JO - Automatica

JF - Automatica

M1 - 109689

ER -
