Optimal model-free output synchronization of heterogeneous systems using off-policy reinforcement learning

H Modares; Subramanya Nageshrao; Gabriel Delgado Lopes; Robert Babuska; FL Lewis

doi:10.1016/j.automatica.2016.05.017

Research output: Contribution to journal › Article › Scientific › peer-review

131 Citations (Scopus)

Abstract

This paper considers optimal output synchronization of heterogeneous linear multi-agent systems. Standard approaches to output synchronization of heterogeneous systems require either the solution of the output regulator equations or the incorporation of a p-copy of the leader’s dynamics in the controller of each agent. By contrast, in this paper neither one is needed. Moreover, here both the leader’s and the follower’s dynamics are assumed to be unknown. First, a distributed adaptive observer is designed to estimate the leader’s state for each agent. The output synchronization problem is then formulated as an optimal control problem and a novel model-free off-policy reinforcement learning algorithm is developed to solve the optimal output synchronization problem online in real time. It is shown that this optimal distributed approach implicitly solves the output regulation equations without actually doing so. Simulation results are provided to verify the effectiveness of the proposed approach.

Original language	English
Pages (from-to)	334-341
Journal	Automatica
Volume	71
DOIs	https://doi.org/10.1016/j.automatica.2016.05.017
Publication status	Published - 2016

Keywords

Output synchronization
Heterogeneous systems
Reinforcement learning
Leader–follower systems

Access to Document

10.1016/j.automatica.2016.05.017

Cite this

@article{284be25e11014cef94f818b706fb6ffc,

title = "Optimal model-free output synchronization of heterogeneous systems using off-policy reinforcement learning",

abstract = "This paper considers optimal output synchronization of heterogeneous linear multi-agent systems. Standard approaches to output synchronization of heterogeneous systems require either the solution of the output regulator equations or the incorporation of a p-copy of the leader{\textquoteright}s dynamics in the controller of each agent. By contrast, in this paper neither one is needed. Moreover, here both the leader{\textquoteright}s and the follower{\textquoteright}s dynamics are assumed to be unknown. First, a distributed adaptive observer is designed to estimate the leader{\textquoteright}s state for each agent. The output synchronization problem is then formulated as an optimal control problem and a novel model-free off-policy reinforcement learning algorithm is developed to solve the optimal output synchronization problem online in real time. It is shown that this optimal distributed approach implicitly solves the output regulation equations without actually doing so. Simulation results are provided to verify the effectiveness of the proposed approach.",

keywords = "Output synchronization, Heterogeneous systems, Reinforcement learning, Leader–follower systems",

author = "H Modares and Subramanya Nageshrao and {Delgado Lopes}, Gabriel and Robert Babuska and FL Lewis",

year = "2016",

doi = "10.1016/j.automatica.2016.05.017",

language = "English",

volume = "71",

pages = "334--341",

journal = "Automatica",

issn = "0005-1098",

publisher = "Elsevier",

}

TY - JOUR

T1 - Optimal model-free output synchronization of heterogeneous systems using off-policy reinforcement learning

AU - Modares, H

AU - Nageshrao, Subramanya

AU - Delgado Lopes, Gabriel

AU - Babuska, Robert

AU - Lewis, FL

PY - 2016

Y1 - 2016

N2 - This paper considers optimal output synchronization of heterogeneous linear multi-agent systems. Standard approaches to output synchronization of heterogeneous systems require either the solution of the output regulator equations or the incorporation of a p-copy of the leader’s dynamics in the controller of each agent. By contrast, in this paper neither one is needed. Moreover, here both the leader’s and the follower’s dynamics are assumed to be unknown. First, a distributed adaptive observer is designed to estimate the leader’s state for each agent. The output synchronization problem is then formulated as an optimal control problem and a novel model-free off-policy reinforcement learning algorithm is developed to solve the optimal output synchronization problem online in real time. It is shown that this optimal distributed approach implicitly solves the output regulation equations without actually doing so. Simulation results are provided to verify the effectiveness of the proposed approach.

AB - This paper considers optimal output synchronization of heterogeneous linear multi-agent systems. Standard approaches to output synchronization of heterogeneous systems require either the solution of the output regulator equations or the incorporation of a p-copy of the leader’s dynamics in the controller of each agent. By contrast, in this paper neither one is needed. Moreover, here both the leader’s and the follower’s dynamics are assumed to be unknown. First, a distributed adaptive observer is designed to estimate the leader’s state for each agent. The output synchronization problem is then formulated as an optimal control problem and a novel model-free off-policy reinforcement learning algorithm is developed to solve the optimal output synchronization problem online in real time. It is shown that this optimal distributed approach implicitly solves the output regulation equations without actually doing so. Simulation results are provided to verify the effectiveness of the proposed approach.

KW - Output synchronization

KW - Heterogeneous systems

KW - Reinforcement learning

KW - Leader–follower systems

U2 - 10.1016/j.automatica.2016.05.017

DO - 10.1016/j.automatica.2016.05.017

M3 - Article

SN - 0005-1098

VL - 71

SP - 334

EP - 341

JO - Automatica

JF - Automatica

ER -
