Approximate dynamic programming for constrained linear systems: A piecewise quadratic approximation approach

Kanghui He; Shengling Shi; Ton van den Boom; Bart De Schutter

doi:10.1016/j.automatica.2023.111456

Kanghui He^*, Shengling Shi, Ton van den Boom, Bart De Schutter

^*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Approximate dynamic programming (ADP) faces challenges in dealing with constraints in control problems. Model predictive control (MPC) is, in comparison, well-known for its accommodation of constraints and stability guarantees, although its computation is sometimes prohibitive. This paper introduces an approach combining the two methodologies to overcome their individual limitations. The predictive control law for constrained linear quadratic regulation (CLQR) problems has been proven to be piecewise affine (PWA) while the value function is piecewise quadratic. We exploit these formal results from MPC to design an ADP method for CLQR problems with a known model. A novel convex and piecewise quadratic neural network with a local–global architecture is proposed to provide an accurate approximation of the value function, which is used as the cost-to-go function in the online dynamic programming problem. An efficient decomposition algorithm is developed to generate the control policy and speed up the online computation. Rigorous stability analysis of the closed-loop system is conducted for the proposed control scheme under the condition that a good approximation of the value function is achieved. Comparative simulations are carried out to demonstrate the potential of the proposed method in terms of online computation and optimality.
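As a rough illustration of the idea in the abstract (a value-function approximation used as the cost-to-go in a one-step online problem, giving a piecewise affine control law), the toy sketch below uses a scalar system with an input bound. It is not the paper's neural network or its decomposition algorithm: the system `x+ = a*x + b*u`, the weights `q`, `r`, the quadratic cost-to-go `p*x^2`, the bound `u_max`, and the function names are all assumptions chosen for illustration.

```python
def one_step_policy(x, a=1.2, b=1.0, q=1.0, r=1.0, p=2.0, u_max=1.0):
    """Minimise q*x^2 + r*u^2 + p*(a*x + b*u)^2 subject to |u| <= u_max.

    p*x^2 stands in for an approximate value function (hypothetical choice).
    """
    # Stationarity of the objective in u: 2*r*u + 2*p*b*(a*x + b*u) = 0.
    u_free = -(p * a * b) / (r + p * b * b) * x
    # The objective is strictly convex in the scalar u, so projecting the
    # unconstrained minimiser onto [-u_max, u_max] gives the constrained one.
    return max(-u_max, min(u_max, u_free))


def simulate(x0, steps=20, a=1.2, b=1.0):
    """Run the closed loop x+ = a*x + b*u with the one-step policy."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = one_step_policy(x, a=a, b=b)
        x = a * x + b * u
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    for x in (-5.0, -0.5, 0.0, 0.5, 5.0):
        print(x, one_step_policy(x))
    print(simulate(0.5)[-1])
```

The resulting law is linear in `x` near the origin and saturates at the input bound elsewhere, which is the simplest instance of a piecewise affine policy. In more than one input dimension, clipping the unconstrained minimiser is generally not optimal and a quadratic program would be needed instead.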

Original language	English
Article number	111456
Number of pages	9
Journal	Automatica
Volume	160
DOIs	https://doi.org/10.1016/j.automatica.2023.111456
Publication status	Published - 2024

Funding

This paper is part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant agreement No. 101018826 - CLariNet). The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Alessandro Abate under the direction of Editor Ian R. Petersen.

Keywords

Approximate dynamic programming
Constrained linear quadratic regulation
Model predictive control
Neural networks
Reinforcement learning
Value function approximation

Access to Document

10.1016/j.automatica.2023.111456

1-s2.0-S0005109823006234-main: Final published version, 1.02 MB, Licence: CC BY

Cite this

@article{3f015c0df06e4c04a2f034b28640141a,

title = "Approximate dynamic programming for constrained linear systems: A piecewise quadratic approximation approach",

abstract = "Approximate dynamic programming (ADP) faces challenges in dealing with constraints in control problems. Model predictive control (MPC) is, in comparison, well-known for its accommodation of constraints and stability guarantees, although its computation is sometimes prohibitive. This paper introduces an approach combining the two methodologies to overcome their individual limitations. The predictive control law for constrained linear quadratic regulation (CLQR) problems has been proven to be piecewise affine (PWA) while the value function is piecewise quadratic. We exploit these formal results from MPC to design an ADP method for CLQR problems with a known model. A novel convex and piecewise quadratic neural network with a local–global architecture is proposed to provide an accurate approximation of the value function, which is used as the cost-to-go function in the online dynamic programming problem. An efficient decomposition algorithm is developed to generate the control policy and speed up the online computation. Rigorous stability analysis of the closed-loop system is conducted for the proposed control scheme under the condition that a good approximation of the value function is achieved. Comparative simulations are carried out to demonstrate the potential of the proposed method in terms of online computation and optimality.",

keywords = "Approximate dynamic programming, Constrained linear quadratic regulation, Model predictive control, Neural networks, Reinforcement learning, Value function approximation",

author = "Kanghui He and Shengling Shi and {van den Boom}, Ton and {De Schutter}, Bart",

year = "2024",

doi = "10.1016/j.automatica.2023.111456",

language = "English",

volume = "160",

journal = "Automatica",

issn = "0005-1098",

publisher = "Elsevier",

}

TY - JOUR

T1 - Approximate dynamic programming for constrained linear systems

T2 - A piecewise quadratic approximation approach

AU - He, Kanghui

AU - Shi, Shengling

AU - van den Boom, Ton

AU - De Schutter, Bart

PY - 2024

Y1 - 2024

N2 - Approximate dynamic programming (ADP) faces challenges in dealing with constraints in control problems. Model predictive control (MPC) is, in comparison, well-known for its accommodation of constraints and stability guarantees, although its computation is sometimes prohibitive. This paper introduces an approach combining the two methodologies to overcome their individual limitations. The predictive control law for constrained linear quadratic regulation (CLQR) problems has been proven to be piecewise affine (PWA) while the value function is piecewise quadratic. We exploit these formal results from MPC to design an ADP method for CLQR problems with a known model. A novel convex and piecewise quadratic neural network with a local–global architecture is proposed to provide an accurate approximation of the value function, which is used as the cost-to-go function in the online dynamic programming problem. An efficient decomposition algorithm is developed to generate the control policy and speed up the online computation. Rigorous stability analysis of the closed-loop system is conducted for the proposed control scheme under the condition that a good approximation of the value function is achieved. Comparative simulations are carried out to demonstrate the potential of the proposed method in terms of online computation and optimality.

AB - Approximate dynamic programming (ADP) faces challenges in dealing with constraints in control problems. Model predictive control (MPC) is, in comparison, well-known for its accommodation of constraints and stability guarantees, although its computation is sometimes prohibitive. This paper introduces an approach combining the two methodologies to overcome their individual limitations. The predictive control law for constrained linear quadratic regulation (CLQR) problems has been proven to be piecewise affine (PWA) while the value function is piecewise quadratic. We exploit these formal results from MPC to design an ADP method for CLQR problems with a known model. A novel convex and piecewise quadratic neural network with a local–global architecture is proposed to provide an accurate approximation of the value function, which is used as the cost-to-go function in the online dynamic programming problem. An efficient decomposition algorithm is developed to generate the control policy and speed up the online computation. Rigorous stability analysis of the closed-loop system is conducted for the proposed control scheme under the condition that a good approximation of the value function is achieved. Comparative simulations are carried out to demonstrate the potential of the proposed method in terms of online computation and optimality.

KW - Approximate dynamic programming

KW - Constrained linear quadratic regulation

KW - Model predictive control

KW - Neural networks

KW - Reinforcement learning

KW - Value function approximation

UR - http://www.scopus.com/inward/record.url?scp=85182206256&partnerID=8YFLogxK

U2 - 10.1016/j.automatica.2023.111456

DO - 10.1016/j.automatica.2023.111456

M3 - Article

AN - SCOPUS:85182206256

SN - 0005-1098

VL - 160

JO - Automatica

JF - Automatica

M1 - 111456

ER -
