TY - JOUR
T1 - Incremental model based online heuristic dynamic programming for nonlinear adaptive tracking control with partial observability
AU - Zhou, Ye
AU - van Kampen, Erik Jan
AU - Chu, Qiping
PY - 2020/10/1
Y1 - 2020/10/1
N2 - Heuristic dynamic programming is a class of reinforcement learning, which has been introduced to aerospace engineering to solve nonlinear, optimal adaptive control problems. However, it requires an off-line learning stage to train a global system model to represent the system dynamics. This paper uses an incremental model in heuristic dynamic programming to improve the online learning ability, which is incremental model based heuristic dynamic programming. The trait of the online identification of the incremental model makes this method an option for fault-tolerant control and partially observable control problems. This study, therefore, also extends this method to deal with partial observability. The presented method has been validated on two different online tracking problems: missile fault-tolerant control with full-state measurements and also spacecraft attitude control disturbed with liquid sloshing under partially observable conditions. The results reveal that the proposed method outperforms the conventional heuristic dynamic programming method in fault-tolerant control tasks, deals with partial observability, and is robust to internal uncertainties and external disturbances.
AB - Heuristic dynamic programming is a class of reinforcement learning, which has been introduced to aerospace engineering to solve nonlinear, optimal adaptive control problems. However, it requires an off-line learning stage to train a global system model to represent the system dynamics. This paper uses an incremental model in heuristic dynamic programming to improve the online learning ability, which is incremental model based heuristic dynamic programming. The trait of the online identification of the incremental model makes this method an option for fault-tolerant control and partially observable control problems. This study, therefore, also extends this method to deal with partial observability. The presented method has been validated on two different online tracking problems: missile fault-tolerant control with full-state measurements and also spacecraft attitude control disturbed with liquid sloshing under partially observable conditions. The results reveal that the proposed method outperforms the conventional heuristic dynamic programming method in fault-tolerant control tasks, deals with partial observability, and is robust to internal uncertainties and external disturbances.
KW - Adaptive nonlinear flight control
KW - Heuristic dynamic programming
KW - Incremental techniques
KW - Online reinforcement learning
KW - Partial observability
UR - http://www.scopus.com/inward/record.url?scp=85086580235&partnerID=8YFLogxK
U2 - 10.1016/j.ast.2020.106013
DO - 10.1016/j.ast.2020.106013
M3 - Article
AN - SCOPUS:85086580235
SN - 1270-9638
VL - 105
JO - Aerospace Science and Technology
JF - Aerospace Science and Technology
M1 - 106013
ER -