Online reinforcement learning control for aerospace systems

Ye Zhou

doi:10.4233/uuid:5b875915-2518-4ec8-a1a0-07ad057edab4

Control & Simulation

Research output: Thesis › Dissertation (TU Delft)

Abstract

Reinforcement Learning (RL) methods are relatively new in the field of aerospace guidance, navigation, and control. This dissertation aims to exploit RL methods to improve the autonomy and online learning of aerospace systems with respect to the a priori unknown system and environment, dynamical uncertainties, and partial observability. In the first part of this dissertation, incremental Approximate Dynamic Programming (iADP) methods are proposed. Instead of using nonlinear function approximators to approximate the true cost-to-go, iADP methods use an (extended) incremental model to deal with the nonlinearity of unknown systems and uncertainties of the environment. In the second part, online Adaptive Critic Designs (ACDs) are proposed based on the incremental model. This method replaces the global system model approximator with an incremental model. This approach, therefore, does not need off-line training stages and may accelerate online learning. In the third part, the hybrid Hierarchical Reinforcement Learning (hHRL) method is proposed for guidance and navigation problems. This method consists of several hierarchical levels, where each level uses different methods to optimize the learning with different types of information and objectives. In conclusion, this dissertation contributes several methods that improve the intelligence and autonomy of aerospace systems. These improvements come mainly from three perspectives: 1) enhancing the adaptability and efficiency of low-level control, 2) improving the intelligence and online learning ability of guidance, navigation, and control, and 3) creating a well-organized hierarchy to ensure coordination between levels. The proposed methods provide novel insights for both the reinforcement learning research community and developers of aerospace automatic control systems.

Original language	English
Awarding Institution	Delft University of Technology
Supervisors/Advisors	Mulder, M., Supervisor; Chu, Q., Supervisor
Award date	11 Apr 2018
Electronic ISBNs	978-94-6366-021-1
DOIs	https://doi.org/10.4233/uuid:5b875915-2518-4ec8-a1a0-07ad057edab4
Publication status	Published - 2018

Keywords

Reinforcement Learning
Aerospace Systems
Optimal Adaptive Control
Approximate Dynamic Programming
Adaptive Critic Designs
Incremental Model
Nonlinear Systems
Partial Observability
Hierarchical Reinforcement Learning
Hybrid Methods

Access to Document

10.4233/uuid:5b875915-2518-4ec8-a1a0-07ad057edab4

Dissertation_Ye Zhou (Final published version, 3.74 MB)

Cite this

@phdthesis{5b87591525184ec8a1a007ad057edab4,

title = "Online reinforcement learning control for aerospace systems",

abstract = "Reinforcement Learning (RL) methods are relatively new in the field of aerospace guidance, navigation, and control. This dissertation aims to exploit RL methods to improve the autonomy and online learning of aerospace systems with respect to the a priori unknown system and environment, dynamical uncertainties, and partial observability. In the first part of this dissertation, incremental Approximate Dynamic Programming (iADP) methods are proposed. Instead of using nonlinear function approximators to approximate the true cost-to-go, iADP methods use an (extended) incremental model to deal with the nonlinearity of unknown systems and uncertainties of the environment. In the second part, online Adaptive Critic Designs (ACDs) are proposed based on the incremental model. This method replaces the global system model approximator with an incremental model. This approach, therefore, does not need off-line training stages and may accelerate online learning. In the third part, the hybrid Hierarchical Reinforcement Learning (hHRL) method is proposed for guidance and navigation problems. This method consists of several hierarchical levels, where each level uses different methods to optimize the learning with different types of information and objectives. In conclusion, this dissertation contributes several methods that improve the intelligence and autonomy of aerospace systems. These improvements come mainly from three perspectives: 1) enhancing the adaptability and efficiency of low-level control, 2) improving the intelligence and online learning ability of guidance, navigation, and control, and 3) creating a well-organized hierarchy to ensure coordination between levels. The proposed methods provide novel insights for both the reinforcement learning research community and developers of aerospace automatic control systems.",

keywords = "Reinforcement Learning, Aerospace Systems, Optimal Adaptive Control, Approximate Dynamic Programming, Adaptive Critic Designs, Incremental Model, Nonlinear Systems, Partial Observability, Hierarchical Reinforcement Learning, Hybrid Methods",

author = "Ye Zhou",

year = "2018",

doi = "10.4233/uuid:5b875915-2518-4ec8-a1a0-07ad057edab4",

language = "English",

type = "Dissertation (TU Delft)",

school = "Delft University of Technology",

}

TY - THES

T1 - Online reinforcement learning control for aerospace systems

AU - Zhou, Ye

PY - 2018

Y1 - 2018

AB - Reinforcement Learning (RL) methods are relatively new in the field of aerospace guidance, navigation, and control. This dissertation aims to exploit RL methods to improve the autonomy and online learning of aerospace systems with respect to the a priori unknown system and environment, dynamical uncertainties, and partial observability. In the first part of this dissertation, incremental Approximate Dynamic Programming (iADP) methods are proposed. Instead of using nonlinear function approximators to approximate the true cost-to-go, iADP methods use an (extended) incremental model to deal with the nonlinearity of unknown systems and uncertainties of the environment. In the second part, online Adaptive Critic Designs (ACDs) are proposed based on the incremental model. This method replaces the global system model approximator with an incremental model. This approach, therefore, does not need off-line training stages and may accelerate online learning. In the third part, the hybrid Hierarchical Reinforcement Learning (hHRL) method is proposed for guidance and navigation problems. This method consists of several hierarchical levels, where each level uses different methods to optimize the learning with different types of information and objectives. In conclusion, this dissertation contributes several methods that improve the intelligence and autonomy of aerospace systems. These improvements come mainly from three perspectives: 1) enhancing the adaptability and efficiency of low-level control, 2) improving the intelligence and online learning ability of guidance, navigation, and control, and 3) creating a well-organized hierarchy to ensure coordination between levels. The proposed methods provide novel insights for both the reinforcement learning research community and developers of aerospace automatic control systems.

KW - Reinforcement Learning

KW - Aerospace Systems

KW - Optimal Adaptive Control

KW - Approximate Dynamic Programming

KW - Adaptive Critic Designs

KW - Incremental Model

KW - Nonlinear Systems

KW - Partial Observability

KW - Hierarchical Reinforcement Learning

KW - Hybrid Methods

UR - http://resolver.tudelft.nl/uuid:5b875915-2518-4ec8-a1a0-07ad057edab4

U2 - 10.4233/uuid:5b875915-2518-4ec8-a1a0-07ad057edab4

DO - 10.4233/uuid:5b875915-2518-4ec8-a1a0-07ad057edab4

M3 - Dissertation (TU Delft)

ER -
