Bayesian quadrature policy optimization for spacecraft proximity maneuvers and docking

Desong Du, Yanfang Liu, Ouyang Zhang, Naiming Qi, Weiran Yao, Wei Pan*

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Advancing autonomous spacecraft proximity maneuvers and docking (PMD) is crucial for enhancing the efficiency and safety of inter-satellite services. One primary challenge in PMD is the accurate a priori definition of the system model, often complicated by inherent uncertainties in the system modeling and observational data. To address this challenge, we propose a novel Lyapunov Bayesian actor-critic reinforcement learning algorithm that guarantees the stability of the control policy under uncertainty. The PMD task is formulated as a Markov decision process that involves the relative dynamic model, the docking cone, and the cost function. By applying Lyapunov theory, we reformulate temporal difference learning as a constrained Gaussian process regression, enabling the state-value function to act as a Lyapunov function. Additionally, the proposed Bayesian quadrature policy optimization method analytically computes policy gradients, effectively addressing stability constraints while accommodating informational uncertainties in the PMD task. Experimental validation on a spacecraft air-bearing testbed demonstrates the significant and promising performance of the proposed algorithm.
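
As a rough illustration of the quadrature idea named in the abstract, the sketch below applies Bayesian quadrature to a one-dimensional integral against a Gaussian measure. It is not the authors' algorithm: the paper applies the idea to policy-gradient integrals with a constrained Gaussian process critic, whereas this toy assumes a zero-mean Gaussian process with an RBF kernel, for which the kernel mean has a closed form. All function names, node locations, and hyperparameters here are hypothetical choices made for illustration.

```python
import math


def rbf(x, y, ell):
    """RBF (squared-exponential) kernel with unit variance and lengthscale ell."""
    return math.exp(-0.5 * ((x - y) / ell) ** 2)


def kernel_mean(x, mu, sigma, ell):
    """Closed form of the integral of rbf(t, x, ell) * N(t; mu, sigma^2) over t."""
    var = ell ** 2 + sigma ** 2
    return ell / math.sqrt(var) * math.exp(-0.5 * (x - mu) ** 2 / var)


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= factor * M[c][j]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def bq_estimate(f, nodes, mu=0.0, sigma=1.0, ell=1.0, jitter=1e-10):
    """Posterior-mean estimate of the integral of f(t) * N(t; mu, sigma^2).

    The estimate is z^T K^{-1} y, where K is the kernel matrix at the nodes,
    z holds the kernel means, and y holds the function values at the nodes.
    """
    K = [
        [rbf(a, b, ell) + (jitter if i == j else 0.0) for j, b in enumerate(nodes)]
        for i, a in enumerate(nodes)
    ]
    z = [kernel_mean(x, mu, sigma, ell) for x in nodes]
    weights = solve(K, z)  # K is symmetric, so these are the quadrature weights
    return sum(w * f(x) for w, x in zip(weights, nodes))


# Example call: estimate E[t^2] under a standard normal from five evaluations.
estimate = bq_estimate(lambda t: t ** 2, [-2.0, -1.0, 0.0, 1.0, 2.0])
```

Because the weights depend only on the kernel and the node locations, the integral estimate is obtained analytically from a handful of function evaluations, which is the property that quadrature-based policy-gradient methods exploit.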
Original language: English
Article number: 109474
Number of pages: 10
Journal: Aerospace Science and Technology
Volume: 154
Publication status: Published - 2024

Keywords

  • Bayesian quadrature policy optimization
  • Proximity maneuvers and docking
  • Reinforcement learning
