Bayesian quadrature policy optimization for spacecraft proximity maneuvers and docking

Abstract
Advancing autonomous spacecraft proximity maneuvers and docking (PMD) is crucial for enhancing the efficiency and safety of inter-satellite services. One primary challenge in PMD is the accurate a priori definition of the system model, often complicated by inherent uncertainties in the system modeling and observational data. To address this challenge, we propose a novel Lyapunov Bayesian actor-critic reinforcement learning algorithm that guarantees the stability of the control policy under uncertainty. The PMD task is formulated as a Markov decision process that involves the relative dynamic model, the docking cone, and the cost function. By applying Lyapunov theory, we reformulate temporal difference learning as a constrained Gaussian process regression, enabling the state-value function to act as a Lyapunov function. Additionally, the proposed Bayesian quadrature policy optimization method analytically computes policy gradients, effectively addressing stability constraints while accommodating informational uncertainties in the PMD task. Experimental validation on a spacecraft air-bearing testbed demonstrates the promising performance of the proposed algorithm.
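To illustrate the core mechanism behind Bayesian quadrature that the abstract refers to, the sketch below estimates an expectation under a Gaussian measure by placing a Gaussian-process prior on the integrand; the kernel-mean integrals of an RBF kernel against a Gaussian are available in closed form, which is what makes the gradient computation analytic. This is a minimal one-dimensional sketch under assumed hyperparameters (`ell`, `jitter`), not the paper's implementation, which applies the same idea to the policy-gradient integrand.

```python
import numpy as np

def bq_expectation(x, y, ell=1.0, jitter=1e-6):
    """Bayesian-quadrature posterior mean of E_{x ~ N(0,1)}[f(x)].

    A GP prior with RBF kernel k(x, x') = exp(-(x - x')^2 / (2 ell^2))
    is placed on f; given samples (x_i, y_i = f(x_i)), the posterior mean
    of the integral is z^T K^{-1} y, where z_i is the kernel mean
    integrated against the standard normal (closed form for RBF).
    """
    X = x[:, None]
    # Gram matrix of the RBF kernel, with jitter for numerical stability.
    K = np.exp(-0.5 * (X - X.T) ** 2 / ell**2) + jitter * np.eye(len(x))
    # z_i = \int k(x, x_i) N(x; 0, 1) dx  (Gaussian-times-RBF integral)
    z = (ell / np.sqrt(ell**2 + 1.0)) * np.exp(-0.5 * x**2 / (ell**2 + 1.0))
    return z @ np.linalg.solve(K, y)

rng = np.random.default_rng(0)
x = rng.standard_normal(50)
est = bq_expectation(x, x**2)  # true value of E[x^2] under N(0,1) is 1
```

Because the estimate is a GP posterior mean, it comes with a posterior variance as well, which is how informational uncertainty in the gradient can be tracked rather than discarded.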
| Field | Value |
|---|---|
| Original language | English |
| Article number | 109474 |
| Number of pages | 10 |
| Journal | Aerospace Science and Technology |
| Volume | 154 |
| DOIs | |
| Publication status | Published - 2024 |
Keywords
- Bayesian quadrature policy optimization
- Proximity maneuvers and docking
- Reinforcement learning