HER-PDQN: A Reinforcement Learning Approach for UAV Navigation with Hybrid Action Spaces and Sparse Rewards

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific > peer-review

Abstract

Reinforcement learning (RL) equipped with neural networks has recently led to a wide range of successes in learning policies for unmanned aerial vehicle (UAV) navigation and control problems. The success of RL relies on two human-designed heuristics: appropriate action space definition and reward function engineering. The fully continuous or fully discrete action spaces commonly used in optimal control and decision-making problems may lack control authority and remove the inherent problem structure, which can degrade learning performance. In addition, reward engineering requires substantial human effort and may lead to unwanted behavior. In this paper, we address these challenges by proposing a new off-policy RL algorithm called HER-PDQN, which combines Hindsight Experience Replay (HER) with Parameterized Deep Q-Networks (P-DQN). In simulation experiments, HER-PDQN is used to train an agent to complete a UAV navigation task in a two-dimensional environment. The results indicate the effectiveness of the P-DQN algorithm in dealing with both the hybrid action space and sparse rewards. This paper can be considered the first attempt at applying RL in a sparse-reward setting for UAV navigation with hybrid action spaces.
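
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of hindsight relabeling applied to transitions with a hybrid action (a discrete choice plus a continuous parameter). It is not the authors' code. The names `HybridAction`, `Transition`, `sparse_reward`, and `her_relabel`, the 0/-1 reward values, the distance tolerance, and the "future" goal-sampling strategy are all assumptions made for illustration; the P-DQN networks themselves are omitted.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

State = Tuple[float, float]  # assumed: 2-D position of the UAV
Goal = Tuple[float, float]   # assumed: 2-D target position


@dataclass(frozen=True)
class HybridAction:
    kind: int     # discrete choice (hypothetical, e.g. 0 = translate, 1 = turn)
    param: float  # continuous parameter attached to that choice


@dataclass(frozen=True)
class Transition:
    state: State
    action: HybridAction
    next_state: State
    goal: Goal
    reward: float
    done: bool


def sparse_reward(achieved: State, goal: Goal, tol: float = 0.1) -> float:
    """Assumed sparse reward: 0 within `tol` of the goal, otherwise -1."""
    return 0.0 if math.dist(achieved, goal) <= tol else -1.0


def her_relabel(
    episode: List[Transition],
    k: int = 4,
    rng: Optional[random.Random] = None,
    tol: float = 0.1,
) -> List[Transition]:
    """Return the episode plus k hindsight copies of every transition.

    Each copy replaces the goal with a state actually reached at the same
    or a later step of the episode (the "future" strategy) and recomputes
    the sparse reward. The hybrid action is carried over unchanged.
    """
    rng = rng or random.Random()
    out: List[Transition] = []
    for t, tr in enumerate(episode):
        out.append(tr)
        for _ in range(k):
            future = rng.randrange(t, len(episode))
            new_goal = episode[future].next_state
            r = sparse_reward(tr.next_state, new_goal, tol)
            out.append(
                Transition(tr.state, tr.action, tr.next_state, new_goal, r, r == 0.0)
            )
    return out
```

In a sketch like this, an episode that never reaches its original goal still yields some zero-reward transitions after relabeling, which gives an off-policy learner a usable signal without hand-shaped rewards.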
Original language: English
Title of host publication: AIAA SCITECH 2022 Forum
Number of pages: 8
ISBN (Electronic): 978-1-62410-631-6
Publication status: Published - 2022
Event: AIAA SCITECH 2022 Forum - virtual event
Duration: 3 Jan 2022 to 7 Jan 2022

Conference

Conference: AIAA SCITECH 2022 Forum
Period: 3/01/22 to 7/01/22
