Determining Optimal Conflict Avoidance Manoeuvres At High Densities With Reinforcement Learning

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific


The use of drones for applications such as package delivery in an urban setting would result in traffic densities that are orders of magnitude higher than any observed in manned aviation. Current geometric resolution models have proven to be very efficient at relatively moderate densities. However, at higher densities, performance is hindered by the unpredictable emergent behaviour of neighbouring aircraft. In this paper, we use a hybrid solution that combines existing geometric resolution approaches with reinforcement learning (RL), directed at improving conflict resolution performance at high densities. We use a Deep Deterministic Policy Gradient (DDPG) model to improve the behaviour of the Modified Voltage Potential (MVP) geometric conflict resolution method. By default, the MVP method generates avoidance manoeuvres of a geometrically defined type, using a fixed look-ahead time. In the current study, we instead aim to use RL to determine the values for these variables, based on intruder position and traffic density. The analysis in this paper specifically addresses the difficulty of training algorithms in a cooperative multi-agent case to converge to optimal values. We demonstrate that finding the right representation of state and rewards in a nonstationary environment is non-trivial and strongly influences the learning process. Finally, we show that a variation of resolution manoeuvres can improve the safety of several scenarios at high traffic densities.
Original language: English
Title of host publication: 10th SESAR Innovation Days
Number of pages: 8
Publication status: Published - 2020
Event: 10th SESAR Innovation Days - Virtual/online event due to COVID-19
Duration: 7 Dec 2020 to 10 Dec 2020


Conference: 10th SESAR Innovation Days


Keywords

  • Conflict Detection and Resolution (CD&R)
  • Reinforcement Learning (RL)
  • Deep Deterministic Policy Gradient (DDPG)
  • U-Space
  • Unmanned Traffic Management (UTM)
  • Modified Voltage Potential (MVP)
  • BlueSky
  • ATC Simulator

