Policy Analysis of Safe Vertical Manoeuvring using Reinforcement Learning: Identifying when to Act and when to stay Idle

D.J. Groot; M.J. Ribeiro; Joost Ellerbroek; J.M. Hoekstra

Research output: Contribution to conference › Paper › peer-review

Abstract

The number of unmanned aircraft operating in the airspace is expected to grow exponentially during the next decades. This will likely lead to traffic densities that are higher than those currently observed in civil and general aviation, and might require both a different airspace structure compared to conventional aviation, as well as different conflict resolution methods. One of the main disadvantages of analytical conflict resolution methods, in high-traffic density scenarios, is that they can cause instabilities of the airspace due to a domino effect of secondary conflicts. Therefore, many studies have also investigated other methods of conflict resolution, such as Deep Reinforcement Learning, which have shown positive results, but tend to be hard to explain due to their black-box nature. This paper investigates if it is possible to explain the behaviour of a Soft Actor-Critic model, trained for resolving vertical conflicts in a layered urban airspace, by interpreting the policy through a heat map of the selected actions. It was found that the model actively changes its policy depending on the degrees of freedom and has a tendency to adopt preventive behaviour on top of conflict resolution. This behaviour can be directly linked to a decrease in secondary conflicts when compared to analytical methods and can potentially be incorporated into these methods to improve them while maintaining explainability.

Original language	English
Number of pages	9
Publication status	Published - 2023
Event	13th SESAR Innovation Days - Sevilla, Spain
Duration	27 Nov 2023 → 30 Nov 2023
Conference number	13

Conference

Conference	13th SESAR Innovation Days
Country/Territory	Spain
City	Sevilla
Period	27/11/23 → 30/11/23

Keywords

Air Traffic Control
Unmanned Traffic Management
Reinforcement Learning
Policy Analysis
Artificial Intelligence
Explainable AI

Access to Document

SIDS_2023_Policy_Analysis_final (Final published version, 2.83 MB)

Cite this

@conference{e5237f77f0e94ec5a8f4102d03550784,
title = "Policy Analysis of Safe Vertical Manoeuvring using Reinforcement Learning: Identifying when to Act and when to stay Idle",
abstract = "The number of unmanned aircraft operating in the airspace is expected to grow exponentially during the next decades. This will likely lead to traffic densities that are higher than those currently observed in civil and general aviation, and might require both a different airspace structure compared to conventional aviation, as well as different conflict resolution methods. One of the main disadvantages of analytical conflict resolution methods, in high-traffic density scenarios, is that they can cause instabilities of the airspace due to a domino effect of secondary conflicts. Therefore, many studies have also investigated other methods of conflict resolution, such as Deep Reinforcement Learning, which have shown positive results, but tend to be hard to explain due to their black-box nature. This paper investigates if it is possible to explain the behaviour of a Soft Actor-Critic model, trained for resolving vertical conflicts in a layered urban airspace, by interpreting the policy through a heat map of the selected actions. It was found that the model actively changes its policy depending on the degrees of freedom and has a tendency to adopt preventive behaviour on top of conflict resolution. This behaviour can be directly linked to a decrease in secondary conflicts when compared to analytical methods and can potentially be incorporated into these methods to improve them while maintaining explainability.",
keywords = "Air Traffic Control, Unmanned Traffic Management, Reinforcement Learning, Policy Analysis, Artificial Intelligence, Explainable AI",
author = "D.J. Groot and M.J. Ribeiro and Joost Ellerbroek and J.M. Hoekstra",
year = "2023",
language = "English",
note = "13th SESAR Innovation Days ; Conference date: 27-11-2023 Through 30-11-2023",
}

TY - CONF
T1 - Policy Analysis of Safe Vertical Manoeuvring using Reinforcement Learning: Identifying when to Act and when to stay Idle
AU - Groot, D.J.
AU - Ribeiro, M.J.
AU - Ellerbroek, Joost
AU - Hoekstra, J.M.
N1 - Conference code: 13
PY - 2023
Y1 - 2023
AB - The number of unmanned aircraft operating in the airspace is expected to grow exponentially during the next decades. This will likely lead to traffic densities that are higher than those currently observed in civil and general aviation, and might require both a different airspace structure compared to conventional aviation, as well as different conflict resolution methods. One of the main disadvantages of analytical conflict resolution methods, in high-traffic density scenarios, is that they can cause instabilities of the airspace due to a domino effect of secondary conflicts. Therefore, many studies have also investigated other methods of conflict resolution, such as Deep Reinforcement Learning, which have shown positive results, but tend to be hard to explain due to their black-box nature. This paper investigates if it is possible to explain the behaviour of a Soft Actor-Critic model, trained for resolving vertical conflicts in a layered urban airspace, by interpreting the policy through a heat map of the selected actions. It was found that the model actively changes its policy depending on the degrees of freedom and has a tendency to adopt preventive behaviour on top of conflict resolution. This behaviour can be directly linked to a decrease in secondary conflicts when compared to analytical methods and can potentially be incorporated into these methods to improve them while maintaining explainability.
KW - Air Traffic Control
KW - Unmanned Traffic Management
KW - Reinforcement Learning
KW - Policy Analysis
KW - Artificial Intelligence
KW - Explainable AI
M3 - Paper
T2 - 13th SESAR Innovation Days
Y2 - 27 November 2023 through 30 November 2023
ER -
