TY - JOUR
T1 - Learning-based resilience guarantee for multi-UAV collaborative QoS management
AU - Bai, Chengchao
AU - Yan, Peng
AU - Yu, Xiaoqiang
AU - Guo, Jifeng
PY - 2022
Y1 - 2022
N2 - Unmanned and intelligent technologies are the future development trend in the business field, and they are of great significance for analyzing and characterizing massive interactive data. In particular, during major epidemics or disasters, providing business services safely and securely is crucial. Specifically, providing users with resilient, guaranteed communication services is a challenging business task when communication facilities are damaged. Unmanned aerial vehicles (UAVs), with flexible deployment and high maneuverability, can serve as aerial base stations (BSs) to establish emergency networks. However, controlling multiple UAVs to provide efficient and fair communication quality of service (QoS) to users is challenging because of their limited communication service capabilities. In this paper, we propose a learning-based resilience guarantee framework for multi-UAV collaborative QoS management. We formulate the problem as a partially observable Markov decision process and solve it with proximal policy optimization (PPO), a policy-based deep reinforcement learning method. A centralized training and decentralized execution paradigm is used, in which the experience collected by all UAVs is used to train a shared control policy, and each UAV takes actions based on the partial environment information it observes. In addition, the reward function is designed to account for both the average and the variance of the communication QoS across all users. Extensive simulations are conducted for performance evaluation. The simulation results indicate that (1) the trained policies can adapt to different scenarios and provide resilient, guaranteed communication QoS to users, (2) increasing the number of UAVs can compensate for limited per-UAV service capability, and (3) when UAVs have local communication service capabilities, policies trained with PPO outperform policies trained with other algorithms.
KW - Communication service
KW - Deep reinforcement learning
KW - Multi-UAV
KW - QoS-aware
KW - System resilience
KW - Unmanned business
UR - http://www.scopus.com/inward/record.url?scp=85116907698&partnerID=8YFLogxK
U2 - 10.1016/j.patcog.2021.108166
DO - 10.1016/j.patcog.2021.108166
M3 - Article
AN - SCOPUS:85116907698
VL - 122
JO - Pattern Recognition
JF - Pattern Recognition
SN - 0031-3203
M1 - 108166
ER -