Deep Reinforcement Learning for Orchestrating Cost-Aware Reconfigurations of vRANs

Fahri Wisnu Murti; Samad Ali; George Iosifidis; Matti Latva-aho

doi:10.1109/TNSM.2023.3292713

Networked Systems

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Virtualized Radio Access Networks (vRANs) are fully configurable and can be implemented at low cost over commodity platforms, enabling flexible network management. In this paper, a novel vRAN reconfiguration problem is formulated to jointly reconfigure the functional splits of the base stations (BSs), the locations of the virtualized central units (vCUs) and distributed units (vDUs), their resources, and the routing for each BS data flow. The objective is to minimize the long-term total network operation cost while adapting to varying traffic demands and resource availability. As a first step, testbed measurements are performed to study the relationship between traffic demands and computing resources; this relationship exhibits high variance and depends on the platform and its load. Consequently, finding a perfect model of the underlying system is non-trivial. Therefore, to solve the proposed problem, a deep reinforcement learning (RL)-based framework is developed using model-free RL approaches. Moreover, the problem involves multiple BSs sharing the same resources, which results in a multi-dimensional discrete action space and a combinatorial number of possible actions. To overcome this curse of dimensionality, an action branching architecture, which is an action decomposition method with a shared decision module followed by a neural network, is combined with the Dueling Double Deep Q-network (D3QN) algorithm. Simulations are carried out using an O-RAN compliant model and real traces from the testbed. Our numerical results show that the proposed framework successfully learns the optimal policy that adaptively selects the vRAN configurations, and that its learning convergence can be further expedited through transfer learning, even across different vRAN systems. It also offers significant cost savings of up to 59% compared with a static benchmark, 35% compared with Deep Deterministic Policy Gradient with discretization, and 76% compared with non-branching D3QN.
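The action-branching idea in the abstract can be illustrated with a minimal sketch. It assumes the dueling aggregation commonly used in branching Q-networks, where each branch's Q-value is the shared state value plus that branch's mean-centred advantage. The number of branches, the choices per branch, and all numeric values are hypothetical and not taken from the paper; the shared network is omitted and its outputs are hard-coded.

```python
# Illustrative sketch only: hypothetical sizes and values, not the paper's model.

def count_joint_actions(num_branches, choices_per_branch):
    """Outputs needed if every joint action gets its own Q-value."""
    return choices_per_branch ** num_branches


def count_branched_outputs(num_branches, choices_per_branch):
    """Outputs needed if each action dimension gets its own branch."""
    return num_branches * choices_per_branch


def branch_q_values(state_value, advantages):
    """Dueling aggregation per branch: Q_d(a) = V + A_d(a) - mean(A_d)."""
    q_per_branch = []
    for adv in advantages:
        mean_adv = sum(adv) / len(adv)
        q_per_branch.append([state_value + a - mean_adv for a in adv])
    return q_per_branch


def greedy_joint_action(q_per_branch):
    """Pick the best sub-action independently in every branch."""
    return [max(range(len(q)), key=q.__getitem__) for q in q_per_branch]


# Hypothetical example: 3 BSs (branches), 4 candidate configurations each.
state_value = 1.0
advantages = [
    [0.0, 2.0, 1.0, 1.0],  # branch for BS 0
    [0.5, 0.5, 2.5, 0.5],  # branch for BS 1
    [4.0, 0.0, 0.0, 0.0],  # branch for BS 2
]

q_values = branch_q_values(state_value, advantages)
print(greedy_joint_action(q_values))   # [1, 2, 0]
print(count_joint_actions(3, 4))       # 64
print(count_branched_outputs(3, 4))    # 12
```

With these assumed sizes, a non-branching network would need 64 outputs while the branched one needs 12, which is the linear-versus-combinatorial growth that motivates the decomposition.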

Original language	English
Pages (from-to)	200-216
Number of pages	17
Journal	IEEE Transactions on Network and Service Management
Volume	21
Issue number	1
DOIs	https://doi.org/10.1109/TNSM.2023.3292713
Publication status	Published - 2024

Keywords

action branching
Computational modeling
Computer architecture
Costs
D3QN
Data models
deep reinforcement learning
Load modeling
network virtualization
Neural networks
O-RAN
orchestration
Radio access networks (RANs)
Routing

Access to Document

10.1109/TNSM.2023.3292713

Final published version, 3.16 MB. Licence: CC BY

Cite this

@article{10311077230446cdb9e20eb920b2295f,

title = "Deep Reinforcement Learning for Orchestrating Cost-Aware Reconfigurations of vRANs",

abstract = "Virtualized Radio Access Networks (vRANs) are fully configurable and can be implemented at a low cost over commodity platforms to enable network management flexibility. In this paper, a novel vRAN reconfiguration problem is formulated to jointly reconfigure the functional splits of the base stations (BSs), locations of the virtualized central units (vCUs) and distributed units (vDUs), their resources, and the routing for each BS data flow. The objective is to minimize the long-term total network operation cost while adapting to the varying traffic demands and resource availability. In the first step, testbed measurements are performed to study the relationship between the traffic demands and computing resources, which reveals high variance and depends on the platform and its load. Consequently, finding the perfect model of the underlying system is non-trivial. Therefore, to solve the proposed problem, a deep reinforcement learning (RL)-based framework is proposed and developed using model-free RL approaches. Moreover, the problem consists of multiple BSs sharing the same resources, which results in a multi-dimensional discrete action space and leads to a combinatorial number of possible actions. To overcome this curse of dimensionality, action branching architecture, which is an action decomposition method with a shared decision module followed by neural network is combined with Dueling Double Deep Q-network (D3QN) algorithm. Simulations are carried out using an O-RAN compliant model and real traces of the testbed. Our numerical results show that the proposed framework successfully learns the optimal policy that adaptively selects the vRAN configurations, where its learning convergence can be further expedited through transfer learning even in different vRAN systems. It also offers significant cost savings by up to 59% of a static benchmark, 35% of Deep Deterministic Policy Gradient with discretization, and 76% of non-branching D3QN.",

keywords = "action branching, Computational modeling, Computer architecture, Costs, D3QN, Data models, deep reinforcement learning, Load modeling, network virtualization, Neural networks, O-RAN, orchestration, Radio access networks (RANs), Routing",

author = "Murti, {Fahri Wisnu} and Samad Ali and George Iosifidis and Matti Latva-aho",

year = "2024",

doi = "10.1109/TNSM.2023.3292713",

language = "English",

volume = "21",

pages = "200--216",

journal = "IEEE Transactions on Network and Service Management",

issn = "1932-4537",

publisher = "Institute of Electrical and Electronics Engineers (IEEE)",

number = "1",

}

TY - JOUR

T1 - Deep Reinforcement Learning for Orchestrating Cost-Aware Reconfigurations of vRANs

AU - Murti, Fahri Wisnu

AU - Ali, Samad

AU - Iosifidis, George

AU - Latva-aho, Matti

PY - 2024

Y1 - 2024

N2 - Virtualized Radio Access Networks (vRANs) are fully configurable and can be implemented at a low cost over commodity platforms to enable network management flexibility. In this paper, a novel vRAN reconfiguration problem is formulated to jointly reconfigure the functional splits of the base stations (BSs), locations of the virtualized central units (vCUs) and distributed units (vDUs), their resources, and the routing for each BS data flow. The objective is to minimize the long-term total network operation cost while adapting to the varying traffic demands and resource availability. In the first step, testbed measurements are performed to study the relationship between the traffic demands and computing resources, which reveals high variance and depends on the platform and its load. Consequently, finding the perfect model of the underlying system is non-trivial. Therefore, to solve the proposed problem, a deep reinforcement learning (RL)-based framework is proposed and developed using model-free RL approaches. Moreover, the problem consists of multiple BSs sharing the same resources, which results in a multi-dimensional discrete action space and leads to a combinatorial number of possible actions. To overcome this curse of dimensionality, action branching architecture, which is an action decomposition method with a shared decision module followed by neural network is combined with Dueling Double Deep Q-network (D3QN) algorithm. Simulations are carried out using an O-RAN compliant model and real traces of the testbed. Our numerical results show that the proposed framework successfully learns the optimal policy that adaptively selects the vRAN configurations, where its learning convergence can be further expedited through transfer learning even in different vRAN systems. It also offers significant cost savings by up to 59% of a static benchmark, 35% of Deep Deterministic Policy Gradient with discretization, and 76% of non-branching D3QN.

AB - Virtualized Radio Access Networks (vRANs) are fully configurable and can be implemented at a low cost over commodity platforms to enable network management flexibility. In this paper, a novel vRAN reconfiguration problem is formulated to jointly reconfigure the functional splits of the base stations (BSs), locations of the virtualized central units (vCUs) and distributed units (vDUs), their resources, and the routing for each BS data flow. The objective is to minimize the long-term total network operation cost while adapting to the varying traffic demands and resource availability. In the first step, testbed measurements are performed to study the relationship between the traffic demands and computing resources, which reveals high variance and depends on the platform and its load. Consequently, finding the perfect model of the underlying system is non-trivial. Therefore, to solve the proposed problem, a deep reinforcement learning (RL)-based framework is proposed and developed using model-free RL approaches. Moreover, the problem consists of multiple BSs sharing the same resources, which results in a multi-dimensional discrete action space and leads to a combinatorial number of possible actions. To overcome this curse of dimensionality, action branching architecture, which is an action decomposition method with a shared decision module followed by neural network is combined with Dueling Double Deep Q-network (D3QN) algorithm. Simulations are carried out using an O-RAN compliant model and real traces of the testbed. Our numerical results show that the proposed framework successfully learns the optimal policy that adaptively selects the vRAN configurations, where its learning convergence can be further expedited through transfer learning even in different vRAN systems. It also offers significant cost savings by up to 59% of a static benchmark, 35% of Deep Deterministic Policy Gradient with discretization, and 76% of non-branching D3QN.

KW - action branching

KW - Computational modeling

KW - Computer architecture

KW - Costs

KW - D3QN

KW - Data models

KW - deep reinforcement learning

KW - Load modeling

KW - network virtualization

KW - Neural networks

KW - O-RAN

KW - orchestration

KW - Radio access networks (RANs)

KW - Routing

UR - http://www.scopus.com/inward/record.url?scp=85164451510&partnerID=8YFLogxK

U2 - 10.1109/TNSM.2023.3292713

DO - 10.1109/TNSM.2023.3292713

M3 - Article

AN - SCOPUS:85164451510

SN - 1932-4537

VL - 21

SP - 200

EP - 216

JO - IEEE Transactions on Network and Service Management

JF - IEEE Transactions on Network and Service Management

IS - 1

ER -
