Structural integrity management via hierarchical resource allocation and continuous-control reinforcement learning

Charalampos Andriotis; Ziead Metwally

Architectural Technology

Research output: Contribution to conference › Paper › peer-review

Abstract

Maintenance planning of engineering systems is typically posed as a discrete stochastic optimal control problem, as it refers to determining a series of distinct interventions that maintain structural integrity. Advanced algorithmic schemes within the joint framework of Partially Observable Markov Decision Processes (POMDPs) and multi-agent Deep Reinforcement Learning (DRL) have recently been able to approximate global optima well for this complex problem, outperforming existing time- and condition-based decision strategies. Integral to their success is the hypothesis that system components represent individual agents that form cooperative policies to minimize a central life-cycle cost. Thereby, the policy output scales linearly with the number of components, alleviating the curse of dimensionality related to combinatorial choices. State complexity and long-term optimality are handled efficiently via deep learning and POMDP principles, respectively. However, the efficiency of multi-agent coordination can fade as the number of agents increases. To this end, we propose a new formulation: we pose the problem as a continuous-control dynamic resource allocation problem, combining hierarchical DRL and mixed-integer programming. Moving from flat decentralized to hierarchical multi-agent decompositions allows us to further improve policy output scalability. The new Adaptive Knapsack Hierarchical Resource Allocator (AK-HRA) DRL architecture distributes available resources within the system, creating local, independently solvable, multi-choice knapsack optimization problems. By design, AK-HRA allows decision-makers to inscribe known hierarchical structures and local decision rules in their architectures, thereby enhancing control and interpretability over the solution space. The efficacy of the new approach is demonstrated in a multi-component reliability system subject to stochastic deterioration.
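
To make the "local, independently solvable, multi-choice knapsack optimization problems" mentioned in the abstract more concrete, the following is a minimal sketch of one such local problem: each component must receive exactly one intervention, and the total cost must stay within the resources allocated to that group. This is not the authors' implementation. The abstract refers to mixed-integer programming; the sketch swaps in a plain dynamic program over integer budgets, and the function name `solve_mckp`, the option lists, and all costs and benefits are hypothetical values chosen for illustration.

```python
from typing import List, Tuple

# One option = (resource_cost, benefit). All numbers below are illustrative
# assumptions, not values from the paper.
Option = Tuple[int, float]


def solve_mckp(components: List[List[Option]], budget: int) -> Tuple[float, List[int]]:
    """Multiple-choice knapsack: pick exactly one option per component so that
    total cost <= budget and total benefit is maximal.

    Returns (best_benefit, chosen_option_index_per_component).
    """
    neg_inf = float("-inf")
    # best[b] = best benefit with total cost exactly b over the components seen so far
    best = [neg_inf] * (budget + 1)
    best[0] = 0.0
    picks: List[List[int]] = []  # picks[i][b] = option chosen for component i at cost b

    for options in components:
        new_best = [neg_inf] * (budget + 1)
        pick = [-1] * (budget + 1)
        for b in range(budget + 1):
            for j, (cost, benefit) in enumerate(options):
                if cost <= b and best[b - cost] != neg_inf:
                    candidate = best[b - cost] + benefit
                    if candidate > new_best[b]:
                        new_best[b] = candidate
                        pick[b] = j
        best = new_best
        picks.append(pick)

    # Choose the spent budget with the highest benefit, then backtrack.
    b = max(range(budget + 1), key=lambda k: best[k])
    if best[b] == neg_inf:
        raise ValueError("no feasible assignment within budget")
    total = best[b]
    chosen: List[int] = []
    for i in range(len(components) - 1, -1, -1):
        j = picks[i][b]
        chosen.append(j)
        b -= components[i][j][0]
    chosen.reverse()
    return total, chosen


# Hypothetical group of three components; options are
# index 0 = do nothing, 1 = repair, 2 = replace.
group = [
    [(0, 0.0), (2, 3.0), (5, 6.0)],
    [(0, 0.0), (2, 1.0), (5, 7.0)],
    [(0, 0.0), (2, 2.5), (5, 4.0)],
]
benefit, actions = solve_mckp(group, budget=7)
print(benefit, actions)
```

In a hierarchical scheme of the kind the abstract describes, a higher-level allocator would presumably set the `budget` passed to each group, so that each local problem can be solved without reference to the others. How AK-HRA actually parameterizes and solves these problems is specified in the paper, not here.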

Original language	English
Number of pages	8
Publication status	Published - 2023
Event	14th International Conference on Applications of Statistics and Probability in Civil Engineering 2023, Trinity College Dublin, Dublin, Ireland
Duration	9 Jul 2023 → 13 Jul 2023
Event website	https://icasp14.com/

Conference

Conference	14th International Conference on Applications of Statistics and Probability in Civil Engineering 2023
Abbreviated title	ICASP14
Country/Territory	Ireland
City	Dublin
Period	9/07/23 → 13/07/23
Internet address	https://icasp14.com/

Access to Document

Structural integrity management via hierarchical resource allocation and continuous-control reinforcement learning: Final published version, 1.08 MB, Licence: CC BY-NC-SA

http://hdl.handle.net/2262/103609

Cite this

@conference{21dc92ff01b249979405487cec69d4c0,

title = "Structural integrity management via hierarchical resource allocation and continuous-control reinforcement learning",

author = "Charalampos Andriotis and Ziead Metwally",

year = "2023",

language = "English",

note = "14th International Conference on Applications of Statistics and Probability in Civil Engineering 2023, ICASP14 ; Conference date: 09-07-2023 Through 13-07-2023",

url = "https://icasp14.com/",

}

Structural integrity management via hierarchical resource allocation and continuous-control reinforcement learning. / Andriotis, Charalampos ; Metwally, Ziead.
2023. Paper presented at 14th International Conference on Applications of Statistics and Probability in Civil Engineering 2023, Dublin, Ireland.

TY - CONF

T1 - Structural integrity management via hierarchical resource allocation and continuous-control reinforcement learning

AU - Andriotis, Charalampos

AU - Metwally, Ziead

PY - 2023

Y1 - 2023

M3 - Paper

T2 - 14th International Conference on Applications of Statistics and Probability in Civil Engineering 2023

Y2 - 9 July 2023 through 13 July 2023

ER -
