Reinforcement learning for train timetable rescheduling under perturbation: A general value-based approach

Pu Zhang, Lingyun Meng*, Yongqiu Zhu, Jianrui Miao, Xiaojie Luan, Zhengwen Liao

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

This paper proposes a value-based deep reinforcement learning approach for train timetable rescheduling under both disturbances and disruptions. A railway environment is constructed to simulate the problem as a Markov decision process, in which the optimization objective is integrated into the reward module and various constraints are incorporated into the conflict detection and avoidance module. To address the challenges of sparse rewards and a large action space with few legal actions, a value-based algorithm framework is proposed that selects actions efficiently and evaluates them effectively. Using the designed simulation and training procedures, the approach is tested on several disturbance and disruption cases based on a real-world instance (a Chinese high-speed railway corridor). Experimental results show that the proposed method obtains high-quality solutions within a reasonable computing time and outperforms handcrafted rules in terms of solution optimality. Furthermore, the proposed method exhibits promising generalization capabilities in homogeneous perturbation scenarios (disturbance scenarios and disruption scenarios that share either the same affected location and start time or the same affected location and disrupted duration).
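The abstract mentions a large action space in which only a few actions are legal. One common way to handle this in value-based methods is to restrict action selection to the legal subset before taking the maximum over estimated values. The sketch below illustrates that general idea only; it is not taken from the paper, and the function name `select_action`, the boolean `legal` mask, and the epsilon-greedy exploration rule are all assumptions made for illustration.

```python
import random
from typing import Sequence


def select_action(
    q_values: Sequence[float],
    legal: Sequence[bool],
    epsilon: float,
    rng: random.Random,
) -> int:
    """Epsilon-greedy choice restricted to legal actions (hypothetical helper).

    q_values[i] is the estimated value of action i; legal[i] says whether
    action i is currently allowed (e.g. it passes a conflict check).
    """
    legal_ids = [i for i, ok in enumerate(legal) if ok]
    if not legal_ids:
        raise ValueError("no legal action available")
    # Explore: pick uniformly among legal actions with probability epsilon.
    if rng.random() < epsilon:
        return rng.choice(legal_ids)
    # Exploit: pick the legal action with the highest estimated value.
    return max(legal_ids, key=lambda i: q_values[i])


# Action 0 has the highest value but is illegal, so the greedy choice
# (epsilon = 0) falls back to the best legal action.
q = [5.0, 1.0, 3.0, 2.0]
mask = [False, True, True, False]
best = select_action(q, mask, epsilon=0.0, rng=random.Random(0))
```

Masking before the maximum means illegal actions are never chosen, whatever their estimated values, so the value estimator does not have to learn to penalize them.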

Original language: English
Article number: 111867
Number of pages: 24
Journal: Computers and Industrial Engineering
Volume: 214
DOIs
Publication status: Published - 2026

Keywords

  • High-speed railway
  • Real-time railway traffic management
  • Reinforcement learning
  • Train timetable rescheduling
