Deep Deterministic Policy Gradient for High-Speed Train Trajectory Optimization

Lingbin Ning, Min Zhou^*, Zhuopu Hou, Rob M.P. Goverde, Fei Yue Wang, Hairong Dong

^*Corresponding author for this work

doi:10.1109/TITS.2021.3105380

Transport and Planning

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

This paper proposes a novel train trajectory optimization approach for high-speed railways. We restrict our attention to single train operation scenarios with different scheduled/rescheduled running times aiming at generating optimal train recommended trajectories in real time, which can ensure punctuality and energy efficiency of train operation. A learning-based approach deep deterministic policy gradient (DDPG) is designed to generate optimal train trajectories based on the offline training from the interaction between the agent and the trajectory simulation environment. An allocating running time and selecting operation modes (ARTSOM) algorithm is proposed to improve train punctuality and give a series of discrete operation modes (full traction, cruising, coasting, full braking), and thus to produce a feasible training set for DDPG, which can speed up the training process. Numerical experiments show that an optimized speed profile can be generated by DDPG within seconds on a realistic railway line. In addition, the results demonstrate the generalization ability of trained DDPG in solving TTO problems with different running times and line conditions.

Original language	English
Pages (from-to)	11562-11574
Number of pages	13
Journal	IEEE Transactions on Intelligent Transportation Systems
Volume	23
Issue number	8
DOIs	https://doi.org/10.1109/TITS.2021.3105380
Publication status	Published - 2022

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

deep deterministic policy gradient
energy efficiency
High-speed railway
train trajectory optimization

Access to Document

10.1109/TITS.2021.3105380

Deep_Deterministic_Policy_Gradient_for_High-Speed_Train_Trajectory_Optimization (Final published version, 3.59 MB)

Cite this

@article{ead36177821d4233a01471f860d662fd,

title = "Deep Deterministic Policy Gradient for High-Speed Train Trajectory Optimization",

abstract = "This paper proposes a novel train trajectory optimization approach for high-speed railways. We restrict our attention to single train operation scenarios with different scheduled/rescheduled running times aiming at generating optimal train recommended trajectories in real time, which can ensure punctuality and energy efficiency of train operation. A learning-based approach deep deterministic policy gradient (DDPG) is designed to generate optimal train trajectories based on the offline training from the interaction between the agent and the trajectory simulation environment. An allocating running time and selecting operation modes (ARTSOM) algorithm is proposed to improve train punctuality and give a series of discrete operation modes (full traction, cruising, coasting, full braking), and thus to produce a feasible training set for DDPG, which can speed up the training process. Numerical experiments show that an optimized speed profile can be generated by DDPG within seconds on a realistic railway line. In addition, the results demonstrate the generalization ability of trained DDPG in solving TTO problems with different running times and line conditions. ",

keywords = "deep deterministic policy gradient, energy efficiency, High-speed railway, train trajectory optimization",

author = "Lingbin Ning and Min Zhou and Zhuopu Hou and Goverde, {Rob M.P.} and Wang, {Fei Yue} and Hairong Dong",

note = "Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.",

year = "2022",

doi = "10.1109/TITS.2021.3105380",

language = "English",

volume = "23",

pages = "11562--11574",

journal = "IEEE Transactions on Intelligent Transportation Systems",

issn = "1524-9050",

publisher = "IEEE",

number = "8",

}

TY - JOUR

T1 - Deep Deterministic Policy Gradient for High-Speed Train Trajectory Optimization

AU - Ning, Lingbin

AU - Zhou, Min

AU - Hou, Zhuopu

AU - Goverde, Rob M.P.

AU - Wang, Fei Yue

AU - Dong, Hairong

N1 - Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

PY - 2022

Y1 - 2022

N2 - This paper proposes a novel train trajectory optimization approach for high-speed railways. We restrict our attention to single train operation scenarios with different scheduled/rescheduled running times aiming at generating optimal train recommended trajectories in real time, which can ensure punctuality and energy efficiency of train operation. A learning-based approach deep deterministic policy gradient (DDPG) is designed to generate optimal train trajectories based on the offline training from the interaction between the agent and the trajectory simulation environment. An allocating running time and selecting operation modes (ARTSOM) algorithm is proposed to improve train punctuality and give a series of discrete operation modes (full traction, cruising, coasting, full braking), and thus to produce a feasible training set for DDPG, which can speed up the training process. Numerical experiments show that an optimized speed profile can be generated by DDPG within seconds on a realistic railway line. In addition, the results demonstrate the generalization ability of trained DDPG in solving TTO problems with different running times and line conditions.

AB - This paper proposes a novel train trajectory optimization approach for high-speed railways. We restrict our attention to single train operation scenarios with different scheduled/rescheduled running times aiming at generating optimal train recommended trajectories in real time, which can ensure punctuality and energy efficiency of train operation. A learning-based approach deep deterministic policy gradient (DDPG) is designed to generate optimal train trajectories based on the offline training from the interaction between the agent and the trajectory simulation environment. An allocating running time and selecting operation modes (ARTSOM) algorithm is proposed to improve train punctuality and give a series of discrete operation modes (full traction, cruising, coasting, full braking), and thus to produce a feasible training set for DDPG, which can speed up the training process. Numerical experiments show that an optimized speed profile can be generated by DDPG within seconds on a realistic railway line. In addition, the results demonstrate the generalization ability of trained DDPG in solving TTO problems with different running times and line conditions.

KW - deep deterministic policy gradient

KW - energy efficiency

KW - High-speed railway

KW - train trajectory optimization

UR - http://www.scopus.com/inward/record.url?scp=85113842981&partnerID=8YFLogxK

U2 - 10.1109/TITS.2021.3105380

DO - 10.1109/TITS.2021.3105380

M3 - Article

AN - SCOPUS:85113842981

SN - 1524-9050

VL - 23

SP - 11562

EP - 11574

JO - IEEE Transactions on Intelligent Transportation Systems

JF - IEEE Transactions on Intelligent Transportation Systems

IS - 8

ER -
