Abstract
Global synchromodal transportation involves the movement of container shipments between inland terminals located on different continents using ships, barges, trains, trucks, or any combination of them through integrated planning at a network level. One of the challenges faced by global operators is the matching of accepted shipments with services in an integrated global synchromodal transport network with dynamic and stochastic travel times. The travel times of services are unknown and revealed dynamically during the execution of transport plans, but stochastic information about travel times is assumed to be available. Matching decisions can be updated before shipments arrive at their destination terminals. The objective of the problem is to maximize total profit, expressed as a combination of revenues, travel costs, transfer costs, storage costs, delay costs, and carbon tax over a given planning horizon. We propose a sequential decision process model to describe the problem. In order to address the curse of dimensionality, we develop a reinforcement learning approach to learn the value of matching a shipment with a service through simulations. Specifically, we adopt the Q-learning algorithm to update value function estimates and use the ϵ-greedy strategy to balance exploitation and exploration. Online decisions are created based on the estimated value functions. The performance of the reinforcement learning approach is evaluated in comparison to a myopic approach that does not consider uncertainties and a stochastic approach that sets chance constraints on feasible transshipment under a rolling horizon framework.
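
The combination of Q-learning and the ϵ-greedy strategy mentioned above can be illustrated with a minimal tabular sketch. It is not the authors' model: the single state `"origin"`, the two services, the profit values, the 50% on-time probability, and the settings `ALPHA`, `GAMMA`, and `EPSILON` are all hypothetical and chosen only to show how simulated outcomes update the estimated value of matching a shipment with a service.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.01, 0.95, 0.2  # hypothetical learning settings
SERVICES = ["barge", "truck"]            # hypothetical candidate services

def simulate_profit(service, rng):
    """Toy simulator: the barge is cheaper but its travel time is uncertain."""
    if service == "barge":
        # On time half the time (profit 10); otherwise a delay cost applies (profit -2).
        return 10.0 if rng.random() < 0.5 else -2.0
    return 6.0  # the truck is costlier but always on time

def choose(q, state, rng):
    """Epsilon-greedy: explore with probability EPSILON, else exploit."""
    if rng.random() < EPSILON:
        return rng.choice(SERVICES)
    return max(SERVICES, key=lambda s: q[(state, s)])

def q_update(q, state, action, reward, next_state=None):
    """Q-learning update toward reward + GAMMA * max_a Q(next_state, a)."""
    future = 0.0 if next_state is None else max(q[(next_state, s)] for s in SERVICES)
    q[(state, action)] += ALPHA * (reward + GAMMA * future - q[(state, action)])

rng = random.Random(0)
q = defaultdict(float)  # Q-values keyed by (state, service), initialised to 0
for _ in range(20000):
    service = choose(q, "origin", rng)
    # Each episode ends after one matching decision, so next_state stays None.
    q_update(q, "origin", service, simulate_profit(service, rng))

best = max(SERVICES, key=lambda s: q[("origin", s)])
print(best, round(q[("origin", "truck")], 2))
```

In this toy setting the barge has an expected profit of 4 and the truck a certain profit of 6, so the learned values should favour the truck even though the barge looks better whenever it happens to arrive on time. A multi-stage version would pass a real `next_state` to `q_update` so that later transshipment decisions feed back into earlier ones.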
Original language | English |
---|---|
Number of pages | 32 |
Journal | Annals of Operations Research |
DOIs | |
Publication status | Published - 2022 |
Bibliographical note
Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Keywords
- Dynamic and stochastic travel times
- Global synchromodal shipment matching
- Q-learning
- Reinforcement learning
- Sequential decision process