On the Use of Machine Learning Approaches for Implicit Modeling of Cycling Route Choice: An Application of Machine Learning versus Path Sampling-Logit Model Framework

Ana Sobhani, S. Ali Haji Esmaeli, Ahmad Sobhani

Research output: Contribution to conference › Abstract › Scientific


Cycling has been recognized as one of the important travel modes in cities because of its societal and environmental benefits: alleviating traffic congestion, improving air quality, decreasing fuel consumption, improving public health, and providing an affordable mode of transport. Despite efforts and investments to increase bike use, the low growth rate in cycling has been a subject of investigation over the last decade. Researchers, policy makers and transportation agencies have invested extensive resources to identify the factors that influence cycling, as the cycling mode share accounted for only 1.3% of commuters in Canada and 1% of all trips in the U.S. in 2001 (e.g., National Household Traffic Survey of America 2013). To understand the underlying reasons for such low cycling levels, and to achieve a substantial mode shift, an understanding of cyclist needs and perceptions is required. Discrete route choice modeling has received attention in recent years as a way to investigate the travel behavior of cyclists. Route-based models such as C-Logit and Path-Size Logit (PSL) have been widely used for bicycle route choice analysis (Ben-Akiva & Bierlaire 1999) to evaluate the effects of attributes related to whole trip traces. These route choice models address only the similarities among the considered (sampled) set of routes. To account for both sampled and non-sampled alternatives (routes), the Expanded Path Size Logit (EPSL) model has been suggested, which uses a sampling approach to compare the chosen path with a set of alternative paths available to the cyclist. These efforts have led to a better understanding of how travelers select among generated alternative choice sets. However, despite the extensive application of discrete route choice modeling, few studies have applied machine learning techniques to identify the factors explaining travel decisions and to uncover the underlying decision rules.
Machine learning is the practice of bringing quantitative data to bear on decision making, by analyzing and visualizing them, and of predicting future outcomes by finding patterns in existing data. Machine Learning (ML) techniques use different algorithms to extract knowledge and information from large datasets. Decision trees and random forests are two popular, powerful, non-parametric ML methods that can predict future responses (here, cyclist route choice) within a black-box framework. One advantage of ML analytics is that these methods can efficiently handle complex data collected from different sources such as videos, pictures, surveys and text. The application of ML in various transportation domains has become popular recently (Wong et al., 2017). However, its application in travel behavior research is still limited mostly to analyzing observed movement patterns and making short-term travel demand predictions. As transportation scholars have noted, it is essential to use ML methods as data-oriented techniques for identifying the factors that explain travel decisions. Motivated by these arguments, this paper analyzes cyclist route choice by applying the Expanded Path Size Logit (EPSL) model together with the Metropolis-Hastings (MH) sampling algorithm. The results are compared with findings obtained from ML techniques; to the best of our knowledge, the two approaches have not been used together for cycling route choice analysis. Our study uses data from a large-scale GPS-based travel survey, as well as Toronto’s geographic information system (GIS) road network databases, to model Torontonians’ bicycle route choice. The GPS bicycle trajectory data include valuable information such as trip purpose, date, time and season. The decision tree and random forest methods perform a non-parametric analysis of the data collected from the GPS-based travel survey to predict cyclist route choice from the given attributes.
The MH path-sampling algorithm is applied to the road network to generate the choice set within a multivariate route choice framework, while the EPSL model considers the effects of various attributes on cyclist route choices by accounting for correlations between the sampled and non-sampled alternatives.
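To make the role of the path-size term concrete, here is a minimal sketch of the basic Path-Size Logit probability on a toy choice set. It is an illustration only, not the authors' implementation: the function names, the toy network and the coefficient `beta_ps` are assumptions, and the sampling correction that distinguishes the expanded (EPSL) variant is not shown.

```python
import math

def path_size_factors(paths, link_lengths):
    """Path-size factor for each loop-free path in the choice set.

    paths: list of paths, each a list of link ids (hypothetical toy input).
    link_lengths: dict mapping link id to link length.
    """
    # Count how many paths in the choice set use each link.
    usage = {}
    for path in paths:
        for link in set(path):
            usage[link] = usage.get(link, 0) + 1

    factors = []
    for path in paths:
        total = sum(link_lengths[a] for a in path)
        # Each link contributes its length share, discounted by how many
        # paths in the choice set share that link.
        factors.append(sum(link_lengths[a] / total / usage[a] for a in path))
    return factors

def psl_probabilities(utilities, ps_factors, beta_ps=1.0):
    """Logit probabilities with utilities adjusted by beta_ps * ln(path size)."""
    scores = [v + beta_ps * math.log(ps) for v, ps in zip(utilities, ps_factors)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    norm = sum(weights)
    return [w / norm for w in weights]

# Toy choice set (assumed): routes 1 and 2 share link "a"; route 3 is disjoint.
link_lengths = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 2.0}
paths = [["a", "b"], ["a", "c"], ["d"]]
utilities = [-1.0, -1.0, -1.0]  # equal systematic utilities for illustration

ps = path_size_factors(paths, link_lengths)
probs = psl_probabilities(utilities, ps)
```

With equal utilities, the two overlapping routes each receive a path-size factor of 0.75 and a choice probability of about 0.3, while the disjoint route receives about 0.4. A plain multinomial logit would assign one third to each route, so the sketch shows how the path-size term penalizes overlapping alternatives.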
Original language: English
Number of pages: 1
Publication status: Published - 2018
Event: IATBR 2018: 15th International Conference on Travel Behaviour Research - Santa Barbara, United States
Duration: 15 Jul 2018 to 20 Jul 2018
Conference number: 15


Conference: IATBR 2018: 15th International Conference on Travel Behaviour Research
Abbreviated title: IATBR 2018
Country: United States
City: Santa Barbara