On the Use of Machine Learning Approaches for Implicit Modeling of Cycling Route Choice: An Application of Machine Learning versus Path Sampling-Logit Model Framework

Ana Sobhani; S. Ali Haji Esmaeli; Ahmad Sobhani

Abstract

Cycling is recognized as an important travel mode in cities because of its societal and environmental benefits: it alleviates traffic congestion, improves air quality, reduces fuel consumption, improves public health, and provides an affordable mode of transport. Despite efforts and investments to increase bike use, the low growth rate in cycling has been a subject of investigation over the last decade. Researchers, policy makers and transportation agencies have invested extensive resources to identify the factors that influence cycling, as the cycling mode share is only 1.3% of commuters in Canada and 1% of all trips in the U.S. in 2001 (e.g., National Household Traffic Survey of America 2013). Understanding the reasons for such low cycling levels, and achieving a substantial mode shift, requires an understanding of cyclists’ needs and perceptions. Discrete route choice modeling has received attention in recent years as a way to investigate the travel behavior of cyclists. Route-based models such as C-Logit and Path-Size Logit (PSL) have been widely used for bicycle route choice analysis (Ben-Akiva & Bierlaire 1999) to evaluate the effects of attributes related to whole trip traces. These route choice models address only the similarities among the considered (sampled) set of routes. To account for both sampled and non-sampled alternatives (routes), the Expanded Path Size Logit (EPSL) model has been suggested; it uses a sampling approach to compare the chosen path with a set of alternative paths available to the cyclist. All of these efforts lead to a better understanding of how travelers select among generated alternative choice sets. However, despite the intensive application of discrete route choice modeling, few studies have applied machine learning techniques to identify the factors explaining travel decisions and to uncover the underlying decision rules.
Machine learning is the practice of analyzing and visualizing quantitative data so that it bears on decision making, and of predicting future outcomes by finding patterns in existing data. Machine learning (ML) techniques use different algorithms to extract knowledge from large datasets. Decision trees and random forests are two popular, powerful, non-parametric ML methods that can predict future responses (here, cyclist route choice) within a black-box framework. An advantage of ML analytics is that these methods can efficiently handle complex data collected from different sources such as videos, pictures, surveys, and text. The application of ML in various transportation domains has recently become popular (Wong et al., 2017). However, its application in travel behavior research is still mostly limited to analyzing observed movement patterns and making short-term travel demand predictions. As transportation scholars have noted, it is essential to use ML methods as data-oriented techniques for identifying the factors that explain travel decisions. Motivated by these arguments, this paper analyzes cyclist route choice by applying the Expanded Path Size Logit (EPSL) model along with the Metropolis-Hastings (MH) sampling algorithm. The results are compared with findings obtained from ML techniques; to the best of our knowledge, the two approaches have not previously been used together for cycling route choice analysis. Our study uses data from a large-scale GPS-based travel survey, as well as Toronto’s geographic information system (GIS) road network databases, to model Torontonians’ bicycle route choice. The GPS bicycle trajectory data include valuable information such as trip purpose, date, time and season. The decision tree and random forest perform a non-parametric analysis of the data collected from the GPS-based travel survey to predict cyclist route choice from the given attributes.
The MH path-sampling algorithm is applied to the road network to generate the choice set for a multivariate route choice framework, while the EPSL model accounts for the effects of various attributes on cyclist route choice by including correlations between the sampled and non-sampled alternatives.
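
To make the path-size idea concrete, the following is a minimal Python sketch of the basic Path-Size Logit probability over a small, hand-made choice set. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy network, the equal utilities, and the coefficient `beta_ps = 1.0` are all hypothetical, and the expansion factors and sampling correction that distinguish the EPSL model are deliberately left out.

```python
import math


def path_size(route, choice_set, link_length):
    """Path-size term of one route.

    Each link contributes its share of the route length, divided by the
    number of routes in the choice set that use the same link, so routes
    that overlap heavily with others get a smaller value.
    """
    route_length = sum(link_length[a] for a in route)
    ps = 0.0
    for a in route:
        n_using = sum(1 for r in choice_set if a in r)
        ps += (link_length[a] / route_length) / n_using
    return ps


def psl_probabilities(choice_set, link_length, utilities, beta_ps=1.0):
    """Logit probabilities with utility V_i + beta_ps * ln(PS_i)."""
    scores = [
        v + beta_ps * math.log(path_size(r, choice_set, link_length))
        for r, v in zip(choice_set, utilities)
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    expo = [math.exp(s - top) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]


# Hypothetical toy network: routes 1 and 2 share link "a", route 3 is disjoint.
link_length = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 2.0}
choice_set = [("a", "b"), ("a", "c"), ("d",)]
utilities = [0.0, 0.0, 0.0]  # assumed equal systematic utilities

probs = psl_probabilities(choice_set, link_length, utilities)
```

With equal utilities, the two overlapping routes each receive a lower probability than the disjoint route, which is the correction for route similarity that the path-size term is meant to provide.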

Original language	English
Number of pages	1
Publication status	Published - 2018
Event	IATBR 2018: 15th International Conference on Travel Behaviour Research - Santa Barbara, United States
Duration	15 Jul 2018 → 20 Jul 2018
Conference number	15
Internet address	http://www.iatbr2018.org/

Conference

Conference	IATBR 2018: 15th International Conference on Travel Behaviour Research
Abbreviated title	IATBR 2018
Country/Territory	United States
City	Santa Barbara
Period	15/07/18 → 20/07/18
Internet address	http://www.iatbr2018.org/

Cite this

@conference{bc5aee2d6c14436bb64f4362613b8f9c,

title = "On the Use of Machine Learning Approaches for Implicit Modeling of Cycling Route Choice: An Application of Machine Learning versus Path Sampling-Logit Model Framework",

author = "Ana Sobhani and {Haji Esmaeli}, {S. Ali} and Ahmad Sobhani",

year = "2018",

language = "English",

note = "IATBR 2018: 15th International Conference on Travel Behaviour Research, IATBR 2018 ; Conference date: 15-07-2018 Through 20-07-2018",

url = "http://www.iatbr2018.org/",

}

On the Use of Machine Learning Approaches for Implicit Modeling of Cycling Route Choice: An Application of Machine Learning versus Path Sampling-Logit Model Framework. / Sobhani, Ana; Haji Esmaeli, S. Ali; Sobhani, Ahmad.
2018. Abstract from IATBR 2018: 15th International Conference on Travel Behaviour Research, Santa Barbara, United States.

Research output: Contribution to conference › Abstract › Scientific

TY - CONF

T1 - On the Use of Machine Learning Approaches for Implicit Modeling of Cycling Route Choice

T2 - IATBR 2018: 15th International Conference on Travel Behaviour Research

AU - Sobhani, Ana

AU - Haji Esmaeli, S. Ali

AU - Sobhani, Ahmad

N1 - Conference code: 15

PY - 2018

Y1 - 2018

UR - https://easychair.org/smart-program/SantaBarbara2018-IATBR2018/2018-07-16.html#talk:74617

M3 - Abstract

Y2 - 15 July 2018 through 20 July 2018

ER -
