Travel behavior analysis using Artificial Neural Networks: Striking the balance between model complexity and data requirements

Ahmad Alwosheel; Sander van Cranenburgh; Caspar Chorus

Abstract

Artificial Neural Networks (ANNs) have been known for a long time (e.g., McCulloch & Pitts, 1943; Rosenblatt, 1958) and have been used occasionally for the analysis of travel behavior for more than a decade (Hensher & Ton, 2000; Mohammadian & Miller, 2002). Only recently, however, have they become by far the most prominent and promising Artificial Intelligence (AI) model for the analysis of travel behavior in the context of large, emerging data sources (e.g., Karlaftis & Vlahogianni, 2011; Chen et al., 2016; van Cranenburgh & Alwosheel, 2017). This sharp increase in the popularity of ANNs as a tool for travel behavior analysis results from a range of improvements in ANNs’ capabilities, increases in computational power, and the rapidly increasing size and diversity of the data at the disposal of choice modelers. This paper aims to help pave the way for the further, effective deployment of ANNs for travel behavior analysis. It does so by highlighting and articulating an easily overlooked aspect of the ANN methodology that is crucial to its successful use in a travel choice modeling context. More specifically, we study the relation between i) the assumed characteristics of the Data Generating Process (DGP; in this case the assumed model of travel choice behavior, or decision rule), and ii) the amount of data required for meaningful, reliable travel choice analysis using ANNs. The core idea behind this relation is intuitive: if the DGP is relatively complex (e.g., highly non-linear), then a given ANN needs more data to generate a reliable representation of the DGP and, in turn, accurate predictions.
Despite or perhaps because of this straightforward intuition, choice modelers employing ANNs seem so far to have ignored important results from the AI literature that rigorously define this relation between the complexity of the DGP and the resulting data requirements in empirical analysis using ANNs. Concepts such as the Universal Approximation Theorem (Cybenko, 1989; Hornik et al., 1989), the notion of Probably Approximately Correct (PAC) learning (Valiant, 1984) and the Vapnik-Chervonenkis (VC) dimension (Vapnik & Chervonenkis, 1971) have helped AI researchers in various fields of application determine the required size of their dataset as a function of the assumed characteristics of the DGP. This paper aims to introduce these theoretical concepts and notions from the AI literature to the travel behavior research community, and to translate them into a form that travel choice modelers can readily use. By doing so, we aim to help travel behavior researchers who wish to use ANNs for discrete choice analysis when selecting data sets or collecting data. To focus our attention, we limit our discussion to two particular travel choice models as DGPs. The first is the well-known linear-additive MNL model based on utility maximization premises, which is the workhorse of discrete choice analysis and in many ways the least complex choice model available (Ben-Akiva & Lerman, 1985; Train, 2009). The second is the Random Regret Minimization (RRM) model (in MNL form), which is one of the most used behavioral alternatives to the canonical linear-in-parameters, utility-based MNL model (Chorus et al., 2008; van Cranenburgh et al., 2015). The regret function embedded in most RRM models is highly non-linear and includes attributes of all alternatives in the choice set. It is therefore a considerably more ‘complex’ choice model than its utility-based counterpart, which shows, for example, in considerably higher runtimes (Guevara et al., 2016).
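To make the difference in complexity between the two DGPs concrete, the following is a minimal sketch of both choice probability functions for a three-alternative, two-attribute choice set. It assumes the logarithmic regret form that is common in the RRM literature; the abstract does not state which RRM variant the authors use. The function names, attribute levels and taste parameters are hypothetical and chosen only for illustration.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def rum_probabilities(alternatives, betas):
    """Linear-additive RUM-MNL: utility depends only on an alternative's own attributes."""
    utilities = [sum(b * x for b, x in zip(betas, alt)) for alt in alternatives]
    return softmax(utilities)

def rrm_probabilities(alternatives, betas):
    """RRM-MNL with logarithmic regret (assumed form): each alternative is
    compared, attribute by attribute, with every other alternative in the set."""
    regrets = []
    for i, alt_i in enumerate(alternatives):
        regret = 0.0
        for j, alt_j in enumerate(alternatives):
            if j == i:
                continue
            for b, x_i, x_j in zip(betas, alt_i, alt_j):
                regret += math.log1p(math.exp(b * (x_j - x_i)))
        regrets.append(regret)
    # Lower regret means higher choice probability
    return softmax([-r for r in regrets])

# Hypothetical choice set: (travel time in hours, travel cost in euros)
choice_set = [(0.5, 6.0), (0.9, 3.0), (0.7, 4.0)]  # car, bus, train
betas = (-2.0, -0.3)  # illustrative taste parameters, not estimates

p_rum = rum_probabilities(choice_set, betas)
p_rrm = rrm_probabilities(choice_set, betas)
```

In the RUM function each utility is a weighted sum of one alternative's attributes, whereas the regret of an alternative passes every pairwise attribute difference through a non-linear function. This is the sense in which the RRM DGP is the more complex mapping for an ANN to approximate.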
The comparison between these two models (i.e., DGPs) therefore serves well to highlight how, in discrete choice analysis based on ANNs, data requirements follow from the characteristics (i.e., the level of complexity) of the DGP. Our study consists of two parts. In Part 1 we introduce the relevant concepts, notions and theorems that have been developed in the ANN literature to determine minimum sample sizes as a function of model complexity. We present these ideas in a notation and framework that connects directly with conventional modeling practice in the travel behavior research community. In Part 2 we apply these ideas in a concrete example, for illustration purposes and to establish face validity. More specifically, we show how Random Utility and Random Regret DGPs differ in their data requirements in the context of model estimation with ANNs. We conclude our study by deriving and discussing implications for researchers and practitioners in the field of travel behavior analysis. To give a flavor of the analyses performed in Part 2, we present some first results here. Our ‘empirical’ setting is a simple travel mode choice between three alternatives (car, bus, train) based on two attributes (travel time, travel cost). We generate two synthetic datasets containing mode choices: one uses a Random Utility Maximization (RUM) DGP (in MNL form) and the other a Random Regret Minimization (RRM) DGP (also in MNL form). Subsequently, using the concepts introduced from the ANN literature, we derive the theoretically expected minimum (training) sample size needed for an appropriately specified ANN to achieve a reliable representation of the DGP. We do this for the RUM and RRM DGPs and show that, in line with expectations, the theoretically required minimum (training) sample size is larger for the latter.
Finally, we verify this theoretical result by training ANNs for each DGP on increasingly large subsets of the synthetic data. As Figure 1 (RUM) and Figure 2 (RRM) show, the out-of-sample predictive ability of the corresponding ANNs, measured in terms of out-of-sample log-likelihood, increases sharply up to the theoretically identified minimum (training) sample size, after which marginal gains in model fit become notably smaller. This suggests that the theoretically established minimum sample size provides a reasonable indication of practical (training) sample size requirements for the two different DGPs.
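The verification step can be sketched as a learning curve: refit a model on growing subsets of the training data and score each fit by out-of-sample log-likelihood. The sketch below is not the authors' procedure. To stay self-contained it replaces the ANN with a deliberately trivial stand-in (smoothed choice shares that ignore the attributes), and all names, sample sizes and probabilities are hypothetical.

```python
import math
import random

def mean_log_likelihood(predict, observations):
    """Average log-probability a model assigns to the chosen alternatives."""
    return sum(math.log(predict(x)[y]) for x, y in observations) / len(observations)

def fit_share_model(training, n_alternatives=3):
    """Stand-in for ANN training: smoothed choice shares, ignoring attributes."""
    counts = [1.0] * n_alternatives  # add-one smoothing avoids log(0)
    for _, chosen in training:
        counts[chosen] += 1.0
    total = sum(counts)
    shares = [c / total for c in counts]
    return lambda x: shares

def learning_curve(fit, training, holdout, sizes):
    """Refit on growing prefixes of the training data, score each fit on the holdout."""
    return [(n, mean_log_likelihood(fit(training[:n]), holdout)) for n in sizes]

# Hypothetical synthetic data: attributes are unused by the stand-in model (None),
# and choices are drawn with fixed probabilities purely for illustration.
rng = random.Random(0)
data = [(None, rng.choices([0, 1, 2], weights=[0.5, 0.3, 0.2])[0]) for _ in range(3000)]
training, holdout = data[:2000], data[2000:]

curve = learning_curve(fit_share_model, training, holdout, sizes=[10, 100, 1000, 2000])
```

Replacing `fit_share_model` with a routine that trains an ANN on the attributes and returns its predicted choice probabilities would give a curve of the kind the abstract describes. The holdout set stays fixed so that scores at different training sizes are comparable.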

Original language	English
Number of pages	1
Publication status	Published - 2018
Event	IATBR 2018: 15th International Conference on Travel Behaviour Research - Santa Barbara, United States
Duration	15 Jul 2018 → 20 Jul 2018
Conference number	15
Internet address	http://www.iatbr2018.org/

Conference

Conference	IATBR 2018: 15th International Conference on Travel Behaviour Research
Abbreviated title	IATBR 2018
Country/Territory	United States
City	Santa Barbara
Period	15/07/18 → 20/07/18
Internet address	http://www.iatbr2018.org/

Cite this

@conference{ce4a47229fcd4be2bd3d129f1a4ad799,

title = "Travel behavior analysis using Artificial Neural Networks: Striking the balance between model complexity and data requirements",

author = "Ahmad Alwosheel and {van Cranenburgh}, Sander and Caspar Chorus",

year = "2018",

language = "English",

note = "IATBR 2018: 15th International Conference on Travel Behaviour Research, IATBR 2018 ; Conference date: 15-07-2018 Through 20-07-2018",

url = "http://www.iatbr2018.org/",

}

Travel behavior analysis using Artificial Neural Networks: Striking the balance between model complexity and data requirements. / Alwosheel, Ahmad; van Cranenburgh, Sander; Chorus, Caspar.
2018. Abstract from IATBR 2018: 15th International Conference on Travel Behaviour Research, Santa Barbara, United States.

Research output: Contribution to conference › Abstract › Scientific

TY - CONF

T1 - Travel behavior analysis using Artificial Neural Networks

T2 - IATBR 2018: 15th International Conference on Travel Behaviour Research

AU - Alwosheel, Ahmad

AU - van Cranenburgh, Sander

AU - Chorus, Caspar

N1 - Conference code: 15

PY - 2018

Y1 - 2018

UR - https://easychair.org/smart-program/SantaBarbara2018-IATBR2018/2018-07-16.html#talk:74617

M3 - Abstract

Y2 - 15 July 2018 through 20 July 2018

ER -
