A comprehensive joint econometric model of motor vehicle crashes arising from multiple sources of risk

Amir Pooyan Afghari; Simon Washington; Md Mazharul Haque; Zili Li

doi:10.1016/j.amar.2018.03.002

Amir Pooyan Afghari^*, Simon Washington, Md Mazharul Haque, Zili Li

^*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

33 Citations (Scopus)

Abstract

In the safety literature, motor vehicle crashes are modelled predominantly using single equation regression models, albeit with a variety of distributional assumptions and econometric enhancements. These models rely on a single linear predictive equation (additive on the log scale and multiplicative on the count scale) to specify the expected crash count conditioned on predictors. The models also specify the distribution of observations around the conditional mean, with common examples including the Poisson, Negative Binomial, and Conway-Maxwell distributions, among others. This mainstream probabilistic conceptualization (i.e. model) of motor vehicle crash causation assumes that crashes are well approximated by a single source of risk, wherein several contributing factors exert their collective, non-independent influences on the occurrence of crashes via a linear predictor. This study first postulates, and then demonstrates empirically, that crash occurrence may be more complex than can be adequately captured by a single equation regression model. The total crash count recorded at a transport network location (e.g. road segment) may arise from multiple simultaneous and inter-dependent sources of risk, rather than one. Each of these sources may uniquely contribute to the total observed crash count. For instance, one site's crash occurrence may be dominated by contributions from driver behaviour issues (e.g. speeding, impaired driving), while another site's crashes might arise predominantly from design and operational deficiencies such as deteriorating pavements and worn lane markings. Stated succinctly, this research hypothesises that the unobserved heterogeneity in the accumulation of motor vehicle crashes at transport network locations arises because multiple sources of risk, not one, better capture complexity in the crash occurrence process.
A stochastic multiple risk source methodological approach is developed to correspond with and empirically test this hypothesis. A joint econometric model with random parameters and instrumental variables demonstrates the applicability of the proposed theory and the corresponding methodological approach. The proposed model assumes that the complexity of crash occurrence is well approximated using three sources of risk comprising engineering, unobserved spatial, and driver behavioural factors. It is empirically tested using crash data from state-controlled roads in Queensland, Australia. Finally, the multiple risk source model is compared to the traditional single risk source model to assess the viability of the proposed approach based on the sample data. The multiple risk source model significantly outperformed the single risk source model in terms of predictive ability and goodness-of-fit measures. In addition, while the single risk source model predicts total crash counts for individual sites, the multiple risk source model predicts the proportion of the crash count contributed by each source of risk, and predicts crashes by risk source. The improvement in fit, combined with the theoretical appeal of a multiple risk source model to explain unobserved heterogeneity in crashes, suggests, at least for the sample used in the study, that the complexity in crash occurrence is better explained using multiple equation linear predictors. Further research should examine other datasets for repeatability and should further explore and test risk sources.
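
As an illustration only, the sketch below contrasts the two ideas described in the abstract. It is not the authors' specification: the function names, coefficients, and covariate values are hypothetical, and the three sources are simply summed, whereas the paper treats them as inter-dependent within a joint model. The first part shows why a predictor that is additive on the log scale is multiplicative on the count scale; the second computes the share of a site's expected crash count attributed to each of three assumed risk sources.

```python
import math


def log_linear_mean(intercept, coefs, covariates):
    """Expected count from one log-linear predictor: exp(b0 + sum(b_k * x_k))."""
    eta = intercept + sum(b * x for b, x in zip(coefs, covariates))
    return math.exp(eta)


def multiplicative_mean(intercept, coefs, covariates):
    """The same expected count written as a product of per-covariate factors."""
    mean = math.exp(intercept)
    for b, x in zip(coefs, covariates):
        mean *= math.exp(b * x)
    return mean


def risk_source_shares(source_means):
    """Proportion of the total expected count contributed by each source."""
    total = sum(source_means.values())
    return {name: mean / total for name, mean in source_means.items()}


# Hypothetical site: made-up coefficients and covariates for each assumed source.
site_sources = {
    "engineering": log_linear_mean(-1.0, [0.30], [2.0]),
    "spatial": log_linear_mean(-1.5, [0.20], [1.0]),
    "behavioural": log_linear_mean(-0.5, [0.40], [1.5]),
}

# Simplifying assumption: the site's expected total is the sum of the sources.
total_mean = sum(site_sources.values())
shares = risk_source_shares(site_sources)

if __name__ == "__main__":
    for name, share in shares.items():
        print(f"{name}: expected crashes {site_sources[name]:.3f}, share {share:.1%}")
    print(f"total expected crashes: {total_mean:.3f}")
```

A single risk source model would report only the total for a site; the per-source shares are the additional quantity the abstract attributes to the multiple risk source approach.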

Original language	English
Pages (from-to)	1-14
Number of pages	14
Journal	Analytic Methods in Accident Research
Volume	18
DOIs	https://doi.org/10.1016/j.amar.2018.03.002
Publication status	Published - Jun 2018
Externally published	Yes

Keywords

Crash causation mechanism
Data generating process
Instrumental variable
Joint model
Random parameters model
Structural equation model

Cite this

@article{26c3e0c3f53d4a74aa2fe171737c8580,

title = "A comprehensive joint econometric model of motor vehicle crashes arising from multiple sources of risk",

keywords = "Crash causation mechanism, Data generating process, Instrumental variable, Joint model, Random parameters model, Structural equation model",

author = "Afghari, {Amir Pooyan} and Simon Washington and Haque, {Md Mazharul} and Zili Li",

year = "2018",

month = jun,

doi = "10.1016/j.amar.2018.03.002",

language = "English",

volume = "18",

pages = "1--14",

journal = "Analytic Methods in Accident Research",

issn = "2213-6657",

publisher = "Elsevier",

}

TY - JOUR

T1 - A comprehensive joint econometric model of motor vehicle crashes arising from multiple sources of risk

AU - Afghari, Amir Pooyan

AU - Washington, Simon

AU - Haque, Md Mazharul

AU - Li, Zili

PY - 2018/6

Y1 - 2018/6

KW - Crash causation mechanism

KW - Data generating process

KW - Instrumental variable

KW - Joint model

KW - Random parameters model

KW - Structural equation model

UR - http://www.scopus.com/inward/record.url?scp=85049315263&partnerID=8YFLogxK

U2 - 10.1016/j.amar.2018.03.002

DO - 10.1016/j.amar.2018.03.002

M3 - Article

AN - SCOPUS:85049315263

SN - 2213-6657

VL - 18

SP - 1

EP - 14

JO - Analytic Methods in Accident Research

JF - Analytic Methods in Accident Research

ER -
