TY - JOUR
T1 - On Developing a Driver Identification Methodology Using In-Vehicle Data Recorders
AU - Moreira-Matias, Luis
AU - Farah, Haneen
PY - 2017/9/1
Y1 - 2017/9/1
N2 - Recently, cutting-edge technologies to facilitate data collection have emerged on a large scale. One of the most prominent is the in-vehicle data recorder (IVDR). There are multiple ways to assign the IVDR's data to the different drivers who share the same vehicle. Irrespective of their level of sophistication, all of these technologies still suffer from considerable limitations in accuracy. The purpose of this paper is to propose a methodology that can identify the driver of a given trip using historical trip-based data. To do so, an advanced machine learning pipeline is proposed. The main goal is to take advantage of highly available data - such as driver-labeled floating car data collected by an IVDR - to build a pattern-based algorithm able to identify the trip's driver category when the driver's true identity is unknown. This stepwise process includes feature generation/selection, multiple heterogeneous explanatory models, and an ensemble approach (i.e., stacked generalization) to reduce their generalization error. Our goal is to provide an inexpensive alternative to existing driver identification technologies, which can also serve to complement and/or validate them. Experiments conducted over a real-world case study from Israel uncover the potential of this idea: it obtained an accuracy of 88% and a Cohen's Kappa agreement score of 74%.
AB - Recently, cutting-edge technologies to facilitate data collection have emerged on a large scale. One of the most prominent is the in-vehicle data recorder (IVDR). There are multiple ways to assign the IVDR's data to the different drivers who share the same vehicle. Irrespective of their level of sophistication, all of these technologies still suffer from considerable limitations in accuracy. The purpose of this paper is to propose a methodology that can identify the driver of a given trip using historical trip-based data. To do so, an advanced machine learning pipeline is proposed. The main goal is to take advantage of highly available data - such as driver-labeled floating car data collected by an IVDR - to build a pattern-based algorithm able to identify the trip's driver category when the driver's true identity is unknown. This stepwise process includes feature generation/selection, multiple heterogeneous explanatory models, and an ensemble approach (i.e., stacked generalization) to reduce their generalization error. Our goal is to provide an inexpensive alternative to existing driver identification technologies, which can also serve to complement and/or validate them. Experiments conducted over a real-world case study from Israel uncover the potential of this idea: it obtained an accuracy of 88% and a Cohen's Kappa agreement score of 74%.
KW - classification
KW - identification methods
KW - in-vehicle data recorders
KW - machine learning
KW - stacked generalization
KW - supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85009997629&partnerID=8YFLogxK
UR - http://resolver.tudelft.nl/uuid:d4d58bf3-c9d4-4399-b20c-6c53c489939f
U2 - 10.1109/TITS.2016.2639361
DO - 10.1109/TITS.2016.2639361
M3 - Article
AN - SCOPUS:85009997629
VL - 18
SP - 2387
EP - 2396
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
SN - 1524-9050
IS - 9
M1 - 7819486
ER -