TY - JOUR
T1 - Information theoretic-based sampling of observations
AU - van Cranenburgh, Sander
AU - Bliemer, Michiel C.J.
N1 - Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
PY - 2018
Y1 - 2018
AB - Due to the surge in the amount of data being collected, analysts are increasingly faced with very large data sets. Estimation of sophisticated discrete choice models (such as Mixed Logit models) based on these typically large data sets can be computationally burdensome, or even infeasible. Hitherto, analysts have tried to overcome these computational burdens by reverting to less computationally demanding choice models or by taking advantage of the increase in computational resources. In this paper we take a different approach: we develop a new method called Sampling of Observations (SoO), which scales down the size of the choice data set prior to estimation. More specifically, based on information-theoretic principles, this method extracts a subset of observations from the data that is much smaller in volume than the original data set, yet produces statistically nearly identical results. We show that this method can be used to estimate sophisticated discrete choice models on data sets that were originally too large for sophisticated choice analysis.
UR - http://resolver.tudelft.nl/uuid:ea90ebb2-a5d4-4a1e-acd0-e7550144c6a4
UR - http://www.scopus.com/inward/record.url?scp=85044953042&partnerID=8YFLogxK
U2 - 10.1016/j.jocm.2018.02.003
DO - 10.1016/j.jocm.2018.02.003
M3 - Article
AN - SCOPUS:85044953042
SN - 1755-5345
JO - Journal of Choice Modelling
JF - Journal of Choice Modelling
ER -