Information theoretic-based sampling of observations

Sander van Cranenburgh*, Michiel C.J. Bliemer

*Corresponding author for this work

Research output: Contribution to journalArticleScientificpeer-review

5 Citations (Scopus)
16 Downloads (Pure)


Due to the surge in the amount of data that are being collected, analysts are increasingly faced with very large data sets. Estimation of sophisticated discrete choice models (such as Mixed Logit models) based on these typically large data sets can be computationally burdensome, or even infeasible. Hitherto, analysts tried to overcome these computational burdens by reverting to less computationally demanding choice models or by taking advantage of the increase in computational resources. In this paper we take a different approach: we develop a new method called Sampling of Observations (SoO) which scales down the size of the choice data set, prior to the estimation. More specifically, based on information-theoretic principles this method extracts a subset of observations from the data which is much smaller in volume than the original data set, yet produces statistically nearly identical results. We show that this method can be used to estimate sophisticated discrete choice models based on data sets that were originally too large to conduct sophisticated choice analysis.

Original languageEnglish
Number of pages34
JournalJournal of Choice Modelling
Publication statusPublished - 2018

Bibliographical note

Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.


Dive into the research topics of 'Information theoretic-based sampling of observations'. Together they form a unique fingerprint.

Cite this