
Source code and data for PhD Thesis "Optimal Decision Trees: Algorithms and Applications"

Dataset

Description

Source code and data accompanying the PhD thesis Optimal Decision Trees: Algorithms and Applications by Jacobus G. M. van der Linden.

In this thesis, we study algorithms and applications for optimal decision trees. The main contributions are as follows: (1) We develop new dynamic programming methods for learning optimal decision trees for a variety of learning tasks, such as classification, regression, prescriptive policy generation, survival analysis, cost-sensitive classification, and classification with fairness constraints. (2) We empirically investigate differences between optimal and greedy decision trees. (3) We develop a new algorithm to learn optimal decision trees directly on continuous features without requiring a (coarse) binarization first. (4) We develop a new algorithm to compute the Rashomon set (the complete set of near-optimal solutions) of decision trees.


The folders in this data repository each correspond to a chapter in the thesis. Each folder has both source files and data files. All included data is publicly available and is provided in the preprocessed form used in the papers.


This thesis is based on the following publications and preprints:


E. Arslan, J.G.M. van der Linden, S. Hoogendoorn, M. Rinaldi, and E. Demirović, “SORTeD Rashomon Sets of Sparse Decision Trees: Anytime Enumeration,” in Advances in NeurIPS (2025).
C.E. Brita, J.G.M. van der Linden, and E. Demirović, “Optimal Classification Trees for Continuous Feature Data Using Dynamic Programming with Branch-and-Bound,” in Proceedings of AAAI, 11131-11139 (2025).
J.G.M. van der Linden, D. Vos, M.M. de Weerdt, S. Verwer, and E. Demirović, “Optimal or Greedy Decision Trees? Revisiting their Objectives, Tuning, and Performance,” arXiv preprint arXiv:2409.12788 (2024).
M. van den Bos, J.G.M. van der Linden, and E. Demirović, “Piecewise Constant and Linear Regression Trees: An Optimal Dynamic Programming Approach,” in Proceedings of ICML, 48994-49007 (2024).
T. Huisman, J.G.M. van der Linden, and E. Demirović, “Optimal Survival Trees: A Dynamic Programming Approach,” in Proceedings of AAAI, 12680-12688 (2024).
J.G.M. van der Linden, M.M. de Weerdt, and E. Demirović, “Necessary and Sufficient Conditions for Optimal Decision Trees Using Dynamic Programming,” in Advances in NeurIPS, 9173-9212 (2023).
J.G.M. van der Linden, M.M. de Weerdt, and E. Demirović, “Fair and Optimal Decision Trees: A Dynamic Programming Approach,” in Advances in NeurIPS, 38899-38911 (2022).


This data repository combines the data from several other repositories:

https://gitlab.tudelft.nl/jgmvanderlinde/dpf
https://github.com/algtudelft/pystreed
https://github.com/ConSol-Lab/contree
https://github.com/TimHuisman1703/streed_sa_pipeline
https://github.com/mimvdb/regression-murtree
https://github.com/ConSol-Lab/pysort
Date made available: 16 Feb 2026
Publisher: TU Delft - 4TU.ResearchData
