TY - GEN
T1 - A novel surrogate-assisted evolutionary algorithm applied to partition-based ensemble learning
AU - Dushatskiy, Arkadiy
AU - Alderliesten, Tanja
AU - Bosman, Peter A.N.
PY - 2021
Y1 - 2021
N2 - We propose a novel surrogate-assisted Evolutionary Algorithm for solving expensive combinatorial optimization problems. We integrate a surrogate model, which is used for fitness value estimation, into a state-of-the-art P3-like variant of the Gene-Pool Optimal Mixing Algorithm (GOMEA) and adapt the resulting algorithm for solving non-binary combinatorial problems. We test the proposed algorithm on an ensemble learning problem. Ensembling several models is a common Machine Learning technique to achieve better performance. We consider ensembles of several models trained on disjoint subsets of a dataset. Finding the best dataset partitioning is naturally a combinatorial non-binary optimization problem. Fitness function evaluations can be extremely expensive if complex models, such as Deep Neural Networks, are used as learners in an ensemble. Therefore, the number of fitness function evaluations is typically limited, necessitating expensive optimization techniques. In our experiments we use five classification datasets from the OpenML-CC18 benchmark and Support-vector Machines as learners in an ensemble. The proposed algorithm demonstrates better performance than alternative approaches, including Bayesian optimization algorithms. It manages to find better solutions using just several thousand fitness function evaluations for an ensemble learning problem with up to 500 variables.
KW - Ensemble learning
KW - Expensive combinatorial optimization
KW - Surrogate-assisted evolutionary algorithms
UR - http://www.scopus.com/inward/record.url?scp=85110096162&partnerID=8YFLogxK
U2 - 10.1145/3449639.3459306
DO - 10.1145/3449639.3459306
M3 - Conference contribution
AN - SCOPUS:85110096162
T3 - GECCO 2021 - Proceedings of the 2021 Genetic and Evolutionary Computation Conference
SP - 583
EP - 591
BT - GECCO 2021 - Proceedings of the 2021 Genetic and Evolutionary Computation Conference
PB - ACM
T2 - 2021 Genetic and Evolutionary Computation Conference, GECCO 2021
Y2 - 10 July 2021 through 14 July 2021
ER -