TY - GEN
T1 - Evolvability degeneration in multi-objective genetic programming for symbolic regression
AU - Liu, Dazhuang
AU - Virgolin, Marco
AU - Alderliesten, Tanja
AU - Bosman, Peter A.N.
PY - 2022
Y1 - 2022
N2 - Genetic programming (GP) is one of the best approaches today to discover symbolic regression models. To find models that trade off accuracy and complexity, the non-dominated sorting genetic algorithm II (NSGA-II) is widely used. Unfortunately, it has been shown that NSGA-II can be inefficient: in early generations, low-complexity models over-replicate and take over most of the population. Consequently, studies have proposed different approaches to promote diversity. Here, we study the root of this problem, in order to design a superior approach. We find that the over-replication of low complexity-models is due to a lack of evolvability, i.e., the inability to produce offspring with improved accuracy. We therefore extend NSGA-II to track, over time, the evolvability of models of different levels of complexity. With this information, we limit how many models of each complexity level are allowed to survive the generation. We compare this new version of NSGA-II, evoNSGA-II, with the use of seven existing multi-objective GP approaches on ten widely-used data sets, and find that evoNSGA-II is equal or superior to using these approaches in almost all comparisons. Furthermore, our results confirm that evoNSGA-II behaves as intended: models that are more evolvable form the majority of the population. Code: https://github.com/dzhliu/evoNSGA-II
AB - Genetic programming (GP) is one of the best approaches today to discover symbolic regression models. To find models that trade off accuracy and complexity, the non-dominated sorting genetic algorithm II (NSGA-II) is widely used. Unfortunately, it has been shown that NSGA-II can be inefficient: in early generations, low-complexity models over-replicate and take over most of the population. Consequently, studies have proposed different approaches to promote diversity. Here, we study the root of this problem, in order to design a superior approach. We find that the over-replication of low complexity-models is due to a lack of evolvability, i.e., the inability to produce offspring with improved accuracy. We therefore extend NSGA-II to track, over time, the evolvability of models of different levels of complexity. With this information, we limit how many models of each complexity level are allowed to survive the generation. We compare this new version of NSGA-II, evoNSGA-II, with the use of seven existing multi-objective GP approaches on ten widely-used data sets, and find that evoNSGA-II is equal or superior to using these approaches in almost all comparisons. Furthermore, our results confirm that evoNSGA-II behaves as intended: models that are more evolvable form the majority of the population. Code: https://github.com/dzhliu/evoNSGA-II
KW - evolvability
KW - genetic programming
KW - multi-objective optimization
KW - Symbolic regression
UR - http://www.scopus.com/inward/record.url?scp=85135220584&partnerID=8YFLogxK
U2 - 10.1145/3512290.3528787
DO - 10.1145/3512290.3528787
M3 - Conference contribution
AN - SCOPUS:85135220584
T3 - GECCO 2022 - Proceedings of the 2022 Genetic and Evolutionary Computation Conference
SP - 973
EP - 981
BT - GECCO 2022 - Proceedings of the 2022 Genetic and Evolutionary Computation Conference
PB - Association for Computing Machinery (ACM)
T2 - 2022 Genetic and Evolutionary Computation Conference, GECCO 2022
Y2 - 9 July 2022 through 13 July 2022
ER -