Symbolic regression driven by training data and prior knowledge

Jiří Kubalík, Erik Derner, Robert Babuška

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific > peer-review

Abstract

In symbolic regression, the search for analytic models is typically driven purely by the prediction error observed on the training data samples. However, when the data samples do not sufficiently cover the input space, the prediction error does not provide sufficient guidance toward desired models. Standard symbolic regression techniques then yield models that are partially incorrect, for instance, in terms of their steady-state characteristics or local behavior. If these properties were considered already during the search process, more accurate and relevant models could be produced. We propose a multi-objective symbolic regression approach that is driven by both the training data and the prior knowledge of the properties the desired model should manifest. The properties given in the form of formal constraints are internally represented by a set of discrete data samples on which candidate models are exactly checked. The proposed approach was experimentally evaluated on three test problems with results clearly demonstrating its capability to evolve realistic models that fit the training data well while complying with the prior knowledge of the desired model characteristics at the same time. It outperforms standard symbolic regression by several orders of magnitude in terms of the mean squared deviation from a reference model.
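The two-objective idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, the choice of a non-negativity constraint, and the squared-violation measure are all assumptions made for illustration. The sketch scores a candidate model on (1) the mean squared error over the training samples and (2) the violation of a prior-knowledge constraint checked on a separate discrete set of constraint samples, and then compares candidates by Pareto dominance.

```python
import numpy as np

def training_mse(model, X, y):
    """Objective 1: mean squared error on the training samples."""
    return float(np.mean((model(X) - y) ** 2))

def constraint_violation(model, X_c, lower, upper):
    """Objective 2: mean squared violation of output bounds, checked on
    a discrete set of constraint samples X_c (an assumed violation measure)."""
    pred = model(X_c)
    below = np.maximum(lower - pred, 0.0)   # amount below the lower bound
    above = np.maximum(pred - upper, 0.0)   # amount above the upper bound
    return float(np.mean((below + above) ** 2))

def dominates(a, b):
    """Pareto dominance for tuples of objectives that are all minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Hypothetical toy problem: the training data cover only x in [0, 1].
X_train = np.linspace(0.0, 1.0, 20)
y_train = X_train ** 2

# Assumed prior knowledge: the output is non-negative everywhere.
# The constraint samples cover a wider range than the training data.
X_con = np.linspace(-2.0, 2.0, 41)
LOWER, UPPER = 0.0, np.inf

def objectives(model):
    """Return the pair (training error, constraint violation) for a model."""
    return (training_mse(model, X_train, y_train),
            constraint_violation(model, X_con, LOWER, UPPER))

def candidate_a(x):
    return x ** 2   # satisfies the constraint on all constraint samples

def candidate_b(x):
    return x ** 3   # negative for x < 0, so it violates the constraint

print(objectives(candidate_a), objectives(candidate_b))
print(dominates(objectives(candidate_a), objectives(candidate_b)))
```

In this toy setting the constraint samples expose a defect of `candidate_b` (negative outputs for negative inputs) that the training data alone cannot reveal, because no training sample lies in that region.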

Original language: English
Title of host publication: Proceedings of the 2020 Genetic and Evolutionary Computation Conference, GECCO 2020
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 958-966
ISBN (Electronic): 978-1-4503-7128-5
DOIs
Publication status: Published - 2020
Event: 2020 Genetic and Evolutionary Computation Conference, GECCO 2020 - Cancun, Mexico
Duration: 8 Jul 2020 to 12 Jul 2020

Conference

Conference: 2020 Genetic and Evolutionary Computation Conference, GECCO 2020
Country/Territory: Mexico
City: Cancun
Period: 8/07/20 to 12/07/20

Keywords

  • Genetic programming
  • Model learning
  • Multi-objective optimization
  • Symbolic regression
