Evolutionary learning of syntax patterns for genic interaction extraction

Alberto Bartoli, Andrea De Lorenzo, Eric Medvet, Fabiano Tarlao, Marco Virgolin

Research output: Chapter in Book/Conference proceedings/Edited volume > Conference contribution > Scientific > peer-review

2 Citations (Scopus)

Abstract

There is increasing interest in techniques for automatic relation extraction from unstructured text. The biomedical domain, in particular, may benefit greatly from these techniques because of the huge and ever-increasing number of scientific publications describing observed phenomena of potential clinical interest. In this paper, we consider the problem of automatically identifying sentences that contain interactions between genes and proteins, based solely on a dictionary of genes and proteins and a small set of sample sentences in natural language. We propose an evolutionary technique for learning a classifier that is capable of detecting the desired sentences within scientific publications with high accuracy. The key feature of our proposal, which is internally based on Genetic Programming, is the construction of a model of the relevant syntax patterns in terms of standard part-of-speech annotations. The model consists of a set of regular expressions that are learned automatically despite the large alphabet size involved. We assess our approach on two realistic datasets and obtain 74% accuracy, a value that is high enough to be of practical interest and in line with significant baseline methods.
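To make the idea of regular expressions over part-of-speech annotations concrete, here is a minimal sketch. It is not the authors' implementation: the one-character tag encoding, the two hand-written patterns, the dictionary entries `geneA` and `geneB`, and the example sentences are all assumptions made for illustration. In the paper the patterns are learned by Genetic Programming rather than written by hand, and the part-of-speech tags would come from a tagger rather than being supplied directly.

```python
import re

# Hypothetical dictionary of gene/protein names (illustrative entries only).
GENE_DICTIONARY = {"geneA", "geneB"}

# Hypothetical mapping from annotation labels to single characters, so that a
# sentence becomes a string that a standard regex engine can scan.
SYMBOLS = {"GENE": "G", "VERB": "V", "NOUN": "N", "ADP": "P", "DET": "D", "ADJ": "J"}


def encode(tagged_sentence):
    """Turn (token, pos_tag) pairs into a string with one symbol per token."""
    symbols = []
    for token, pos in tagged_sentence:
        # Dictionary entries override the part-of-speech tag.
        label = "GENE" if token in GENE_DICTIONARY else pos
        symbols.append(SYMBOLS.get(label, "O"))  # "O" = any other tag
    return "".join(symbols)


# Hand-written stand-ins for learned patterns (assumption, not from the paper).
PATTERNS = [re.compile(r"GV[DJ]*G"), re.compile(r"G.*VP.*G")]


def contains_interaction(tagged_sentence):
    """Flag a sentence if any pattern matches its encoded tag sequence."""
    encoded = encode(tagged_sentence)
    return any(p.search(encoded) for p in PATTERNS)


positive = [("geneA", "NOUN"), ("activates", "VERB"), ("the", "DET"), ("geneB", "NOUN")]
negative = [("the", "DET"), ("geneA", "NOUN"), ("is", "VERB"), ("small", "ADJ")]
```

Under these assumptions, `positive` is encoded as `GVDG` and is flagged by the first pattern, while `negative` is encoded as `DGVJ` and matches neither pattern. A classifier of this shape stays cheap to apply, since each sentence needs only a dictionary lookup per token and a handful of regex searches.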

Original language: English
Title of host publication: GECCO 2015 - Proceedings of the 2015 Genetic and Evolutionary Computation Conference
Editors: Sara Silva
Publisher: Association for Computing Machinery (ACM)
Pages: 1183-1190
Number of pages: 8
ISBN (Electronic): 9781450334723
DOIs: https://doi.org/10.1145/2739480.2754706
Publication status: Published - 11 Jul 2015
Externally published: Yes
Event: 16th Genetic and Evolutionary Computation Conference, GECCO 2015 - Madrid, Spain
Duration: 11 Jul 2015 to 15 Jul 2015

Publication series

Name: GECCO 2015 - Proceedings of the 2015 Genetic and Evolutionary Computation Conference

Conference

Conference: 16th Genetic and Evolutionary Computation Conference, GECCO 2015
Country: Spain
City: Madrid
Period: 11/07/15 to 15/07/15

Keywords

  • Genetic Programming
  • Machine learning
  • Programming by example
  • Regular expressions

Cite this

Bartoli, A., De Lorenzo, A., Medvet, E., Tarlao, F., & Virgolin, M. (2015). Evolutionary learning of syntax patterns for genic interaction extraction. In S. Silva (Ed.), GECCO 2015 - Proceedings of the 2015 Genetic and Evolutionary Computation Conference (pp. 1183-1190). (GECCO 2015 - Proceedings of the 2015 Genetic and Evolutionary Computation Conference). Association for Computing Machinery (ACM). https://doi.org/10.1145/2739480.2754706