Hybridization Number on Three Rooted Binary Trees is EPT

Leo van Iersel; Steven Kelk; Nela Lekić; Chris Whidden; Norbert Zeh

doi:10.1137/15M1036579

Discrete Mathematics and Optimization

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Phylogenetic networks are leaf-labeled directed acyclic graphs that are used to describe nontreelike evolutionary histories and are thus a generalization of phylogenetic trees. The hybridization number of a phylogenetic network is the sum of all in-degrees minus the number of nodes plus one. The hybridization number problem takes as input a collection of rooted binary phylogenetic trees and asks to construct a phylogenetic network that contains an embedding of each of the input trees and has the smallest possible hybridization number. We present an algorithm for the hybridization number problem on three binary phylogenetic trees on n leaves that runs in time O(c^k poly(n)), with k the hybridization number of an optimal network and c some (astronomical) constant. For the case of two trees, an algorithm with running time O(3.18^k n) was proposed before, whereas an algorithm with running time O(c^k poly(n)), also called an EPT algorithm, had prior to this article remained elusive for more than two trees. The algorithm for two trees uses the close connection to acyclic agreement forests to achieve a linear exponent in the running time, while previous algorithms for more than two trees (explicitly or implicitly) relied on a brute force search through all possible underlying network topologies, leading to running times that are not O(c^k poly(n)) for any c. The connection to acyclic agreement forests is much weaker for more than two trees, so even given the right agreement forest, the reconstruction of the network poses major challenges. We prove novel structural results that allow us to reconstruct a network without having to guess the underlying topology. Our techniques generalize to more than three input trees with the exception of one key lemma that maps nodes in the network to tree nodes in order to minimize the amount of guessing involved in constructing the network. The main open problem therefore is to prove results that establish such a mapping for more than three trees.
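The definition of the hybridization number given in the abstract (sum of all in-degrees minus the number of nodes plus one) can be illustrated with a minimal sketch. This is only an illustration of the definition, not the algorithm from the article; the edge-list representation, the function name `hybridization_number`, and the two example graphs are assumptions made here.

```python
from collections import defaultdict


def hybridization_number(edges):
    """Sum of all in-degrees minus the number of nodes plus one.

    `edges` is a list of (parent, child) pairs describing a rooted DAG.
    This input format is an assumption made for illustration only.
    """
    nodes = set()
    indegree = defaultdict(int)
    for parent, child in edges:
        nodes.update((parent, child))
        indegree[child] += 1
    return sum(indegree.values()) - len(nodes) + 1


# A rooted binary tree on leaves x, y, z: every non-root node has in-degree 1.
tree = [("r", "a"), ("r", "z"), ("a", "x"), ("a", "y")]

# A network in which node h has two parents (a and b), i.e. one reticulation.
network = [
    ("r", "a"), ("r", "b"),
    ("a", "x"), ("a", "h"),
    ("b", "h"), ("b", "z"),
    ("h", "y"),
]

print(hybridization_number(tree))     # 0
print(hybridization_number(network))  # 1
```

Since every edge contributes exactly one unit of in-degree, the quantity equals the number of edges minus the number of nodes plus one, so a tree always has hybridization number 0.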

Original language	English
Pages (from-to)	1607-1631
Number of pages	25
Journal	SIAM Journal on Discrete Mathematics
Volume	30
Issue number	3
DOIs	https://doi.org/10.1137/15M1036579
Publication status	Published - 2016

Keywords

hybridization number
rooted phylogenetic tree
rooted phylogenetic network
reticulate evolution
agreement forest
fixed parameter tractability

Access to Document

10.1137/15M1036579

10235106 (Final published version, 644 KB; Licence: Unspecified)

Cite this

@article{b42f38edf02640f594312492791394e9,

title = "Hybridization Number on Three Rooted Binary Trees is EPT",

abstract = "Phylogenetic networks are leaf-labeled directed acyclic graphs that are used to describe nontreelike evolutionary histories and are thus a generalization of phylogenetic trees. The hybridization number of a phylogenetic network is the sum of all in-degrees minus the number of nodes plus one. The hybridization number problem takes as input a collection of rooted binary phylogenetic trees and asks to construct a phylogenetic network that contains an embedding of each of the input trees and has the smallest possible hybridization number. We present an algorithm for the hybridization number problem on three binary phylogenetic trees on $n$ leaves that runs in time $O(c^k \mathrm{poly}(n))$, with $k$ the hybridization number of an optimal network and $c$ some (astronomical) constant. For the case of two trees, an algorithm with running time $O(3.18^k n)$ was proposed before, whereas an algorithm with running time $O(c^k \mathrm{poly}(n))$, also called an EPT algorithm, had prior to this article remained elusive for more than two trees. The algorithm for two trees uses the close connection to acyclic agreement forests to achieve a linear exponent in the running time, while previous algorithms for more than two trees (explicitly or implicitly) relied on a brute force search through all possible underlying network topologies, leading to running times that are not $O(c^k \mathrm{poly}(n))$ for any $c$. The connection to acyclic agreement forests is much weaker for more than two trees, so even given the right agreement forest, the reconstruction of the network poses major challenges. We prove novel structural results that allow us to reconstruct a network without having to guess the underlying topology. Our techniques generalize to more than three input trees with the exception of one key lemma that maps nodes in the network to tree nodes in order to minimize the amount of guessing involved in constructing the network. The main open problem therefore is to prove results that establish such a mapping for more than three trees.",

keywords = "hybridization number, rooted phylogenetic tree, rooted phylogenetic network, reticulate evolution, agreement forest, fixed parameter tractability",

author = "{van Iersel}, Leo and Steven Kelk and Nela Leki{\'c} and Chris Whidden and Norbert Zeh",

year = "2016",

doi = "10.1137/15M1036579",

language = "English",

volume = "30",

pages = "1607--1631",

journal = "SIAM Journal on Discrete Mathematics",

issn = "0895-4801",

publisher = "Society for Industrial and Applied Mathematics",

number = "3",

}

TY - JOUR

T1 - Hybridization Number on Three Rooted Binary Trees is EPT

AU - van Iersel, Leo

AU - Kelk, Steven

AU - Lekić, Nela

AU - Whidden, Chris

AU - Zeh, Norbert

PY - 2016

Y1 - 2016

AB - Phylogenetic networks are leaf-labeled directed acyclic graphs that are used to describe nontreelike evolutionary histories and are thus a generalization of phylogenetic trees. The hybridization number of a phylogenetic network is the sum of all in-degrees minus the number of nodes plus one. The hybridization number problem takes as input a collection of rooted binary phylogenetic trees and asks to construct a phylogenetic network that contains an embedding of each of the input trees and has the smallest possible hybridization number. We present an algorithm for the hybridization number problem on three binary phylogenetic trees on n leaves that runs in time O(c^k poly(n)), with k the hybridization number of an optimal network and c some (astronomical) constant. For the case of two trees, an algorithm with running time O(3.18^k n) was proposed before, whereas an algorithm with running time O(c^k poly(n)), also called an EPT algorithm, had prior to this article remained elusive for more than two trees. The algorithm for two trees uses the close connection to acyclic agreement forests to achieve a linear exponent in the running time, while previous algorithms for more than two trees (explicitly or implicitly) relied on a brute force search through all possible underlying network topologies, leading to running times that are not O(c^k poly(n)) for any c. The connection to acyclic agreement forests is much weaker for more than two trees, so even given the right agreement forest, the reconstruction of the network poses major challenges. We prove novel structural results that allow us to reconstruct a network without having to guess the underlying topology. Our techniques generalize to more than three input trees with the exception of one key lemma that maps nodes in the network to tree nodes in order to minimize the amount of guessing involved in constructing the network. The main open problem therefore is to prove results that establish such a mapping for more than three trees.

KW - hybridization number

KW - rooted phylogenetic tree

KW - rooted phylogenetic network

KW - reticulate evolution

KW - agreement forest

KW - fixed parameter tractability

UR - http://resolver.tudelft.nl/uuid:b42f38ed-f026-40f5-9431-2492791394e9

U2 - 10.1137/15M1036579

DO - 10.1137/15M1036579

M3 - Article

SN - 0895-4801

VL - 30

SP - 1607

EP - 1631

JO - SIAM Journal on Discrete Mathematics

JF - SIAM Journal on Discrete Mathematics

IS - 3

ER -
