Finding a most parsimonious or likely tree in a network with respect to an alignment

Steven Kelk; Fabio Pardi; Celine Scornavacca; Leo van Iersel

doi:10.1007/s00285-018-1282-2

Finding a most parsimonious or likely tree in a network with respect to an alignment

Steven Kelk^*, Fabio Pardi, Celine Scornavacca, Leo van Iersel

^*Corresponding author for this work

Discrete Mathematics and Optimization

Research output: Contribution to journal › Article › Scientific › peer-review

3 Citations (Scopus)

109 Downloads (Pure)

Abstract

Phylogenetic networks are often constructed by merging multiple conflicting phylogenetic signals into a directed acyclic graph. It is interesting to explore whether a network constructed in this way induces biologically-relevant phylogenetic signals that were not present in the input. Here we show that, given a multiple alignment A for a set of taxa X and a rooted phylogenetic network N whose leaves are labelled by X, it is NP-hard to locate a most parsimonious phylogenetic tree displayed by N (with respect to A) even when the level of N—the maximum number of reticulation nodes within a biconnected component—is 1 and A contains only 2 distinct states. (If, additionally, gaps are allowed the problem becomes APX-hard.) We also show that under the same conditions, and assuming a simple binary symmetric model of character evolution, finding a most likely tree displayed by the network is NP-hard. These negative results contrast with earlier work on parsimony in which it is shown that if A consists of a single column the problem is fixed parameter tractable in the level. We conclude with a discussion of why, despite the NP-hardness, both the parsimony and likelihood problem can likely be well-solved in practice.

Original language	English
Pages (from-to)	1-21
Number of pages	21
Journal	Journal of Mathematical Biology
DOIs	https://doi.org/10.1007/s00285-018-1282-2
Publication status	E-pub ahead of print - 2018

Keywords

APX-hardness
Maximum likelihood
Maximum parsimony
NP-hardness
Phylogenetic network
Phylogenetic tree

Access to Document

10.1007/s00285-018-1282-2

46679041 - 10.1007_s00285-018-1282-2Final published version, 635 KBLicence: CC BY

Cite this

@article{4bea40d5381f495c93c96fd589717027,

title = "Finding a most parsimonious or likely tree in a network with respect to an alignment",

abstract = "Phylogenetic networks are often constructed by merging multiple conflicting phylogenetic signals into a directed acyclic graph. It is interesting to explore whether a network constructed in this way induces biologically-relevant phylogenetic signals that were not present in the input. Here we show that, given a multiple alignment A for a set of taxa X and a rooted phylogenetic network N whose leaves are labelled by X, it is NP-hard to locate a most parsimonious phylogenetic tree displayed by N (with respect to A) even when the level of N—the maximum number of reticulation nodes within a biconnected component—is 1 and A contains only 2 distinct states. (If, additionally, gaps are allowed the problem becomes APX-hard.) We also show that under the same conditions, and assuming a simple binary symmetric model of character evolution, finding a most likely tree displayed by the network is NP-hard. These negative results contrast with earlier work on parsimony in which it is shown that if A consists of a single column the problem is fixed parameter tractable in the level. We conclude with a discussion of why, despite the NP-hardness, both the parsimony and likelihood problem can likely be well-solved in practice.",

keywords = "APX-hardness, Maximum likelihood, Maximum parsimony, NP-hardness, Phylogenetic network, Phylogenetic tree",

author = "Steven Kelk and Fabio Pardi and Celine Scornavacca and {van Iersel}, Leo",

year = "2018",

doi = "10.1007/s00285-018-1282-2",

language = "English",

pages = "1--21",

journal = "Journal of Mathematical Biology",

issn = "0303-6812",

publisher = "Springer",

}

TY - JOUR

T1 - Finding a most parsimonious or likely tree in a network with respect to an alignment

AU - Kelk, Steven

AU - Pardi, Fabio

AU - Scornavacca, Celine

AU - van Iersel, Leo

PY - 2018

Y1 - 2018

N2 - Phylogenetic networks are often constructed by merging multiple conflicting phylogenetic signals into a directed acyclic graph. It is interesting to explore whether a network constructed in this way induces biologically-relevant phylogenetic signals that were not present in the input. Here we show that, given a multiple alignment A for a set of taxa X and a rooted phylogenetic network N whose leaves are labelled by X, it is NP-hard to locate a most parsimonious phylogenetic tree displayed by N (with respect to A) even when the level of N—the maximum number of reticulation nodes within a biconnected component—is 1 and A contains only 2 distinct states. (If, additionally, gaps are allowed the problem becomes APX-hard.) We also show that under the same conditions, and assuming a simple binary symmetric model of character evolution, finding a most likely tree displayed by the network is NP-hard. These negative results contrast with earlier work on parsimony in which it is shown that if A consists of a single column the problem is fixed parameter tractable in the level. We conclude with a discussion of why, despite the NP-hardness, both the parsimony and likelihood problem can likely be well-solved in practice.

AB - Phylogenetic networks are often constructed by merging multiple conflicting phylogenetic signals into a directed acyclic graph. It is interesting to explore whether a network constructed in this way induces biologically-relevant phylogenetic signals that were not present in the input. Here we show that, given a multiple alignment A for a set of taxa X and a rooted phylogenetic network N whose leaves are labelled by X, it is NP-hard to locate a most parsimonious phylogenetic tree displayed by N (with respect to A) even when the level of N—the maximum number of reticulation nodes within a biconnected component—is 1 and A contains only 2 distinct states. (If, additionally, gaps are allowed the problem becomes APX-hard.) We also show that under the same conditions, and assuming a simple binary symmetric model of character evolution, finding a most likely tree displayed by the network is NP-hard. These negative results contrast with earlier work on parsimony in which it is shown that if A consists of a single column the problem is fixed parameter tractable in the level. We conclude with a discussion of why, despite the NP-hardness, both the parsimony and likelihood problem can likely be well-solved in practice.

KW - APX-hardness

KW - Maximum likelihood

KW - Maximum parsimony

KW - NP-hardness

KW - Phylogenetic network

KW - Phylogenetic tree

UR - http://www.scopus.com/inward/record.url?scp=85052617343&partnerID=8YFLogxK

UR - http://resolver.tudelft.nl/uuid://4bea40d5-381f-495c-93c9-6fd589717027

U2 - 10.1007/s00285-018-1282-2

DO - 10.1007/s00285-018-1282-2

M3 - Article

AN - SCOPUS:85052617343

SN - 0303-6812

SP - 1

EP - 21

JO - Journal of Mathematical Biology

JF - Journal of Mathematical Biology

ER -

Finding a most parsimonious or likely tree in a network with respect to an alignment

Abstract

Keywords

Access to Document

Other files and links

Fingerprint

Cite this