TY - JOUR

T1 - Is this network proper forest-based?

AU - Huber, Katharina T

AU - van Iersel, Leo

AU - Moulton, Vincent

AU - Scholz, Guillaume E.

PY - 2025

Y1 - 2025

N2 - In evolutionary biology, networks are becoming increasingly used to represent evolutionary histories for species that have undergone non-treelike or reticulate evolution. Such networks are essentially directed acyclic graphs with a leaf set that corresponds to a collection of species, and in which non-leaf vertices with indegree 1 correspond to speciation events and vertices with indegree greater than 1 correspond to reticulate events such as gene transfer. Recently forest-based networks have been introduced, which are essentially (multi-rooted) networks that can be formed by adding some arcs to a collection of phylogenetic trees (or phylogenetic forest), where each arc is added in such a way that its ends always lie in two different trees in the forest. In this paper, we consider the complexity of deciding whether a given network is proper forest-based, that is, whether it can be formed by adding arcs to some underlying phylogenetic forest which contains the same number of trees as there are roots in the network. More specifically, we show that it is NP-complete to decide whether a tree-child network with m roots is proper forest-based, for each m≥2. Moreover, for binary networks the problem remains NP-complete when m≥3 but becomes polynomial-time solvable for m=2. We also give a fixed parameter tractable (FPT) algorithm, with parameters the maximum outdegree of a vertex, the number of roots, and the number of indegree 2 vertices, for deciding if a semi-binary network is proper forest-based. A key element in proving our results is a new characterization for when a network with m roots is proper forest-based in terms of certain m-colorings.

AB - In evolutionary biology, networks are becoming increasingly used to represent evolutionary histories for species that have undergone non-treelike or reticulate evolution. Such networks are essentially directed acyclic graphs with a leaf set that corresponds to a collection of species, and in which non-leaf vertices with indegree 1 correspond to speciation events and vertices with indegree greater than 1 correspond to reticulate events such as gene transfer. Recently forest-based networks have been introduced, which are essentially (multi-rooted) networks that can be formed by adding some arcs to a collection of phylogenetic trees (or phylogenetic forest), where each arc is added in such a way that its ends always lie in two different trees in the forest. In this paper, we consider the complexity of deciding whether a given network is proper forest-based, that is, whether it can be formed by adding arcs to some underlying phylogenetic forest which contains the same number of trees as there are roots in the network. More specifically, we show that it is NP-complete to decide whether a tree-child network with m roots is proper forest-based, for each m≥2. Moreover, for binary networks the problem remains NP-complete when m≥3 but becomes polynomial-time solvable for m=2. We also give a fixed parameter tractable (FPT) algorithm, with parameters the maximum outdegree of a vertex, the number of roots, and the number of indegree 2 vertices, for deciding if a semi-binary network is proper forest-based. A key element in proving our results is a new characterization for when a network with m roots is proper forest-based in terms of certain m-colorings.

KW - Algorithms

KW - Forest-based network

KW - Graph coloring

KW - Phylogenetic forest

KW - Phylogenetic network

UR - http://www.scopus.com/inward/record.url?scp=85192957591&partnerID=8YFLogxK

U2 - 10.1016/j.ipl.2024.106500

DO - 10.1016/j.ipl.2024.106500

M3 - Article

AN - SCOPUS:85192957591

SN - 0020-0190

VL - 187

JO - Information Processing Letters

JF - Information Processing Letters

M1 - 106500

ER -