TY - JOUR
T1 - Computational Methods for Strain-Level Microbial Detection in Colony and Metagenome Sequencing Data
AU - Anyansi, Christine
AU - Straub, Timothy J.
AU - Manson, Abigail L.
AU - Earl, Ashlee M.
AU - Abeel, Thomas
PY - 2020
Y1 - 2020
N2 - Metagenomic sequencing is a powerful tool for examining the diversity and complexity of microbial communities. Most widely used tools for taxonomic profiling of metagenomic sequence data allow for a species-level overview of the composition. However, individual strains within a species can differ greatly in key genotypic and phenotypic characteristics, such as drug resistance, virulence and growth rate. Therefore, the ability to resolve microbial communities down to the level of individual strains within a species is critical to interpreting metagenomic data for clinical and environmental applications, where identifying a particular strain, or tracking a particular strain across a set of samples, can help aid in clinical diagnosis and treatment, or in characterizing yet unstudied strains across novel environmental locations. Recently published approaches have begun to tackle the problem of resolving strains within a particular species in metagenomic samples. In this review, we present an overview of these new algorithms and their uses, including methods based on assembly reconstruction and methods operating with or without a reference database. While existing metagenomic analysis methods show reasonable performance at the species and higher taxonomic levels, identifying closely related strains within a species presents a bigger challenge, due to the diversity of databases, genetic relatedness, and goals when conducting these analyses. Selection of which metagenomic tool to employ for a specific application should be performed on a case-by case basis as these tools have strengths and weaknesses that affect their performance on specific tasks. A comprehensive benchmark across different use case scenarios is vital to validate performance of these tools on microbial samples. Because strain-level metagenomic analysis is still in its infancy, development of more fine-grained, high-resolution algorithms will continue to be in demand for the future.
AB - Metagenomic sequencing is a powerful tool for examining the diversity and complexity of microbial communities. Most widely used tools for taxonomic profiling of metagenomic sequence data allow for a species-level overview of the composition. However, individual strains within a species can differ greatly in key genotypic and phenotypic characteristics, such as drug resistance, virulence and growth rate. Therefore, the ability to resolve microbial communities down to the level of individual strains within a species is critical to interpreting metagenomic data for clinical and environmental applications, where identifying a particular strain, or tracking a particular strain across a set of samples, can help aid in clinical diagnosis and treatment, or in characterizing yet unstudied strains across novel environmental locations. Recently published approaches have begun to tackle the problem of resolving strains within a particular species in metagenomic samples. In this review, we present an overview of these new algorithms and their uses, including methods based on assembly reconstruction and methods operating with or without a reference database. While existing metagenomic analysis methods show reasonable performance at the species and higher taxonomic levels, identifying closely related strains within a species presents a bigger challenge, due to the diversity of databases, genetic relatedness, and goals when conducting these analyses. Selection of which metagenomic tool to employ for a specific application should be performed on a case-by case basis as these tools have strengths and weaknesses that affect their performance on specific tasks. A comprehensive benchmark across different use case scenarios is vital to validate performance of these tools on microbial samples. Because strain-level metagenomic analysis is still in its infancy, development of more fine-grained, high-resolution algorithms will continue to be in demand for the future.
KW - bioinformatics
KW - metagenomics
KW - methods review
KW - microbial detection
KW - strain-level classification
KW - whole genome sequencing
UR - http://www.scopus.com/inward/record.url?scp=85090039663&partnerID=8YFLogxK
U2 - 10.3389/fmicb.2020.01925
DO - 10.3389/fmicb.2020.01925
M3 - Review article
AN - SCOPUS:85090039663
SN - 1664-302X
VL - 11
SP - 1
EP - 17
JO - Frontiers in Microbiology
JF - Frontiers in Microbiology
M1 - 1925
ER -