Amplidiff: an optimized amplicon sequencing approach to estimating lineage abundances in viral metagenomes

Jasper van Bemmelen, Davida S. Smyth, Jasmijn A. Baaijens*

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Background: Metagenomic profiling algorithms commonly rely on genomic differences between lineages, strains, or species to infer the relative abundances of sequences present in a sample. This observation plays an important role in the analysis of diverse microbial communities, where targeted sequencing of 16S and 18S rRNA, both well-known hypervariable genomic regions, has led to insights into microbial diversity and the discovery of novel organisms. However, the variable nature of discriminatory regions can also act as a double-edged sword, as the sought-after variability can make it difficult to design primers for their amplification through PCR. Moreover, the most variable regions are not necessarily the most informative regions for the purpose of differentiation; one should focus on regions that maximize the number of lineages that can be distinguished.

Results: Here we present AmpliDiff, a computational tool that simultaneously finds highly discriminatory genomic regions in viral genomes of a single species, as well as primers allowing for the amplification of these regions. We show that regions and primers found by AmpliDiff can be used to accurately estimate relative abundances of SARS-CoV-2 lineages, for example in wastewater sequencing data. We obtain errors that are comparable to those obtained when using whole genome information to estimate relative abundances. Furthermore, our results show that AmpliDiff is robust against incomplete input data and that primers designed by AmpliDiff also bind to genomes sampled months after the primers were selected.

Conclusions: With AmpliDiff we provide an effective, cost-efficient alternative to whole genome sequencing for estimating lineage abundances in viral metagenomes.
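To illustrate the idea of choosing regions that maximize the number of lineages that can be distinguished, the following is a minimal sketch of a greedy, set-cover-style selection of fixed-width windows over aligned genomes. It is not AmpliDiff's implementation: the function names (`distinguishable_pairs`, `greedy_amplicons`), the exact-mismatch criterion, and the toy sequences are all assumptions made for illustration, and primer feasibility, which AmpliDiff considers jointly, is ignored here.

```python
from itertools import combinations


def distinguishable_pairs(genomes, start, width):
    """Return the genome-index pairs whose sequences differ inside the window.

    Assumes all genomes are aligned and have equal length (hypothetical input).
    """
    pairs = set()
    for i, j in combinations(range(len(genomes)), 2):
        if genomes[i][start:start + width] != genomes[j][start:start + width]:
            pairs.add((i, j))
    return pairs


def greedy_amplicons(genomes, width, max_amplicons):
    """Greedily pick window starts that distinguish the most new genome pairs."""
    n_starts = len(genomes[0]) - width + 1
    candidates = {s: distinguishable_pairs(genomes, s, width) for s in range(n_starts)}
    covered = set()
    chosen = []
    for _ in range(max_amplicons):
        # Pick the window with the largest number of not-yet-covered pairs;
        # ties resolve to the smallest start position.
        best = max(candidates, key=lambda s: len(candidates[s] - covered))
        gain = candidates[best] - covered
        if not gain:
            break  # no remaining window adds discriminatory power
        chosen.append(best)
        covered |= gain
    return chosen, covered


# Toy aligned sequences (made up for illustration only).
toy_genomes = ["AAAACCCC", "AAAACCGC", "ATAACCCC", "ATAACCGC"]
chosen, covered = greedy_amplicons(toy_genomes, width=3, max_amplicons=2)
```

In this toy input, no single window separates all four sequences, so a second window is needed to cover the remaining pairs. This is the sense in which the most variable region alone need not be the most informative one.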

Original language: English
Article number: 126
Number of pages: 18
Journal: BMC Bioinformatics
Issue number: 1
Publication status: Published - 2024


Keywords

  • Abundance estimation
  • Amplicon sequencing
  • Primer design
  • Set cover problem


