TY - JOUR
T1 - HAT
T2 - haplotype assembly tool using short and error-prone long reads
AU - Shirali Hossein Zade, Ramin
AU - Urhan, Aysun
AU - Assis de Souza, Alvaro
AU - Singh, Akash
AU - Abeel, Thomas
PY - 2022
Y1 - 2022
N2 - MOTIVATION: Haplotypes are the set of alleles co-occurring on a single chromosome and inherited together to the next generation. Because a monoploid reference genome loses this co-occurrence information, it has limited use in associating phenotypes with allelic combinations of genotypes. Therefore, methods to reconstruct the complete haplotypes from DNA sequencing data are crucial. Recently, several attempts have been made at haplotype reconstructions, but significant limitations remain. High-quality continuous haplotypes cannot be created reliably, particularly when there are few differences between the homologous chromosomes. RESULTS: Here, we introduce HAT, a haplotype assembly tool that exploits short and long reads along with a reference genome to reconstruct haplotypes. HAT tries to take advantage of the accuracy of short reads and the length of the long reads to reconstruct haplotypes. We tested HAT on the aneuploid yeast strain Saccharomyces pastorianus CBS1483 and multiple simulated polyploid datasets of the same strain, showing that it outperforms existing tools. AVAILABILITY AND IMPLEMENTATION: https://github.com/AbeelLab/hat/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
AB - MOTIVATION: Haplotypes are the set of alleles co-occurring on a single chromosome and inherited together to the next generation. Because a monoploid reference genome loses this co-occurrence information, it has limited use in associating phenotypes with allelic combinations of genotypes. Therefore, methods to reconstruct the complete haplotypes from DNA sequencing data are crucial. Recently, several attempts have been made at haplotype reconstructions, but significant limitations remain. High-quality continuous haplotypes cannot be created reliably, particularly when there are few differences between the homologous chromosomes. RESULTS: Here, we introduce HAT, a haplotype assembly tool that exploits short and long reads along with a reference genome to reconstruct haplotypes. HAT tries to take advantage of the accuracy of short reads and the length of the long reads to reconstruct haplotypes. We tested HAT on the aneuploid yeast strain Saccharomyces pastorianus CBS1483 and multiple simulated polyploid datasets of the same strain, showing that it outperforms existing tools. AVAILABILITY AND IMPLEMENTATION: https://github.com/AbeelLab/hat/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
UR - http://www.scopus.com/inward/record.url?scp=85144585069&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btac702
DO - 10.1093/bioinformatics/btac702
M3 - Article
C2 - 36308461
AN - SCOPUS:85144585069
VL - 38
SP - 5352
EP - 5359
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 24
ER -