Accurate assembly of full-length consensus for viral quasispecies

Bibliographic Details
Main Authors: Jia Tian, Ziyu Gao, Minghao Li, Ergude Bao, Jin Zhao
Format: Article
Language: English
Published: BMC 2025-02-01
Series: BMC Bioinformatics
Online Access: https://doi.org/10.1186/s12859-025-06045-z
collection DOAJ
description Abstract. Background: Viruses can inhabit their hosts as an ensemble of mutant strains. Reconstructing a robust consensus representation of these diverse strains is essential for recognizing genetic variation among them and for studying virulence, pathogenesis, and therapy selection. Virus genomes are typically small, often only a few thousand to several hundred thousand nucleotides. Although constructing a high-quality consensus for such genomes might seem feasible, most current assemblers generate only fragmented contigs. Assembling a single full-length consensus contig matters because it is needed to identify genetic diversity and to estimate strain abundance accurately. Results: We developed FC-Virus, a de novo genome assembly strategy specifically targeting highly diverse viral populations. FC-Virus first identifies the k-mers that are common across most viral strains and then uses these k-mers as a backbone to build a full-length consensus sequence covering the entire genome. We benchmarked FC-Virus against state-of-the-art genome assemblers. Conclusion: Experimental results confirm that FC-Virus can construct a single, accurate full-length consensus, whereas the other assemblers produce only fragmented contigs. FC-Virus is freely available at https://github.com/qdu-bioinfo/FC-Virus.git.
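The two steps named in the abstract (find k-mers shared by most strains, then use them as a backbone) can be illustrated with a toy sketch. This is not the FC-Virus implementation: the function names `shared_kmers` and `extend_right`, the `min_fraction` threshold, and the greedy extension rule are all assumptions made for illustration, and real data would be sequencing reads with errors rather than clean strain sequences.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def shared_kmers(seqs, k, min_fraction=0.8):
    """Return the k-mers present in at least min_fraction of the sequences.

    Hypothetical stand-in for "k-mers common across most viral strains".
    """
    presence = Counter()
    for seq in seqs:
        presence.update(set(kmers(seq, k)))  # count each k-mer once per sequence
    cutoff = min_fraction * len(seqs)
    return {kmer for kmer, n in presence.items() if n >= cutoff}


def extend_right(seed, backbone, k):
    """Greedily extend seed while exactly one unused backbone k-mer overlaps.

    Assumes k >= 2. Stops at a dead end or at an ambiguous branch.
    """
    contig = seed
    used = {seed}
    while True:
        suffix = contig[-(k - 1):]
        candidates = [
            suffix + base
            for base in "ACGT"
            if suffix + base in backbone and suffix + base not in used
        ]
        if len(candidates) != 1:
            return contig
        used.add(candidates[0])
        contig += candidates[0][-1]


if __name__ == "__main__":
    # Three toy "strains"; the third carries one variant near the 3' end.
    strains = ["ATGCGTACCTGA", "ATGCGTACCTGA", "ATGCGTACCTTA"]
    backbone = shared_kmers(strains, k=4, min_fraction=0.6)
    print(extend_right("ATGC", backbone, k=4))
```

In this toy input the minority variant contributes k-mers seen in only one of three sequences, so they fall below the threshold and the extension follows the majority path. Choosing `min_fraction` is the main design decision in such a sketch: too high and the backbone breaks at every variable site, too low and minority branches create ambiguity.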
id doaj-art-384d2fc34fb14108a466197441c46e04
institution Kabale University
issn 1471-2105
spelling doaj-art-384d2fc34fb14108a466197441c46e04 (record updated 2025-02-02T12:45:01Z)
BMC Bioinformatics, vol. 26, no. 1, pp. 1-15
affiliations Jia Tian: College of Computer Science and Technology, Qingdao University
Ziyu Gao: College of Computer Science and Technology, Qingdao University
Minghao Li: College of Computer Science and Technology, Qingdao University
Ergude Bao: School of Software Engineering, Beijing Jiaotong University
Jin Zhao: College of Computer Science and Technology, Qingdao University
topic Viral genome assembly
Consensus
Homologous k-mers
Viral quasispecies