getphylo: rapid and automatic generation of multi-locus phylogenetic trees

Abstract Background The increasing amount of genomic data calls for tools that can create genome-scale phylogenies quickly and efficiently. Existing tools rely on large reference databases or require lengthy de novo calculations to identify orthologues, meaning that they have long run times and are...

Bibliographic Details
Main Authors: T. J. Booth, S. Shaw, P. Cruz-Morales, T. Weber
Format: Article
Language: English
Published: BMC 2025-01-01
Series: BMC Bioinformatics
Subjects: Phylogenetics, Software, Evolution, Orthology, Taxonomy, Genomics
Online Access: https://doi.org/10.1186/s12859-025-06035-1
author T. J. Booth
S. Shaw
P. Cruz-Morales
T. Weber
collection DOAJ
description Abstract. Background: The increasing amount of genomic data calls for tools that can create genome-scale phylogenies quickly and efficiently. Existing tools rely on large reference databases or require lengthy de novo calculations to identify orthologues, meaning that they have long run times and are limited in their taxonomic scope. To address this, we created getphylo, a Python tool for the rapid generation of phylogenetic trees de novo from annotated sequences. Results: We present getphylo (GenBank to Phylogeny), a tool that automatically builds phylogenetic trees from annotated genomes alone. Orthologues are identified heuristically by searching for singletons (single-copy genes) across all input genomes, and the phylogeny is inferred by maximum likelihood from a concatenated alignment of all coding sequences. We benchmarked getphylo thoroughly against two existing tools, autoMLST and GTDB-Tk, to show that it can produce trees of comparable quality in a fraction of the time. We also demonstrate the flexibility of getphylo across four case studies, including bacterial and eukaryotic genomes and biosynthetic gene clusters. Conclusions: getphylo is a quick and reliable tool for the automated generation of genome-scale phylogenetic trees. getphylo can produce phylogenies comparable to other software in a fraction of the time, without the need for large local databases or intense computation. getphylo can rapidly identify orthologues from a wide variety of datasets regardless of taxonomic or genomic scope. The usability, speed, and flexibility of getphylo make it a valuable addition to the phylogenetics toolkit.
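The singleton heuristic and the concatenation step described in the abstract can be illustrated with a minimal sketch. This is not getphylo's own code: it assumes that genes have already been assigned to families and that sequences within a family are already aligned, and the names `find_singletons`, `concatenate`, and the toy data are hypothetical.

```python
from collections import Counter


def find_singletons(genomes):
    """Return the family ids that occur exactly once in every genome."""
    shared = None
    for genes in genomes.values():
        counts = Counter(family for family, _ in genes)
        single = {family for family, n in counts.items() if n == 1}
        # Keep only families that are single-copy in all genomes seen so far.
        shared = single if shared is None else shared & single
    return sorted(shared or [])


def concatenate(genomes, families):
    """Join each genome's sequences for the chosen families in a fixed order."""
    result = {}
    for name, genes in genomes.items():
        by_family = {family: seq for family, seq in genes if family in families}
        result[name] = "".join(by_family[f] for f in families)
    return result


# Hypothetical toy input: genome name -> list of (family id, aligned sequence).
toy = {
    "genome_A": [("famA", "MKT"), ("famB", "GLV"), ("famC", "AAA"), ("famC", "AAC")],
    "genome_B": [("famA", "MRT"), ("famB", "GIV"), ("famC", "AAG")],
    "genome_C": [("famA", "MKS"), ("famB", "GLV")],
}

loci = find_singletons(toy)            # famC is duplicated in A and absent from C
supermatrix = concatenate(toy, loci)   # one concatenated sequence per genome
```

In this sketch the concatenated sequences would then be passed to a maximum-likelihood tree builder, a step that is outside the scope of the example.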
format Article
id doaj-art-8481adb84848493cb2631712b1067d20
institution Kabale University
issn 1471-2105
language English
publishDate 2025-01-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj-art-8481adb84848493cb2631712b1067d20
2025-01-19T12:41:00Z
eng
BMC
BMC Bioinformatics
1471-2105
2025-01-01
26111
10.1186/s12859-025-06035-1
getphylo: rapid and automatic generation of multi-locus phylogenetic trees
T. J. Booth (The Novo Nordisk Foundation Center for Biosustainability, Danmarks Tekniske Universitet)
S. Shaw (The Novo Nordisk Foundation Center for Biosustainability, Danmarks Tekniske Universitet)
P. Cruz-Morales (The Novo Nordisk Foundation Center for Biosustainability, Danmarks Tekniske Universitet)
T. Weber (The Novo Nordisk Foundation Center for Biosustainability, Danmarks Tekniske Universitet)
https://doi.org/10.1186/s12859-025-06035-1
Phylogenetics
Software
Evolution
Orthology
Taxonomy
Genomics
title getphylo: rapid and automatic generation of multi-locus phylogenetic trees
topic Phylogenetics
Software
Evolution
Orthology
Taxonomy
Genomics
url https://doi.org/10.1186/s12859-025-06035-1