Maximising informativeness for target capture-based phylogenomics in Erica (Ericaceae)

Plant phylogenetics has been revolutionised in the genomic era, with target capture acting as the primary workhorse of most recent research in the new field of phylogenomics. Target capture (aka Hyb-Seq) allows researchers to sequence hundreds of genomic regions (loci) of their choosing, at relative...

Bibliographic Details
Main Authors: Seth D. Musker, Nicolai M. Nürk, Michael D. Pirie
Format: Article
Language: English
Published: Pensoft Publishers 2025-01-01
Series: PhytoKeys
Online Access: https://phytokeys.pensoft.net/article/136373/download/pdf/
collection DOAJ
description Plant phylogenetics has been revolutionised in the genomic era, with target capture acting as the primary workhorse of most recent research in the new field of phylogenomics. Target capture (aka Hyb-Seq) allows researchers to sequence hundreds of genomic regions (loci) of their choosing, at relatively low cost per sample, from which to derive phylogenetically informative data. Although this highly flexible and widely applicable method has rightly earned its place as the field’s de facto standard, it does not come without its challenges. In particular, users have to specify which loci to sequence—a surprisingly difficult task, especially when working with non-model groups, as it requires pre-existing genomic resources in the form of assembled genomes and/or transcriptomes. In the absence of taxon-specific genomic resources, target sets exist that are designed to work across broad taxonomic scales. However, the highly conserved loci that they target may lack informativeness for difficult phylogenetic problems, such as that presented by the rapid radiation of Erica in southern Africa. We designed a target set for Erica phylogenomics intended to maximise informativeness and minimise paralogy while maintaining universality by including genes from the widely used Angiosperms353 set. Comprising just over 300 genes, the targets had excellent recovery rates in roughly 90 Erica species as well as outgroups from Calluna, Daboecia, and Rhododendron, and had high information content as measured by parsimony informative sites and Quartet Internode Resolution Probability (QIRP) at shallow nodes. Notably, QIRP was positively correlated with intron content, while including introns in targets—rather than recovering them via exon-flanking “bycatch”—substantially improved intron recovery. 
Overall, our results show the value of building a custom target set, and we provide a suite of open-source tools that can be used to replicate our approach in other groups (https://github.com/SethMusker/TargetVet).
id doaj-art-8a706e1087b84b35a140996c1f468038
institution Kabale University
issn 1314-2003
spelling doaj-art-8a706e1087b84b35a140996c1f468038; 2025-01-18T08:31:00Z; eng; Pensoft Publishers; PhytoKeys; ISSN 1314-2003; 2025-01-01; vol. 251, pp. 87-118; doi:10.3897/phytokeys.251.136373; article 136373
Seth D. Musker (University of Bayreuth); Nicolai M. Nürk (University of Bayreuth); Michael D. Pirie (The University of Bergen)