Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing

Abstract Among molecular-based techniques for fungal identification, Sanger sequencing of the primary universal fungal DNA barcode, the internal transcribed spacer (ITS) region (ITS1, 5.8S, ITS2), is commonly used in clinical routine laboratories due to its simplicity, universality, efficacy, and af...

Bibliographic Details
Main Authors:	Nattapong Langsiri, Navaporn Worasilchai, Laszlo Irinyi, Piroon Jenjaroenpun, Thidathip Wongsurawat, Janet Jennifer Luangsa-ard, Wieland Meyer, Ariya Chindamporn
Format:	Article
Language:	English
Published:	BMC 2023-09-01
Series:	IMA Fungus
Subjects:	Internal transcribed spacer (ITS); Targeted long-read sequencing; Nanopore technology; Fungal identification
Online Access:	https://doi.org/10.1186/s43008-023-00125-6

author	Nattapong Langsiri, Navaporn Worasilchai, Laszlo Irinyi, Piroon Jenjaroenpun, Thidathip Wongsurawat, Janet Jennifer Luangsa-ard, Wieland Meyer, Ariya Chindamporn
collection	DOAJ
description	Abstract: Among molecular techniques for fungal identification, Sanger sequencing of the primary universal fungal DNA barcode, the internal transcribed spacer (ITS) region (ITS1, 5.8S, ITS2), is commonly used in routine clinical laboratories because of its simplicity, universality, efficacy, and affordability. However, Sanger sequencing cannot resolve mixed ITS sequences in cases of mixed infection. To overcome this limitation, different high-throughput sequencing technologies have been explored. Nanopore-based technology is now one of the most promising long-read sequencing technologies on the market, as it has the potential to sequence the full-length ITS region in a single read. In this study, we established a workflow for species identification using sequences of the entire ITS region generated by nanopore sequencing of both pure yeast isolates and mock mixed-species read sets generated under different scenarios. The species used in this study included Candida albicans (n = 2), Candida tropicalis (n = 1), Nakaseomyces glabratus (formerly Candida glabrata) (n = 1), Trichosporon asahii (n = 2), Pichia kudriavzevii (formerly Candida krusei) (n = 1), and Cryptococcus neoformans (n = 1). In a comparison of methods for generating the consensus sequence used for species identification, read clustering with a modified version of the NanoCLUST pipeline was more sensitive than Canu or VSEARCH: it classified species accurately from a lower-abundance cluster of reads (3% abundance, compared to 10% with VSEARCH). The modified NanoCLUST also produced fewer classified clusters than VSEARCH, making the subsequent BLAST+ analysis faster. Subsampling, which reduces the size of the datasets approximately tenfold, did not significantly affect the identification results in terms of the identified species name, percent identity, query coverage, percentage of reads in the classified cluster, or number of clusters. Because the method can distinguish mixed species within sub-populations of large datasets, it has the potential to reduce the processing power required for analysis. The new sequence analysis pipeline presented here will facilitate better interpretation of fungal sequence data for species identification.
format	Article
id	doaj-art-3c2af0dc03144395b74c0c8bf097adcf
institution	Kabale University
issn	2210-6359
language	English
publishDate	2023-09-01
publisher	BMC
record_format	Article
series	IMA Fungus
spelling	doaj-art-3c2af0dc03144395b74c0c8bf097adcf; 2025-02-03T08:46:55Z; eng; BMC; IMA Fungus; 2210-6359; 2023-09-01; 14; 1; 1; 18; 10.1186/s43008-023-00125-6; Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing; Nattapong Langsiri (0), Navaporn Worasilchai (1), Laszlo Irinyi (2), Piroon Jenjaroenpun (3), Thidathip Wongsurawat (4), Janet Jennifer Luangsa-ard (5), Wieland Meyer (6), Ariya Chindamporn (7); (0) Medical Microbiology, Interdisciplinary Program, Graduated School, Chulalongkorn University; (1) Department of Transfusion Medicine and Clinical Microbiology, Faculty of Allied Health Science, Chulalongkorn University; (2) Westmead Clinical School, Sydney Medical School, Faculty of Medicine and Health, The University of Sydney; (3) Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences; (4) Department of Biomedical Informatics, College of Medicine, University of Arkansas for Medical Sciences; (5) National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency (NSTDA); (6) Westmead Clinical School, Sydney Medical School, Faculty of Medicine and Health, The University of Sydney; (7) Medical Microbiology, Interdisciplinary Program, Graduated School, Chulalongkorn University; https://doi.org/10.1186/s43008-023-00125-6; Internal transcribed spacer (ITS); Targeted long-read sequencing; Nanopore technology; Fungal identification
title	Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing
topic	Internal transcribed spacer (ITS); Targeted long-read sequencing; Nanopore technology; Fungal identification
url	https://doi.org/10.1186/s43008-023-00125-6

Similar Items