The protein structurome of Orthornavirae and its dark matter

ABSTRACT Metatranscriptomics is uncovering more and more diverse families of viruses with RNA genomes comprising the viral kingdom Orthornavirae in the realm Riboviria. Thorough protein annotation and comparison are essential to get insights into the functions of viral proteins and virus evolution....

Bibliographic Details
Main Authors: Pascal Mutz, Antonio Pedro Camargo, Harutyun Sahakyan, Uri Neri, Anamarija Butkovic, Yuri I. Wolf, Mart Krupovic, Valerian V. Dolja, Eugene V. Koonin
Format: Article
Language: English
Published: American Society for Microbiology 2025-02-01
Series: mBio
Subjects:
Online Access: https://journals.asm.org/doi/10.1128/mbio.03200-24
collection DOAJ
description ABSTRACT Metatranscriptomics is uncovering more and more diverse families of viruses with RNA genomes comprising the viral kingdom Orthornavirae in the realm Riboviria. Thorough protein annotation and comparison are essential to get insights into the functions of viral proteins and virus evolution. In addition to sequence- and HMM profile-based methods, protein structure comparison adds a powerful tool to uncover protein functions and relationships. We constructed an Orthornavirae “structurome” consisting of already annotated as well as unannotated (“dark matter”) proteins and domains encoded in viral genomes. We used protein structure modeling and similarity searches to illuminate the remaining dark matter in hundreds of thousands of orthornavirus genomes. The vast majority of the dark matter domains showed either “generic” folds, such as single α-helices, or no high confidence structure predictions. Nevertheless, a variety of lineage-specific globular domains that were new either to orthornaviruses in general or to particular virus families were identified within the proteomic dark matter of orthornaviruses, including several predicted nucleic acid-binding domains and nucleases. In addition, we identified a case of exaptation of a cellular nucleoside monophosphate kinase as an RNA-binding protein in several virus families. Notwithstanding the continuing discovery of numerous orthornaviruses, it appears that all the protein domains conserved in large groups of viruses have already been identified. The rest of the viral proteome seems to be dominated by poorly structured domains including intrinsically disordered ones that likely mediate specific virus-host interactions.
IMPORTANCE Advanced methods for protein structure prediction, such as AlphaFold2, greatly expand our capability to identify protein domains and infer their likely functions and evolutionary relationships. This is particularly pertinent for proteins encoded by viruses that are known to evolve rapidly and as a result often cannot be adequately characterized by analysis of the protein sequences. We performed an exhaustive structure prediction and comparative analysis for uncharacterized proteins and domains (“dark matter”) encoded by viruses with RNA genomes. The results show the dark matter of RNA virus proteome consists mostly of disordered and all-α-helical domains that cannot be readily assigned a specific function and that likely mediate various interactions between viral proteins and between viral and host proteins. The great majority of globular proteins and domains of RNA viruses are already known although we identified several unexpected domains represented in individual viral families.
id doaj-art-c04636713c8e46e2ba7f2280140f0807
institution Kabale University
issn 2150-7511
spelling doaj-art-c04636713c8e46e2ba7f2280140f0807 2025-02-05T14:00:48Z eng American Society for Microbiology mBio 2150-7511 2025-02-01 volume 16, issue 2 10.1128/mbio.03200-24
affiliations Pascal Mutz: Division of Intramural Research, Computational Biology Branch, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
Antonio Pedro Camargo: Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
Harutyun Sahakyan: Division of Intramural Research, Computational Biology Branch, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
Uri Neri: Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
Anamarija Butkovic: Institut Pasteur, Université Paris Cité, CNRS UMR6047, Archaeal Virology Unit, Paris, France
Yuri I. Wolf: Division of Intramural Research, Computational Biology Branch, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
Mart Krupovic: Institut Pasteur, Université Paris Cité, CNRS UMR6047, Archaeal Virology Unit, Paris, France
Valerian V. Dolja: Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon, USA
Eugene V. Koonin: Division of Intramural Research, Computational Biology Branch, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
topic RNA virus
Orthornaviria
proteome
protein structure prediction
novel protein domains
url https://journals.asm.org/doi/10.1128/mbio.03200-24