Accurate prediction of protein function using statistics-informed graph networks

Abstract Understanding protein function is pivotal in comprehending the intricate mechanisms that underlie many crucial biological activities, with far-reaching implications in the fields of medicine, biotechnology, and drug development. However, more than 200 million proteins remain uncharacterized...

Bibliographic Details
Main Authors: Yaan J. Jang, Qi-Qi Qin, Si-Yu Huang, Arun T. John Peter, Xue-Ming Ding, Benoît Kornmann
Format: Article
Language: English
Published: Nature Portfolio 2024-08-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-024-50955-0
collection DOAJ
description Abstract: Understanding protein function is pivotal to comprehending the mechanisms that underlie many crucial biological activities, with far-reaching implications for medicine, biotechnology, and drug development. However, more than 200 million proteins remain uncharacterized, and computational efforts rely heavily on protein structural information to predict annotations of varying quality. Here, we present PhiGnet, a method that uses statistics-informed graph networks to predict protein function solely from sequence. The method inherently characterizes evolutionary signatures, allowing a quantitative assessment of the significance of residues that carry out specific functions. PhiGnet not only demonstrates superior performance compared to alternative approaches but also narrows the sequence-function gap, even in the absence of structural information. Our findings indicate that applying deep learning to evolutionary data can highlight functional sites at the residue level, providing valuable support for interpreting both existing properties and new functionalities of proteins in research and biomedicine.
id doaj-art-bcf4a929b2d748e5abbab148c6535708
institution Kabale University
issn 2041-1723
spelling doaj-art-bcf4a929b2d748e5abbab148c6535708; 2025-08-20T03:45:35Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2024-08-01; volume 15, issue 1, pages 1-12; 10.1038/s41467-024-50955-0
Accurate prediction of protein function using statistics-informed graph networks
Yaan J. Jang (Department of Biochemistry, University of Oxford)
Qi-Qi Qin (AmoAi Technologies)
Si-Yu Huang (AmoAi Technologies)
Arun T. John Peter (Institute of Biochemistry, ETH Zürich)
Xue-Ming Ding (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology)
Benoît Kornmann (Department of Biochemistry, University of Oxford)
https://doi.org/10.1038/s41467-024-50955-0