Sequence-structure based prediction of pathogenicity for amino acid substitutions in proteins associated with primary immunodeficiencies
Introduction: Primary immunodeficiencies (PIDs) are a group of rare genetic disorders characterized by dysfunction of immune system components. Early diagnosis and treatment are essential to prevent severe or life-threatening complications. PIDs manifest with diverse clinical symptoms, posing...
Main Authors: | Ekaterina S. Porfireva, Anton D. Zadorozhny, Anastasia V. Rudik, Dmitry A. Filimonov, Alexey A. Lagunin |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2025-02-01 |
Series: | Frontiers in Immunology |
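Subjects: | primary immunodeficiencies; amino acid substitutions; pathogenicity prediction; sequence-structure-property relationships; human genetic variation; SAV-Pred |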
Online Access: | https://www.frontiersin.org/articles/10.3389/fimmu.2025.1492751/full |
_version_ | 1832539874452832256 |
---|---|
author | Ekaterina S. Porfireva; Anton D. Zadorozhny; Anastasia V. Rudik; Dmitry A. Filimonov; Alexey A. Lagunin |
author_sort | Ekaterina S. Porfireva |
collection | DOAJ |
description | Introduction: Primary immunodeficiencies (PIDs) are a group of rare genetic disorders characterized by dysfunction of immune system components. Early diagnosis and treatment are essential to prevent severe or life-threatening complications. PIDs manifest with diverse clinical symptoms, which makes accurate diagnosis challenging. A key aspect of PID diagnosis is identifying specific amino acid substitutions in proteins associated with heritable diseases. In this study, we developed sequence-structure-property relationship (SSPR) classification models for predicting the pathogenicity of amino acid substitutions (AAS) in 25 proteins associated with the most important and genetically studied PIDs and encoded by the genes IL2RG, JAK3, RAG1, RAG2, ADA, DCLRE1C, CD40LG, WAS, ATM, STAT3, KMT2D, BTK, FOXP3, AIRE, FAS, ELANE, ITGB2, CYBB, G6PD, GATA2, STAT1, IFIH1, NLRP3, MEFV, and SERPING1. Methods: Data on 4825 pathogenic and benign AASs in the selected proteins were extracted from ClinVar and gnomAD. SSPR models were created for each protein using the MultiPASS software, which is based on a Bayesian algorithm and uses different levels of MNA (Multilevel Neighborhoods of Atoms) descriptors to represent the structural formulas of protein fragments that include an AAS. Results: Prediction accuracy was assessed by 5-fold cross-validation and compared with other bioinformatics tools, such as SIFT4G, Polyphen2 HDIV, FATHMM, MetaSVM, PROVEAN, ClinPred, and Alpha Missense. The best SSPR models demonstrated high accuracy, with an average ROC AUC of 0.831 ± 0.037, balanced accuracy of 0.763 ± 0.034, MCC of 0.457 ± 0.06, and F-measure of 0.623 ± 0.07 across all genes, outperforming the most popular bioinformatics tools. Conclusions: The best SSPR models for predicting the pathogenicity of amino acid substitutions related to PIDs have been implemented in a freely available web application, SAV-Pred (Single Amino acid Variants Predictor, http://www.way2drug.com/SAV-Pred/), which may be a useful tool for medical geneticists and clinicians. Examples of the use of SAV-Pred for some clinical cases of PIDs are provided. |
format | Article |
id | doaj-art-1a20a08f7d8e49e5bd185963ad7b7455 |
institution | Kabale University |
issn | 1664-3224 |
language | English |
publishDate | 2025-02-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Immunology |
spelling | doaj-art-1a20a08f7d8e49e5bd185963ad7b7455; 2025-02-05T07:32:48Z; eng; Frontiers Media S.A.; Frontiers in Immunology; 1664-3224; 2025-02-01; 16; 10.3389/fimmu.2025.1492751; 1492751; Sequence-structure based prediction of pathogenicity for amino acid substitutions in proteins associated with primary immunodeficiencies; Ekaterina S. Porfireva (Department of Bioinformatics, Pirogov Russian National Research Medical University, Moscow, Russia); Anton D. Zadorozhny (Department of Bioinformatics, Pirogov Russian National Research Medical University, Moscow, Russia); Anastasia V. Rudik (Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, Moscow, Russia); Dmitry A. Filimonov (Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, Moscow, Russia); Alexey A. Lagunin (Department of Bioinformatics, Pirogov Russian National Research Medical University, Moscow, Russia, and Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, Moscow, Russia); https://www.frontiersin.org/articles/10.3389/fimmu.2025.1492751/full; primary immunodeficiencies; amino acid substitutions; pathogenicity prediction; sequence-structure-property relationships; human genetic variation; SAV-Pred |
title | Sequence-structure based prediction of pathogenicity for amino acid substitutions in proteins associated with primary immunodeficiencies |
topic | primary immunodeficiencies; amino acid substitutions; pathogenicity prediction; sequence-structure-property relationships; human genetic variation; SAV-Pred |
url | https://www.frontiersin.org/articles/10.3389/fimmu.2025.1492751/full |
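
The abstract in this record reports ROC AUC, balanced accuracy, MCC, and F-measure for the pathogenicity models. As a reading aid, the sketch below shows the standard textbook definitions of those four metrics for a binary pathogenic/benign classifier. It is not the authors' evaluation code and does not use MultiPASS or SAV-Pred; the function names and the example counts and scores are hypothetical and chosen only for illustration.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Balanced accuracy, F-measure and MCC from confusion-matrix counts.

    Assumes "pathogenic" is the positive class and that every denominator
    is non-zero (each class is present and predicted at least once).
    """
    sensitivity = tp / (tp + fn)  # fraction of pathogenic variants recovered
    specificity = tn / (tn + fp)  # fraction of benign variants recovered
    precision = tp / (tp + fp)    # fraction of pathogenic calls that are right
    balanced_accuracy = (sensitivity + specificity) / 2
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "balanced_accuracy": balanced_accuracy,
        "f_measure": f_measure,
        "mcc": mcc,
    }


def roc_auc(pathogenic_scores, benign_scores):
    """ROC AUC as the probability that a randomly chosen pathogenic variant
    scores higher than a randomly chosen benign one (ties count as one half).
    """
    wins = sum(
        (p > b) + 0.5 * (p == b)
        for p in pathogenic_scores
        for b in benign_scores
    )
    return wins / (len(pathogenic_scores) * len(benign_scores))


if __name__ == "__main__":
    # Hypothetical counts and scores, not taken from the paper.
    print(classification_metrics(tp=40, fp=10, tn=40, fn=10))
    print(roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```

In a 5-fold cross-validation such as the one the abstract describes, metrics like these would typically be computed on each held-out fold and then averaged; how the authors aggregated across folds and genes is not stated in this record.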