Extreme Gradient Boosting Beats In-Silico Identification of Proteins Potentially Associated With Alzheimer’s

Alzheimer’s disease (AD) is a chronic, progressive brain disease that slowly destroys memory and thinking skills and, eventually, the ability to perform routine tasks. The disease is caused by the abnormal clumping of proteins such as amyloids around brain cells. The identification of pro...

Bibliographic Details
Main Authors: Sadia Khalil, Wajid Arshad Abbasi, Syed Ali Abbas, Maryum Bibi, Saiqa Andleeb, Amsa Shabir
Format: Article
Language: English
Published: Wiley 2024-01-01
Series: Applied Computational Intelligence and Soft Computing
Online Access: http://dx.doi.org/10.1155/2024/7914178
collection DOAJ
description Alzheimer’s disease (AD) is a chronic, progressive brain disease that slowly destroys memory and thinking skills and, eventually, the ability to perform routine tasks. The disease is caused by the abnormal clumping of proteins such as amyloids around brain cells. Identifying the proteins involved in Alzheimer’s is essential for understanding the disease and for discovering and designing drugs. In-vitro and in-vivo experiments for this purpose are time-consuming, laborious, and costly. These experiments can be carried out more efficiently by targeting the proteins most likely to be involved in Alzheimer’s, as predicted and ranked by a computational method with good generalization accuracy. In this study, we propose a machine learning (ML)-based predictive model to identify proteins potentially involved in Alzheimer’s. Through a series of simulation studies, we show that the proposed model, using protein sequence information only, gives state-of-the-art generalization performance, with an area under the precision-recall curve of 0.93, verified through various ML-centric and biologically relevant techniques and metrics. We also perform feature analysis to identify the role of individual amino acids in such proteins. Python code for feature extraction, training, and evaluation of the proposed models, together with the dataset, is available at https://sourceforge.net/projects/alzheimer-associated-proteins/files/.
id doaj-art-fd5d4f4e7bf1422c8ab30fdd41b9372d
institution DOAJ
issn 1687-9732
affiliations Sadia Khalil, Wajid Arshad Abbasi, Syed Ali Abbas, Maryum Bibi (Computational Biology and Data Analysis Laboratory); Saiqa Andleeb (Biotechnology Laboratory); Amsa Shabir (Department of Software Engineering)