Development and experimental validation of a machine learning model for the prediction of new antimalarials
Abstract A large set of antimalarial molecules (N ~ 15k) was employed from ChEMBL to build a robust random forest (RF) model for the prediction of antiplasmodial activity. Rather than depending on high throughput screening (HTS) data, molecules tested at multiple doses against blood stages of Plasmo...
Main Authors: | Mukul Kore, Dimple Acharya, Lakshya Sharma, Shruthi Sridhar Vembar, Sandeep Sundriyal |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2025-01-01 |
Series: | BMC Chemistry |
Subjects: | Malaria, Machine learning, Random forest, KNIME, Modelling, ChEMBL |
Online Access: | https://doi.org/10.1186/s13065-025-01395-4 |
_version_ | 1832572080660414464 |
---|---|
author | Mukul Kore; Dimple Acharya; Lakshya Sharma; Shruthi Sridhar Vembar; Sandeep Sundriyal |
author_sort | Mukul Kore |
collection | DOAJ |
description | Abstract A large set of antimalarial molecules (N ~ 15k) was employed from ChEMBL to build a robust random forest (RF) model for the prediction of antiplasmodial activity. Rather than depending on high throughput screening (HTS) data, molecules tested at multiple doses against blood stages of Plasmodium falciparum were used for model development. The open-access and code-free KNIME platform was used to develop a workflow to train the model on 80% of data (N ~ 12k). The hyperparameter values were optimized to achieve the highest predictive accuracy with nine different molecular fingerprints (MFPs), among which Avalon MFPs (referred to as RF-1) provided the best results. RF-1 displayed 91.7% accuracy, 93.5% precision, 88.4% sensitivity and 97.3% area under the Receiver operating characteristic (AUROC) for the remaining 20% test set. The predictive performance of RF-1 was comparable to that of the malaria inhibitor prediction platform (MAIP), a recently reported consensus model based on a large proprietary dataset. However, hits obtained from RF-1 and MAIP from a commercial library did not overlap, suggesting that these two models are complementary. Finally, RF-1 was used to screen small molecules under clinical investigations for repurposing. Six molecules were purchased, out of which two human kinase inhibitors were identified to have single-digit micromolar antiplasmodial activity. One of the hits (compound 1) was a potent inhibitor of β-hematin, suggesting the involvement of parasite hemozoin (Hz) synthesis in the parasiticidal effect. The training and test sets are provided as supplementary information, allowing others to reproduce this work. |
format | Article |
id | doaj-art-876892c3ade24127862f3273bfd1707d |
institution | Kabale University |
issn | 2661-801X |
language | English |
publishDate | 2025-01-01 |
publisher | BMC |
record_format | Article |
series | BMC Chemistry |
spelling | doaj-art-876892c3ade24127862f3273bfd1707d; 2025-02-02T12:06:51Z; eng; BMC; BMC Chemistry; 2661-801X; 2025-01-01; vol. 19, no. 1, pp. 1-19; 10.1186/s13065-025-01395-4; Development and experimental validation of a machine learning model for the prediction of new antimalarials; Mukul Kore (Department of Pharmacy, Birla Institute of Technology and Science Pilani); Dimple Acharya (Institute of Bioinformatics and Applied Biotechnology, Electronics City Phase I); Lakshya Sharma (Department of Pharmacy, Birla Institute of Technology and Science Pilani); Shruthi Sridhar Vembar (Institute of Bioinformatics and Applied Biotechnology, Electronics City Phase I); Sandeep Sundriyal (Department of Pharmacy, Birla Institute of Technology and Science Pilani); https://doi.org/10.1186/s13065-025-01395-4; Malaria; Machine learning; Random forest; KNIME; Modelling; ChEMBL |
title | Development and experimental validation of a machine learning model for the prediction of new antimalarials |
topic | Malaria; Machine learning; Random forest; KNIME; Modelling; ChEMBL |
url | https://doi.org/10.1186/s13065-025-01395-4 |
work_keys_str_mv | AT mukulkore developmentandexperimentalvalidationofamachinelearningmodelforthepredictionofnewantimalarials AT dimpleacharya developmentandexperimentalvalidationofamachinelearningmodelforthepredictionofnewantimalarials AT lakshyasharma developmentandexperimentalvalidationofamachinelearningmodelforthepredictionofnewantimalarials AT shruthisridharvembar developmentandexperimentalvalidationofamachinelearningmodelforthepredictionofnewantimalarials AT sandeepsundriyal developmentandexperimentalvalidationofamachinelearningmodelforthepredictionofnewantimalarials |
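
The description above reports accuracy, precision, sensitivity and AUROC for RF-1 on the held-out 20% test set. As a reading aid, the sketch below shows how those four quantities are conventionally computed for a binary active/inactive classifier. It is not the authors' KNIME workflow: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made for illustration, and none of the numbers come from the paper.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision and sensitivity for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # actives predicted active
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # inactives predicted inactive
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # inactives predicted active
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # actives predicted inactive
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random active outscores a random inactive.

    Ties between an active and an inactive count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one active and one inactive")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four actives, four inactives, with made-up model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
metrics["auroc"] = auroc(y_true, scores)
print(metrics)
```

Accuracy, precision and sensitivity depend on the chosen threshold, whereas AUROC is computed from the ranking of scores alone, which is why it is reported separately from the other three.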