Combination of machine learning and Raman spectroscopy for prediction of drug release in targeted drug delivery formulations

Bibliographic Details
Main Authors: Wael A. Mahdi, Adel Alhowyan, Ahmad J. Obaidullah
Format: Article
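Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Subjects: Drug release; Kernel ridge regression; Kernel-based extreme learning machine; Quantile regression; Colonic drug delivery
Online Access: https://doi.org/10.1038/s41598-025-10417-z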
_version_ 1849235411242582016
author Wael A. Mahdi
Adel Alhowyan
Ahmad J. Obaidullah
collection DOAJ
description Abstract: This study investigates advanced regression techniques for modeling complex drug release patterns from a high-dimensional dataset comprising more than 1500 spectrum-based variables and categorical inputs. The spectral data were collected by Raman spectroscopy to analyze drug release from a solid dosage formulation coated with polysaccharides (155 samples, with drug release measured at 2, 8, and 24 h). The drug considered is 5-aminosalicylic acid (5-ASA) for colonic drug delivery, and its release was estimated using the Raman data together with the categorical parameters as inputs. The models, Kernel Ridge Regression (KRR), Kernel-based Extreme Learning Machine (K-ELM), and Quantile Regression (QR), use the Sailfish Optimizer (SFO) for hyperparameter optimization and K-fold cross-validation to enhance predictive accuracy. KRR performed best, achieving an R² of 0.997 on the training set and 0.992 on the test set, with a mean squared error (MSE) of 0.0004. In comparison, K-ELM and QR achieved lower test-set R² values of 0.923 and 0.817, respectively. The key innovation lies in integrating these non-linear regression models with robust data preprocessing steps: dimensionality reduction via Principal Component Analysis (PCA), categorical feature encoding through Leave-One-Out (LOO) encoding, and outlier detection using Isolation Forest. The study offers a comprehensive framework for managing high-dimensional, heterogeneous datasets and emphasizes the effectiveness of optimization strategies in predictive modeling. By accurately predicting the release of 5-ASA from polysaccharide-coated formulations, these models can aid in the design of targeted colonic delivery formulations with optimized release kinetics, ultimately enhancing the efficacy of treatments for colonic diseases.
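The preprocessing and modeling steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic stand-ins with the same shape as the described dataset, the single categorical input is hypothetical, and a plain grid search is swapped in for the Sailfish Optimizer, which scikit-learn does not provide. The helper names `loo_encode_train` and `loo_encode_test` are illustrative, and the resulting scores say nothing about the study's reported results.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's data): 155 samples,
# 1500 spectral variables, and one hypothetical categorical input.
n_samples, n_spectral = 155, 1500
latent = rng.normal(size=(n_samples, 5))
spectra = latent @ rng.normal(size=(5, n_spectral))
spectra += 0.1 * rng.normal(size=spectra.shape)
category = rng.integers(0, 3, size=n_samples)
release = np.tanh(latent[:, 0]) + 0.3 * category
release += 0.05 * rng.normal(size=n_samples)


def loo_encode_train(cat, y):
    """Leave-one-out target encoding: each row receives the mean target
    of the other rows that share its category."""
    enc = np.empty(len(y), dtype=float)
    for c in np.unique(cat):
        mask = cat == c
        total, count = y[mask].sum(), mask.sum()
        enc[mask] = (total - y[mask]) / (count - 1) if count > 1 else y.mean()
    return enc


def loo_encode_test(cat_test, cat_train, y_train):
    """Unseen rows receive the per-category training mean
    (global training mean for categories absent from training)."""
    means = {c: y_train[cat_train == c].mean() for c in np.unique(cat_train)}
    return np.array([means.get(c, y_train.mean()) for c in cat_test])


idx_train, idx_test = train_test_split(
    np.arange(n_samples), test_size=0.2, random_state=0
)

# Isolation Forest labels outliers as -1; drop them from the training set only.
inlier = IsolationForest(random_state=0).fit_predict(spectra[idx_train]) == 1
idx_train = idx_train[inlier]

# PCA compresses the spectral block; the encoded category is appended as one column.
# Simplification: PCA and the encoding are fitted once, outside the CV folds.
pca = PCA(n_components=10, random_state=0).fit(spectra[idx_train])
X_train = np.column_stack([
    pca.transform(spectra[idx_train]),
    loo_encode_train(category[idx_train], release[idx_train]),
])
X_test = np.column_stack([
    pca.transform(spectra[idx_test]),
    loo_encode_test(category[idx_test], category[idx_train], release[idx_train]),
])

# Kernel ridge regression tuned by 5-fold cross-validation.
# Grid search here is a stand-in for the Sailfish Optimizer.
model = Pipeline([("scale", StandardScaler()), ("krr", KernelRidge(kernel="rbf"))])
grid = {"krr__alpha": [1e-3, 1e-2, 1e-1, 1.0], "krr__gamma": [1e-3, 1e-2, 1e-1]}
cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(model, grid, cv=cv, scoring="r2")
search.fit(X_train, release[idx_train])

r2_train = r2_score(release[idx_train], search.predict(X_train))
r2_test = r2_score(release[idx_test], search.predict(X_test))
print(search.best_params_, round(r2_train, 3), round(r2_test, 3))
```

Fitting the outlier detector, PCA, and the encoder on the training split alone keeps test-set information out of the model. A stricter variant would refit PCA and the encoding inside each cross-validation fold.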
format Article
id doaj-art-da7e700fed2645bc992daade2995f7d0
institution Kabale University
issn 2045-2322
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-da7e700fed2645bc992daade2995f7d0
2025-08-20T04:02:46Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-07-01
vol. 15, no. 1, pp. 1-14
10.1038/s41598-025-10417-z
Combination of machine learning and Raman spectroscopy for prediction of drug release in targeted drug delivery formulations
Wael A. Mahdi (Department of Pharmaceutics, College of Pharmacy, King Saud University)
Adel Alhowyan (Department of Pharmaceutics, College of Pharmacy, King Saud University)
Ahmad J. Obaidullah (Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University)
https://doi.org/10.1038/s41598-025-10417-z
Drug release
Kernel ridge regression
Kernel-based extreme learning machine
Quantile regression
Colonic drug delivery
title Combination of machine learning and Raman spectroscopy for prediction of drug release in targeted drug delivery formulations
topic Drug release
Kernel ridge regression
Kernel-based extreme learning machine
Quantile regression
Colonic drug delivery
url https://doi.org/10.1038/s41598-025-10417-z