Predicting Diabetes Risk Using Machine Learning: A Comparative Study on the Yazd Health Study (YaHS)

Bibliographic Details
Main Authors: Fateme Sefid, Nazanin Norouzi-Ghahjavarestani, Malihe Soleymani-Tabasi, Jamal Zarepour-Ahmadabadi, Ghasem Azamirad, Mohamah Yahya Vahidi Mehrjardi, Masoud Mirzaei, Seyed Mehdi Kalantar
Format: Article
Language: English
Published: Shahid Sadoughi University of Medical Sciences 2025-07-01
Series: Iranian Journal of Diabetes and Obesity
Online Access: http://ijdo.ssu.ac.ir/article-1-967-en.pdf
Description
Summary: Diabetes is a chronic disease with a substantial global health burden, which makes accurate early risk prediction important for prevention and management. This study evaluates five machine learning algorithms, Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Naïve Bayes (NB), and Decision Tree (DT), for diabetes risk prediction using a dataset from the Yazd Health Study (YaHS). Extensive preprocessing, including data cleaning, class imbalance handling with the combined Synthetic Minority Oversampling Technique and Edited Nearest Neighbors method (SMOTEENN), and feature selection, was applied to improve model performance. Among the evaluated algorithms, the Random Forest classifier performed best, with an accuracy of 97%. The findings highlight the importance of effective data preprocessing and algorithm selection in developing reliable predictive models from healthcare datasets.
ISSN: 2008-6792, 2345-2250