Performance Analysis of Diabetes Detection Using Machine Learning Classifiers
Diabetes is a chronic medical condition that has long posed severe public health challenges in Canada and worldwide, affecting millions of people and putting pressure on healthcare resources. That said, conventional diagnostic procedures sometimes d...
Main Authors: | Hung Huynh, Liu Hui, Ngoc Han Nguyen, Ruixuan Qiao |
---|---|
Format: | Article |
Language: | English |
Published: | IJMADA, 2024-10-01 |
Series: | International Journal of Management and Data Analytics |
Subjects: | diabetes prediction and diagnosis, machine learning, classifiers, algorithms |
Online Access: | https://ijmada.com/index.php/ijmada/article/view/50 |
_version_ | 1832593415170162688 |
---|---|
author | Hung Huynh; Liu Hui; Ngoc Han Nguyen; Ruixuan Qiao |
author_facet | Hung Huynh; Liu Hui; Ngoc Han Nguyen; Ruixuan Qiao |
author_sort | Hung Huynh |
collection | DOAJ |
description | Diabetes is a chronic medical condition that has long posed severe public health challenges in Canada and worldwide, affecting millions of people and putting pressure on healthcare resources. Conventional diagnostic procedures sometimes depend on few data points and are prone to mistakes, resulting in premature action. Additionally, the slow adoption of modern machine learning (ML) technologies in the healthcare industry might be due to a poor understanding of the systems’ decision-making procedures. This study seeks to fill that gap by applying various ML algorithms to the PIMA Indians Diabetes Dataset, provided by the National Institute of Diabetes and Digestive and Kidney Diseases, with the aim of improving the validity of diabetes prediction and diagnosis. Three types of ML classifiers are used: tree-based, function-based, and rule-based. The results show that Stochastic Gradient Descent (SGD; function), Logistic Regression (function), JRip (rules), and Random Forest (trees) are among the top-performing classifiers. They are judged on several metrics, such as accuracy, precision, recall, specificity, F1 score, Matthews correlation coefficient (MCC), and ROC area. Although SGD performs well on almost all of the metrics, its low recall shows that it is not the optimal algorithm. Given that recall is prioritized in clinical diagnostics, Random Forest emerges as a strong candidate due to its balanced performance across key metrics. |
format | Article |
id | doaj-art-67316a78c4214f6e9a12d3047c766d12 |
institution | Kabale University |
issn | 2816-9395 |
language | English |
publishDate | 2024-10-01 |
publisher | IJMADA |
record_format | Article |
series | International Journal of Management and Data Analytics |
spelling | doaj-art-67316a78c4214f6e9a12d3047c766d12 2025-01-20T15:45:31Z eng IJMADA International Journal of Management and Data Analytics 2816-9395 2024-10-01 41435450 Performance Analysis of Diabetes Detection Using Machine Learning Classifiers Hung Huynh; Liu Hui; Ngoc Han Nguyen; Ruixuan Qiao (all University Canada West) Diabetes is a chronic medical condition that has long posed severe public health challenges in Canada and worldwide, affecting millions of people and putting pressure on healthcare resources. Conventional diagnostic procedures sometimes depend on few data points and are prone to mistakes, resulting in premature action. Additionally, the slow adoption of modern machine learning (ML) technologies in the healthcare industry might be due to a poor understanding of the systems’ decision-making procedures. This study seeks to fill that gap by applying various ML algorithms to the PIMA Indians Diabetes Dataset, provided by the National Institute of Diabetes and Digestive and Kidney Diseases, with the aim of improving the validity of diabetes prediction and diagnosis. Three types of ML classifiers are used: tree-based, function-based, and rule-based. The results show that Stochastic Gradient Descent (SGD; function), Logistic Regression (function), JRip (rules), and Random Forest (trees) are among the top-performing classifiers. They are judged on several metrics, such as accuracy, precision, recall, specificity, F1 score, Matthews correlation coefficient (MCC), and ROC area. Although SGD performs well on almost all of the metrics, its low recall shows that it is not the optimal algorithm. Given that recall is prioritized in clinical diagnostics, Random Forest emerges as a strong candidate due to its balanced performance across key metrics. https://ijmada.com/index.php/ijmada/article/view/50 diabetes prediction and diagnosis, machine learning, classifiers, algorithms. |
spellingShingle | Hung Huynh; Liu Hui; Ngoc Han Nguyen; Ruixuan Qiao; Performance Analysis of Diabetes Detection Using Machine Learning Classifiers; International Journal of Management and Data Analytics; diabetes prediction and diagnosis, machine learning, classifiers, algorithms. |
title | Performance Analysis of Diabetes Detection Using Machine Learning Classifiers |
title_full | Performance Analysis of Diabetes Detection Using Machine Learning Classifiers |
title_fullStr | Performance Analysis of Diabetes Detection Using Machine Learning Classifiers |
title_full_unstemmed | Performance Analysis of Diabetes Detection Using Machine Learning Classifiers |
title_short | Performance Analysis of Diabetes Detection Using Machine Learning Classifiers |
title_sort | performance analysis of diabetes detection using machine learning classifiers |
topic | diabetes prediction and diagnosis, machine learning, classifiers, algorithms. |
url | https://ijmada.com/index.php/ijmada/article/view/50 |
work_keys_str_mv | AT hunghuynh performanceanalysisofdiabetesdetectionusingmachinelearningclassifiers AT liuhui performanceanalysisofdiabetesdetectionusingmachinelearningclassifiers AT ngochannguyen performanceanalysisofdiabetesdetectionusingmachinelearningclassifiers AT ruixuanqiao performanceanalysisofdiabetesdetectionusingmachinelearningclassifiers |
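
The description field lists the metrics used to compare the classifiers. As a rough illustration of how most of them relate to a binary confusion matrix, the sketch below computes accuracy, precision, recall, specificity, F1 score, and MCC from four counts. The function name `binary_metrics` and the counts are hypothetical and chosen only for illustration; they are not results from the article. ROC area is omitted because it needs per-sample scores rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives, and
    false negatives. Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0          # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Matthews correlation coefficient
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Illustrative counts only (hypothetical, not taken from the article).
m = binary_metrics(tp=40, fp=10, tn=90, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts, precision (0.80) is noticeably higher than recall (about 0.67), which is the kind of gap the description points to when it notes that a classifier can score well on most metrics yet miss too many positive cases.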