Using a robust model to detect the association between anthropometric factors and T2DM: machine learning approaches

Abstract Background The aim of this study was to evaluate the potential models to determine the most important anthropometric factors associated with type 2 diabetes mellitus (T2DM). Method A dataset derived from the Mashhad Stroke and heart atherosclerotic disorders (MASHAD) study comprising 9354 s...

Bibliographic Details
Main Authors: Nafiseh Hosseini, Hamid Tanzadehpanah, Amin Mansoori, Mostafa Sabzekar, Gordon A. Ferns, Habibollah Esmaily, Majid Ghayour-Mobarhan
Format: Article
Language: English
Published: BMC 2025-01-01
Series: BMC Medical Informatics and Decision Making
Online Access: https://doi.org/10.1186/s12911-025-02887-y
author Nafiseh Hosseini
Hamid Tanzadehpanah
Amin Mansoori
Mostafa Sabzekar
Gordon A. Ferns
Habibollah Esmaily
Majid Ghayour-Mobarhan
collection DOAJ
description Abstract Background This study evaluated candidate models for identifying the anthropometric factors most strongly associated with type 2 diabetes mellitus (T2DM). Method The dataset was derived from the Mashhad Stroke and heart atherosclerotic disorders (MASHAD) study and comprised 9354 subjects aged 35 to 65 years, of whom 25% (2336 people) were diabetic and 75% (7018 people) were non-diabetic. Ten anthropometric factors and age, all measured in every participant, were analyzed. A K-nearest neighbor (KNN) model was used to assess the association between T2DM and the selected factors. The model was evaluated using accuracy, sensitivity, specificity, precision and F1-measure. The receiver operating characteristic (ROC) curve and factor importance were also determined. The performance of the KNN model was compared with that of artificial neural network (ANN) and support vector machine (SVM) models. Result After feature selection and an assessment of multicollinearity, six factors were used in the final model: Mid-arm Circumference (MAC), Waist Circumference (WC), Body Roundness Index (BRI), Body Adiposity Index (BAI), Body Mass Index (BMI) and age. BRI, BAI and MAC in males, and BMI, BRI and MAC in females, had the greatest association with T2DM. The accuracy of the KNN model was approximately 93% for both genders. The best K (number of neighbors) was 4, which gave the lowest error rate. The area under the ROC curve (AUC) was 0.985 for men and 0.986 for women. The KNN model achieved the best result of the models explored. Conclusion The KNN model identified the association between anthropometric factors and T2DM with high accuracy (93%). The choice of the K parameter (number of nearest neighbors) has an essential impact on the error rate. Feature selection reduced the number of input dimensions of the KNN model and increased the accuracy of the final results.
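The evaluation steps named in the abstract (choosing K by lowest error rate, then reporting accuracy, sensitivity, specificity, precision and F1-measure) can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic, and the names knn_predict, confusion_metrics, make_toy_data and error_rate_by_k are hypothetical helpers introduced only for this example. It assumes binary labels (1 = diabetic, 0 = non-diabetic) and Euclidean distance, neither of which is stated in the record.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points nearest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tied vote (possible for even k), the label encountered first,
    # i.e. the label of the closest neighbour, wins.
    return votes.most_common(1)[0][0]


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, precision and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def make_toy_data(n, seed):
    """Synthetic stand-in data: 3 made-up features, roughly 25% positive labels."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.25 else 0
        centre = 1.5 if label == 1 else 0.0
        X.append([rng.gauss(centre, 1.0) for _ in range(3)])
        y.append(label)
    return X, y


def error_rate_by_k(train_X, train_y, eval_X, eval_y, ks):
    """Map each candidate k to its error rate (1 - accuracy) on the evaluation set."""
    errors = {}
    for k in ks:
        preds = [knn_predict(train_X, train_y, x, k) for x in eval_X]
        errors[k] = 1.0 - confusion_metrics(eval_y, preds)["accuracy"]
    return errors


if __name__ == "__main__":
    train_X, train_y = make_toy_data(300, seed=0)
    eval_X, eval_y = make_toy_data(100, seed=1)
    errors = error_rate_by_k(train_X, train_y, eval_X, eval_y, ks=range(1, 11))
    best_k = min(errors, key=errors.get)  # k with the lowest error rate
    preds = [knn_predict(train_X, train_y, x, best_k) for x in eval_X]
    print("best k:", best_k)
    print(confusion_metrics(eval_y, preds))
```

In practice K would normally be chosen on a validation split or by cross-validation rather than on the final test set, and features on different scales would be standardized before computing distances.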
format Article
id doaj-art-8184144786cc47fb844d26ba9807d5a2
institution Kabale University
issn 1472-6947
language English
publishDate 2025-01-01
publisher BMC
record_format Article
series BMC Medical Informatics and Decision Making
spelling doaj-art-8184144786cc47fb844d26ba9807d5a2 2025-02-02T12:27:45Z eng BMC BMC Medical Informatics and Decision Making 1472-6947 2025-01-01 251110 10.1186/s12911-025-02887-y
Using a robust model to detect the association between anthropometric factors and T2DM: machine learning approaches
Nafiseh Hosseini (0), Hamid Tanzadehpanah (1), Amin Mansoori (2), Mostafa Sabzekar (3), Gordon A. Ferns (4), Habibollah Esmaily (5), Majid Ghayour-Mobarhan (6)
(0) International UNESCO Center for Health-Related Basic Sciences and Human Nutrition, Mashhad University of Medical Sciences
(1) Antimicrobial Resistance Research Center, Mashhad University of Medical Sciences
(2) Department of Applied Mathematics, School of Mathematical Sciences, Ferdowsi University of Mashhad
(3) Department of Computer Engineering, Birjand University of Technology
(4) Brighton and Sussex Medical School, Division of Medical Education
(5) Department of Biostatistics, School of Health, Mashhad University of Medical Sciences
(6) International UNESCO Center for Health-Related Basic Sciences and Human Nutrition, Mashhad University of Medical Sciences
https://doi.org/10.1186/s12911-025-02887-y
title Using a robust model to detect the association between anthropometric factors and T2DM: machine learning approaches
url https://doi.org/10.1186/s12911-025-02887-y