Machine Learning Methods for Predicting Cardiovascular Diseases: A Comparative Analysis
The study aims to accurately predict the presence of heart disease using machine learning models. The research evaluates and compares the performance of five algorithms - Logistic Regression, Support Vector Machine (SVM), Decision Tree, Random Forest, and Gradient Boosting - on a dataset containing...
| Main Authors: | Aiym B. Temirbayeva, Arshyn Altybay |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Peoples’ Friendship University of Russia (RUDN University), 2025-07-01 |
| Series: | RUDN Journal of Engineering Research |
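| Subjects: | random forest, support vector machine, gradient boosting, decision tree, logistic regression, accuracy |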
| Online Access: | https://journals.rudn.ru/engineering-researches/article/viewFile/45012/25005 |
| _version_ | 1849716413631037440 |
|---|---|
| author | Aiym B. Temirbayeva Arshyn Altybay |
| collection | DOAJ |
| description | The study aims to accurately predict the presence of heart disease using machine learning models. It evaluates and compares the performance of five algorithms (Logistic Regression, Support Vector Machine (SVM), Decision Tree, Random Forest, and Gradient Boosting) on a dataset of patients' clinical features. The primary research question is which algorithm demonstrates the best predictive performance for heart disease diagnosis. The dataset comprises 270 patients with 13 clinical features. The data were preprocessed, and the target values were converted into binary labels for classification. The dataset was split into training and test sets in a 70-30 ratio. The five models were trained and evaluated using accuracy, precision, recall, F1-score, and ROC-AUC, and confusion matrices were analyzed for additional insight into model performance. Logistic Regression and Random Forest showed the best results among all models, with accuracies of 86.4% and 80.2%, respectively. Logistic Regression achieved a ROC-AUC score of 0.844, and Random Forest a score of 0.88. The confusion matrices revealed the strengths and weaknesses of each model's predictions. Logistic Regression and Random Forest were identified as the most reliable models for predicting heart disease on this dataset. Future work will explore hyperparameter tuning and ensemble methods to further improve model performance and support early diagnosis and treatment of cardiovascular diseases. |
| format | Article |
| id | doaj-art-0dbbbcc107a9444faa9b2e01ac2b2181 |
| institution | DOAJ |
| issn | 2312-8143 2312-8151 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Peoples’ Friendship University of Russia (RUDN University) |
| record_format | Article |
| series | RUDN Journal of Engineering Research |
| spelling | doaj-art-0dbbbcc107a9444faa9b2e01ac2b2181; 2025-08-20T03:13:00Z; eng; Peoples’ Friendship University of Russia (RUDN University); RUDN Journal of Engineering Research; ISSN 2312-8143, 2312-8151; 2025-07-01; vol. 26, no. 2, pp. 168-180; DOI 10.22363/2312-8143-2025-26-2-168-180; 21159; Machine Learning Methods for Predicting Cardiovascular Diseases: A Comparative Analysis; Aiym B. Temirbayeva (https://orcid.org/0009-0003-6131-2884), Astana IT University; Arshyn Altybay (https://orcid.org/0000-0003-4939-8876), Astana IT University; https://journals.rudn.ru/engineering-researches/article/viewFile/45012/25005 |
| title | Machine Learning Methods for Predicting Cardiovascular Diseases: A Comparative Analysis |
| topic | random forest support vector machine gradient boosting decision tree logistic regression accuracy |
| url | https://journals.rudn.ru/engineering-researches/article/viewFile/45012/25005 |
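
The evaluation described in the abstract (a 70-30 holdout split of 270 patients, scored with accuracy, precision, recall, and F1-score from a confusion matrix) can be illustrated with a minimal Python sketch. Only the sample size, the split ratio, and the 86.4% accuracy figure come from the record. The `ConfusionMatrix` class, the `split_sizes` helper, and the four cell counts are hypothetical and chosen purely for illustration; the article's actual confusion matrices are not reproduced here. The sketch also assumes the 30% test set holds exactly 81 patients.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion-matrix counts (positive class = disease present)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def accuracy(self) -> float:
        return (self.tp + self.tn) / self.total

    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    def f1(self) -> float:
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


def split_sizes(n_samples: int, test_fraction: float) -> Tuple[int, int]:
    """Return (train, test) sizes for a simple holdout split (hypothetical helper)."""
    n_test = round(n_samples * test_fraction)
    return n_samples - n_test, n_test


# 270 patients and the 70-30 ratio are taken from the abstract.
n_train, n_test = split_sizes(270, 0.30)

# HYPOTHETICAL cell counts: they sum to the assumed test-set size and give
# 70 correct predictions out of 81, which rounds to the reported 86.4%.
# The individual cells are invented and are NOT the article's results.
example = ConfusionMatrix(tp=30, fp=5, fn=6, tn=40)

print(f"train={n_train}, test={n_test}")
print(f"accuracy={example.accuracy():.3f}")
print(f"precision={example.precision():.3f}")
print(f"recall={example.recall():.3f}")
print(f"f1={example.f1():.3f}")
```

Under the same assumption about the test-set size, the reported 80.2% accuracy for Random Forest would correspond to 65 correct predictions out of 81. Accuracy alone does not fix the individual cells, which is why precision, recall, and F1-score are reported alongside it.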