Key factors in predictive analysis of cardiovascular risks in public health

Bibliographic Details
Main Authors: Ghazi I. Al Jowf, Manjur Kolhar
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-07874-x
Description
Summary: This research emphasizes the role of analytics in evaluating the risk of cardiovascular disease (CVD), focusing on thorough data preparation and feature engineering for accurate predictions. We studied machine learning (ML) and deep learning (DL) models, such as Logistic Regression (LR), Random Forest (RF), Gradient Boosting Machines (GBM), and Multilayer Perceptron (MLP). Each model’s performance was assessed using metrics such as accuracy, precision, recall, F1 score, and ROC AUC to determine its reliability and practical relevance. Our analysis shows the strengths of each model category. Conventional ML models such as Random Forest and Gradient Boosting Machines were effective in identifying patients at risk, achieving up to 74% accuracy and 72% recall. On the other hand, deep learning models such as the Multilayer Perceptron excelled in handling data, with a ROC AUC score of approximately 80%. Despite the need for resources and extensive data preprocessing, these models are highly skilled at pinpointing key risk factors, which is crucial for long-term CVD management.
ISSN: 2045-2322