Development and validation of machine learning classifiers for predicting treatment-needed retinopathy of prematurity
| Main Authors: | , , , , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-07-01 |
| Series: | BMC Medical Informatics and Decision Making |
| Subjects: | |
| Online Access: | https://doi.org/10.1186/s12911-025-03057-w |
| Summary: | Background: This study aims to design and evaluate supervised machine-learning models that identify premature infants who require treatment, based on demographic data and clinical findings from screening examinations. Methods: We retrospectively reviewed the medical records of infants screened for retinopathy of prematurity (ROP) at our clinic over the past decade. We extracted demographic and clinical data comprising eleven features, including sex, maternal education, paternal education, birth weight, gestational age, ROP stage, zone of retinal involvement, age at examination, weight at examination, and CPR. We developed and assessed eight classifiers: logistic regression (LR), decision tree (DT), support vector machine (SVM), naïve Bayes (NB), K-nearest neighbors (KNN), XGBoost, artificial neural networks (ANN), and random forest (RF). The target variable was whether the neonate received any treatment during the follow-up period. Results: Our analysis included data from 9,692 infants. Among the models evaluated, XGBoost and ANN achieved the highest accuracy (96%). The NB model had the lowest false negative rate and therefore the highest sensitivity (recall), 0.99. Because accurately identifying premature neonates who require treatment is crucial, a model with the lowest false negative rate may be clinically preferable to one selected solely for the highest accuracy. Conclusion: While AI can enhance decision-making by providing real-time risk assessments, these tools must be used to augment, not replace, clinical judgment. Clinicians must remain involved in interpreting model outputs and making final treatment decisions based on a holistic understanding of each patient’s unique circumstances. Clinical trial number: Not applicable. |
|---|---|
| ISSN: | 1472-6947 |
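
The summary's point that the most accurate model is not necessarily the one with the fewest missed cases can be illustrated with a small sketch. The confusion-matrix counts below are invented for illustration and are not taken from the study; the class `Confusion` and the helper `most_sensitive` are hypothetical names, not part of the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a binary 'needs treatment' classifier."""
    tp: int  # treated infants correctly flagged
    fn: int  # treated infants missed (false negatives)
    fp: int  # untreated infants wrongly flagged
    tn: int  # untreated infants correctly cleared

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fn + self.fp + self.tn
        return (self.tp + self.tn) / total

    @property
    def sensitivity(self) -> float:
        # Also called recall: share of true cases that were detected.
        return self.tp / (self.tp + self.fn)

    @property
    def false_negative_rate(self) -> float:
        # Complement of sensitivity.
        return self.fn / (self.tp + self.fn)


def most_sensitive(models: dict[str, Confusion]) -> str:
    """Return the name of the model with the highest sensitivity."""
    return max(models, key=lambda name: models[name].sensitivity)


# Hypothetical counts for 1,000 screened infants, 100 of whom need treatment.
# These numbers are illustrative assumptions, NOT results from the paper.
models = {
    "model_a": Confusion(tp=60, fn=40, fp=10, tn=890),
    "model_b": Confusion(tp=95, fn=5, fp=95, tn=805),
}

for name, cm in models.items():
    print(f"{name}: accuracy={cm.accuracy:.2f}, "
          f"sensitivity={cm.sensitivity:.2f}, "
          f"FNR={cm.false_negative_rate:.2f}")

print("highest sensitivity:", most_sensitive(models))
```

With these assumed counts, `model_a` is more accurate overall but misses 40 of the 100 infants who need treatment, while `model_b` misses only 5 at the cost of more false alarms. This is the trade-off the summary describes when it favors a low false negative rate over raw accuracy.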