Methodological and reporting quality of machine learning studies on cancer diagnosis, treatment, and prognosis

This study aimed to evaluate the quality and transparency of reporting in studies using machine learning (ML) in oncology, focusing on adherence to the Consolidated Reporting Guidelines for Prognostic and Diagnostic Machine Learning Models (CREMLS), TRIPOD+AI (Transparent Reporting of a Multivariabl...

Bibliographic Details
Main Authors: Aref Smiley, David Villarreal-Zegarra, C. Mahony Reategui-Rivera, Stefan Escobar-Agreda, Joseph Finkelstein
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-04-01
Series: Frontiers in Oncology
Subjects: cancer; artificial intelligence; diagnosis; prognosis; therapy
Online Access: https://www.frontiersin.org/articles/10.3389/fonc.2025.1555247/full
author Aref Smiley
David Villarreal-Zegarra
C. Mahony Reategui-Rivera
Stefan Escobar-Agreda
Joseph Finkelstein
collection DOAJ
description This study aimed to evaluate the quality and transparency of reporting in studies using machine learning (ML) in oncology, focusing on adherence to the Consolidated Reporting Guidelines for Prognostic and Diagnostic Machine Learning Models (CREMLS), TRIPOD+AI (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis), and PROBAST (Prediction Model Risk of Bias Assessment Tool). The literature search included primary studies published between February 1, 2024, and January 31, 2025, that developed or tested ML models for cancer diagnosis, treatment, or prognosis. To reflect the current state of the rapidly evolving landscape of ML applications in oncology, the fifteen most recent articles in each category were selected for evaluation. Two independent reviewers screened studies and extracted data on study characteristics, reporting quality (CREMLS and TRIPOD+AI), risk of bias (PROBAST), and ML performance metrics. The most frequently studied cancer types were breast cancer (n=7/45; 15.6%), lung cancer (n=7/45; 15.6%), and liver cancer (n=5/45; 11.1%). The findings indicate several deficiencies in reporting quality, as assessed by CREMLS and TRIPOD+AI. These deficiencies primarily relate to sample size calculation, reporting on data quality, strategies for handling outliers, documentation of ML model predictors, access to training or validation data, and reporting on model performance heterogeneity. The methodological quality assessment using PROBAST revealed that 89% of the included studies exhibited a low overall risk of bias, and all studies showed a low risk of bias in terms of applicability. Among the models identified as best-performing, Random Forest (RF) and XGBoost were the most frequently reported, each in 17.8% of the studies (n=8/45). Additionally, our study outlines the specific areas where reporting is deficient, providing researchers with guidance to improve reporting quality in these sections and, consequently, reduce the risk of bias in their studies.
format Article
id doaj-art-f68d139e002e431d9b15f83669c331b0
institution OA Journals
issn 2234-943X
language English
publishDate 2025-04-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Oncology
spelling doaj-art-f68d139e002e431d9b15f83669c331b0
2025-08-20T02:11:58Z
eng
Frontiers Media S.A.
Frontiers in Oncology
2234-943X
2025-04-01
Volume 15
DOI 10.3389/fonc.2025.1555247
Article 1555247
Methodological and reporting quality of machine learning studies on cancer diagnosis, treatment, and prognosis
Authors, numbered as in the record: Aref Smiley (0), David Villarreal-Zegarra (1), C. Mahony Reategui-Rivera (2), Stefan Escobar-Agreda (3), Joseph Finkelstein (4)
Affiliations, in the order listed in the record:
Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, United States
Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, United States
Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, United States
Telehealth Unit, Universidad Nacional Mayor de San Marcos, Lima, Peru
Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, United States
title Methodological and reporting quality of machine learning studies on cancer diagnosis, treatment, and prognosis
topic cancer
artificial intelligence
diagnosis
prognosis
therapy
url https://www.frontiersin.org/articles/10.3389/fonc.2025.1555247/full