Combining a Risk Factor Score Designed From Electronic Health Records With a Digital Cytology Image Scoring System to Improve Bladder Cancer Detection: Proof-of-Concept Study

Background: To reduce the mortality related to bladder cancer, efforts need to be concentrated on early detection of the disease for more effective therapeutic intervention. Strong risk factors (eg, smoking status, age, professional exposure) have been identified, and some diag...

Bibliographic Details
Main Authors: Sandie Cabon, Sarra Brihi, Riadh Fezzani, Morgane Pierre-Jean, Marc Cuggia, Guillaume Bouzillé
Format: Article
Language: English
Published: JMIR Publications 2025-01-01
Series: Journal of Medical Internet Research
Online Access: https://www.jmir.org/2025/1/e56946
author Sandie Cabon
Sarra Brihi
Riadh Fezzani
Morgane Pierre-Jean
Marc Cuggia
Guillaume Bouzillé
collection DOAJ
description Background: To reduce the mortality related to bladder cancer, efforts need to be concentrated on early detection of the disease for more effective therapeutic intervention. Strong risk factors (eg, smoking status, age, professional exposure) have been identified, and some diagnostic tools (eg, by way of cystoscopy) have been proposed. However, to date, no fully satisfactory (noninvasive, inexpensive, high-performance) solution for widespread deployment has been proposed. Some new models based on cytology image classification were recently developed and bring good perspectives, but there are still avenues to explore to improve their performance.
Objective: Our team aimed to evaluate the benefit of combining the reuse of massive clinical data to build a risk factor model and a digital cytology image–based model (VisioCyt) for bladder cancer detection.
Methods: The first step relied on designing a predictive model based on clinical data (ie, risk factors identified in the literature) extracted from the clinical data warehouse of the Rennes Hospital and machine learning algorithms (logistic regression, random forest, and support vector machine). It provides a score corresponding to the risk of developing bladder cancer based on the patient’s clinical profile. Second, we investigated 3 strategies (logistic regression, decision tree, and a custom strategy based on score interpretation) to combine the model’s score with the score from an image-based model to produce a robust bladder cancer scoring system.
Results: We collected 2 data sets. The first set, including clinical data for 5422 patients extracted from the clinical data warehouse, was used to design the risk factor–based model. The second set was used to measure the models’ performances and was composed of data for 620 patients from a clinical trial for which cytology images and clinicobiological features were collected. With this second data set, the combination of both models obtained areas under the curve of 0.82 on the training set and 0.83 on the test set, demonstrating the value of combining risk factor–based and image-based models. This combination offers a higher associated risk of cancer than VisioCyt alone for all classes, especially for low-grade bladder cancer.
Conclusions: These results demonstrate the value of combining clinical and biological information, especially to improve detection of low-grade bladder cancer. Some improvements will need to be made to the automatic extraction of clinical features to make the risk factor–based model more robust. However, as of now, the results support the assumption that this type of approach will be of benefit to patients.
format Article
id doaj-art-2c3f5ba948f744c7af1396e790e84397
institution Kabale University
issn 1438-8871
language English
publishDate 2025-01-01
publisher JMIR Publications
record_format Article
series Journal of Medical Internet Research
spelling doaj-art-2c3f5ba948f744c7af1396e790e84397
2025-01-22T21:00:38Z
eng
JMIR Publications
Journal of Medical Internet Research
1438-8871
2025-01-01
27
e56946
10.2196/56946
Combining a Risk Factor Score Designed From Electronic Health Records With a Digital Cytology Image Scoring System to Improve Bladder Cancer Detection: Proof-of-Concept Study
Sandie Cabon https://orcid.org/0000-0001-7847-0916
Sarra Brihi https://orcid.org/0009-0009-7328-2700
Riadh Fezzani https://orcid.org/0009-0003-6667-9124
Morgane Pierre-Jean https://orcid.org/0000-0002-9133-780X
Marc Cuggia https://orcid.org/0000-0001-6943-3937
Guillaume Bouzillé https://orcid.org/0000-0002-3637-6558
https://www.jmir.org/2025/1/e56946
title Combining a Risk Factor Score Designed From Electronic Health Records With a Digital Cytology Image Scoring System to Improve Bladder Cancer Detection: Proof-of-Concept Study
url https://www.jmir.org/2025/1/e56946