Development, validation, and clinical application of a machine learning model for risk stratification and management of cervical cancer screening based on full-genotyping hrHPV test (SMART-HPV): a modelling study

Summary: Background: High-risk human papillomavirus (hrHPV) full genotyping facilitates risk stratification and efficiency in cervical cancer screening and has been widely verified and adopted in various screening settings. We aimed to develop a cervical cancer predictive model that can guide referrals for colpos...

Bibliographic Details
Main Authors: Binhua Dong, Zhen Lu, Tianjie Yang, Junfeng Wang, Yan Zhang, Xunyuan Tuo, Juntao Wang, Shaomei Lin, Hongning Cai, Huan Cheng, Xiaoli Cao, Xinxin Huang, Zheng Zheng, Chong Miao, Yue Wang, Huifeng Xue, Shuxia Xu, Xianhua Liu, Huachun Zou, Pengming Sun
Format: Article
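Language: English
Published: Elsevier 2025-02-01
Series: The Lancet Regional Health. Western Pacific
Subjects: Prediction model; Cervical cancer; Human papillomavirus; Full genotyping; Machine learning; China
Online Access: http://www.sciencedirect.com/science/article/pii/S2666606525000173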
_version_ 1832585090788491264
author Binhua Dong
Zhen Lu
Tianjie Yang
Junfeng Wang
Yan Zhang
Xunyuan Tuo
Juntao Wang
Shaomei Lin
Hongning Cai
Huan Cheng
Xiaoli Cao
Xinxin Huang
Zheng Zheng
Chong Miao
Yue Wang
Huifeng Xue
Shuxia Xu
Xianhua Liu
Huachun Zou
Pengming Sun
author_sort Binhua Dong
collection DOAJ
description Summary: Background: High-risk human papillomavirus (hrHPV) full genotyping facilitates risk stratification and efficiency in cervical cancer screening and has been widely verified and adopted in various screening settings. We aimed to develop a cervical cancer predictive model that can guide referrals for colposcopy using hrHPV full genotyping data in settings where the screening rate is low. Methods: We developed, compared and validated four machine learning models (eXtreme gradient boosting [XGBoost], support vector machine [SVM], random forest [RF], and naïve Bayes [NB]) for cervical cancer prediction, using data from a national cervical cancer screening project conducted in 267 healthcare centers in China. Cervical intraepithelial neoplasia grade 2 or worse (CIN2+) and CIN3+ were the primary and secondary outcomes, respectively. Discrimination was evaluated in various screening settings across China using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, the area under the precision–recall curve (AUPRC), and accuracy. Calibration and clinical utility were assessed with the Brier score, calibration curves and decision curve analysis (DCA). Findings: 1,112,846 women were recruited, of whom 599,043 were included in the analysis based on hrHPV full genotyping. Of these, 254,434 (age [years, median, IQR]: 48, 42–54), 297,479 (49, 43–55), 38,500 (37, 32–44), 1950 (38, 33–46), 1590 (53, 47–58), 779 (38, 31–49) and 4311 (40, 33–50) were in the development, temporal validation and external validation 1–5 datasets, respectively. The final simplified clinical risk prediction model includes hrHPV, number of HPV genotypes, cervical cytology, HPV16, HPV18, age, HPV52, HPV39 and gynecological examination. The final optimal XGBoost model for predicting CIN2+ showed good discrimination (AUROC, maximum 0.989 [0.987–0.992]; minimum 0.781 [0.74–0.819]) and calibration (Brier score, maximum 0.118 [0.099–0.137]) in the five external validation sets. DCA showed that when the clinical decision threshold probability for the optimal XGBoost model was less than 0.80, the model for predicting CIN2+ provided a superior standardized net benefit. The optimal XGBoost model obtained similar results in predicting CIN3+. Interpretation: We developed a cervical cancer screening risk prediction model that employs hrHPV full genotyping and simple test results to achieve risk prediction and stratified management for colposcopy referrals. This predictive tool is particularly suitable for settings with low screening rates. Funding: National Natural Science Foundation of China; Major Scientific Research Program for Young and Middle-aged Health Professionals of Fujian Province, China; Fujian Province Central Government-Guided Local Science and Technology Development Project; Fujian Province's Third Batch of Flexible Introduction of High-Level Medical Talent Teams; Fujian Provincial Natural Science Foundation of China; Fujian Provincial Science and Technology Innovation Joint Fund.
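The calibration and clinical-utility metrics named in the summary (Brier score and the standardized net benefit from decision curve analysis) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy labels and probabilities are hypothetical, and standardized net benefit is assumed here to be net benefit divided by outcome prevalence.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome (0/1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of referring everyone whose predicted risk is >= threshold.

    True positives are credited, false positives are penalised by the odds
    of the threshold probability. Requires 0 < threshold < 1.
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def standardized_net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit divided by prevalence (assumed definition), so 1.0 is the ceiling."""
    prevalence = sum(y_true) / len(y_true)
    return net_benefit(y_true, y_prob, threshold) / prevalence


# Hypothetical toy data: 1 = CIN2+ found, 0 = not found.
toy_labels = [1, 0, 1, 0]
toy_risks = [0.9, 0.1, 0.8, 0.3]

print(brier_score(toy_labels, toy_risks))
print(standardized_net_benefit(toy_labels, toy_risks, 0.2))
print(standardized_net_benefit(toy_labels, toy_risks, 0.5))
```

A decision curve is obtained by evaluating `standardized_net_benefit` over a grid of thresholds and comparing the model with the "refer all" and "refer none" strategies.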
format Article
id doaj-art-b16ae52b9dc247ba9318aca27a92d7d9
institution Kabale University
issn 2666-6065
language English
publishDate 2025-02-01
publisher Elsevier
record_format Article
series The Lancet Regional Health. Western Pacific
spelling doaj-art-b16ae52b9dc247ba9318aca27a92d7d9
2025-01-27T04:22:24Z
eng
Elsevier
The Lancet Regional Health. Western Pacific
2666-6065
2025-02-01
55
101480
Development, validation, and clinical application of a machine learning model for risk stratification and management of cervical cancer screening based on full-genotyping hrHPV test (SMART-HPV): a modelling study
Binhua Dong (0), Zhen Lu (1), Tianjie Yang (2), Junfeng Wang (3), Yan Zhang (4), Xunyuan Tuo (5), Juntao Wang (6), Shaomei Lin (7), Hongning Cai (8), Huan Cheng (9), Xiaoli Cao (10), Xinxin Huang (11), Zheng Zheng (12), Chong Miao (13), Yue Wang (14), Huifeng Xue (15), Shuxia Xu (16), Xianhua Liu (17), Huachun Zou (18), Pengming Sun (19)
Department of Gynecology, Fujian Key Laboratory of Women and Children's Critical Diseases Research, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, Fujian, China; Fujian Clinical Research Center for Gynecological Oncology, Fuzhou, Fujian, China
School of Public Health (Shenzhen), Sun Yat-sen University, Shenzhen, Guangdong, China
Department of Gynecology, Shenzhen Maternity & Child Healthcare Hospital, Shenzhen, Guangdong, China
Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Utrecht, the Netherlands
The State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, National Institute of Diagnostics and Vaccine Development in Infectious Diseases, School of Public Health, Xiamen University, Xiamen, Fujian, China; Department of Gynecology, Fujian Key Laboratory of Women and Children's Critical Diseases Research, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, Fujian, China
Department of Gynecology, Gansu Provincial Maternity & Child Health-care Hospital, Lanzhou, Gansu, China
Department of Gynecology, Guiyang Maternal and Child Health Care Hospital, Guiyang, Guizhou, China
Department of Gynecology, Shunde Women's and Children's Hospital of Guangdong Medical University, Foshan, Guangdong, China
Department of Hubei Clinical Medical Research Center for Gynecologic Malignancy, Maternal and Child Health Hospital of Hubei Province (Women and Children's Hospital of Hubei Province), Wuhan, Hubei, China
Department of Gynecology, Maternal and Child Health Hospital of Hongan County, Huanggang, Hubei, China
Department of Gynecology, Maternal and Child Health Hospital of Gongan County, Jingzhou, Hubei, China
The Ministry of Health, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, Fujian, China
Department of Gynecology, Shenzhen Maternity & Child Healthcare Hospital, Shenzhen, Guangdong, China
Department of Information, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, Fujian, China
Department of Gynecology, Fujian Key Laboratory of Women and Children's Critical Diseases Research, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, Fujian, China; Fujian Clinical Research Center for Gynecological Oncology, Fuzhou, Fujian, China
Center for Cervical Disease Diagnosis and Treatment, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, Fujian, China
Department of Pathology, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, Fujian, China
Department of Pathology, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, Fujian, China
School of Public Health, Fudan University, Shanghai, China; Department of Gynecology, Fujian Key Laboratory of Women and Children's Critical Diseases Research, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, Fujian, China; Corresponding author. School of Public Health, Fudan University, 130 Dongan Road, Xuhui District, Shanghai, 200032, PR China.
Department of Gynecology, Fujian Key Laboratory of Women and Children's Critical Diseases Research, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, Fujian, China; Fujian Clinical Research Center for Gynecological Oncology, Fuzhou, Fujian, China; School of Population Medicine and Public Health, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China; Corresponding author. Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, 18 Daoshan Road, Fuzhou, Fujian 350001, PR China.
http://www.sciencedirect.com/science/article/pii/S2666606525000173
Prediction model
Cervical cancer
Human papillomavirus
Full genotyping
Machine learning
China
title Development, validation, and clinical application of a machine learning model for risk stratification and management of cervical cancer screening based on full-genotyping hrHPV test (SMART-HPV): a modelling study
topic Prediction model
Cervical cancer
Human papillomavirus
Full genotyping
Machine learning
China
url http://www.sciencedirect.com/science/article/pii/S2666606525000173