Development, validation, and clinical application of a machine learning model for risk stratification and management of cervical cancer screening based on full-genotyping hrHPV test (SMART-HPV): a modelling study

Bibliographic Details
Main Authors: Binhua Dong, Zhen Lu, Tianjie Yang, Junfeng Wang, Yan Zhang, Xunyuan Tuo, Juntao Wang, Shaomei Lin, Hongning Cai, Huan Cheng, Xiaoli Cao, Xinxin Huang, Zheng Zheng, Chong Miao, Yue Wang, Huifeng Xue, Shuxia Xu, Xianhua Liu, Huachun Zou, Pengming Sun
Format: Article
Language: English
Published: Elsevier 2025-02-01
Series: The Lancet Regional Health. Western Pacific
Online Access: http://www.sciencedirect.com/science/article/pii/S2666606525000173
Description
Summary:
Background: High-risk human papillomavirus (hrHPV) full genotyping facilitates risk stratification and efficiency in cervical cancer screening and has been widely verified and adopted in various screening settings. We aimed to develop a cervical cancer predictive model that can guide referrals for colposcopy using hrHPV full genotyping data in a setting where the screening rate is low.
Methods: We developed, compared and validated four machine learning models (eXtreme gradient boosting [XGBoost], support vector machine [SVM], random forest [RF], and naïve Bayes [NB]) for cervical cancer prediction, using data from a national cervical cancer screening project conducted in 267 healthcare centers in China. Cervical intraepithelial neoplasia grade 2 or worse (CIN2+) was the primary outcome and CIN3+ the secondary outcome. Discrimination was evaluated across various screening settings in China using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, the area under the precision–recall curve (AUPRC), and accuracy. Calibration and clinical utility were assessed with the Brier score, calibration curves and decision curve analysis (DCA).
Findings: 1,112,846 women were recruited, of whom 599,043 were included in the analysis based on hrHPV full genotyping. Of these, 254,434 (age [years, median, IQR]: 48, 42–54), 297,479 (49, 43–55), 38,500 (37, 32–44), 1950 (38, 33–46), 1590 (53, 47–58), 779 (38, 31–49) and 4311 (40, 33–50) were in the development, temporal validation and external validation 1–5 datasets, respectively. The final simplified clinical risk prediction model includes hrHPV, number of HPV genotypes, cervical cytology, HPV16, HPV18, age, HPV52, HPV39 and gynecological examination. The final optimal XGBoost model for predicting CIN2+ showed good discrimination (AUROC, maximum 0.989 [0.987–0.992]; minimum 0.781 [0.74–0.819]) and calibration (Brier score, maximum 0.118 [0.099–0.137]) in the five external validation sets. DCA showed that the optimal XGBoost model for predicting CIN2+ provided a superior standardized net benefit when the clinical decision threshold probability was below 0.80. The optimal XGBoost model obtained similar results in predicting CIN3+.
Interpretation: We developed a cervical cancer screening risk prediction model that employs hrHPV full genotyping and simple test results to achieve risk prediction and stratified management for colposcopy referrals. This predictive tool is particularly suitable for settings with low screening rates.
Funding: National Natural Science Foundation of China; Major Scientific Research Program for Young and Middle-aged Health Professionals of Fujian Province, China; Fujian Province Central Government-Guided Local Science and Technology Development Project; Fujian Province's Third Batch of Flexible Introduction of High-Level Medical Talent Teams; Fujian Provincial Natural Science Foundation of China; Fujian Provincial Science and Technology Innovation Joint Fund.
ISSN: 2666-6065
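Note on the calibration and clinical utility metrics named in the summary: the sketch below shows how a Brier score and a standardized net benefit (the quantity plotted in decision curve analysis) are commonly computed. It is an illustration only and not the study's code. The function names and the toy data are hypothetical, and dividing net benefit by outcome prevalence is one common definition of "standardized" that is assumed here rather than taken from the article.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of referring everyone whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def standardized_net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit divided by outcome prevalence (assumed definition)."""
    prevalence = sum(y_true) / len(y_true)
    if prevalence == 0:
        raise ValueError("no positive outcomes; prevalence is zero")
    return net_benefit(y_true, y_prob, threshold) / prevalence


# Hypothetical toy data: 1 = CIN2+ found, 0 = not found; probabilities are made up.
toy_outcomes = [1, 0, 0, 1, 0, 0, 0, 1]
toy_risks = [0.9, 0.2, 0.1, 0.7, 0.4, 0.05, 0.3, 0.6]

if __name__ == "__main__":
    print("Brier score:", brier_score(toy_outcomes, toy_risks))
    for t in (0.25, 0.5):
        print("threshold", t, "sNB:", standardized_net_benefit(toy_outcomes, toy_risks, t))
```

A lower Brier score indicates better calibrated and sharper predictions, and a decision curve is obtained by evaluating the standardized net benefit over a range of threshold probabilities and comparing it with refer-all and refer-none strategies.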