Development and validation of machine learning models for MASLD: based on multiple potential screening indicators

Background: Multifaceted factors play a crucial role in the prevention and treatment of metabolic dysfunction-associated steatotic liver disease (MASLD). This study aimed to utilize multifaceted indicators to construct MASLD risk prediction machine learning models and explore the core factors within t...

Bibliographic Details
Main Authors: Hao Chen, Jingjing Zhang, Xueqin Chen, Ling Luo, Wenjiao Dong, Yongjie Wang, Jiyu Zhou, Canjin Chen, Wenhao Wang, Wenbin Zhang, Zhiyi Zhang, Yongguang Cai, Danli Kong, Yuanlin Ding
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Endocrinology
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fendo.2024.1449064/full
collection DOAJ
description Background: Multifaceted factors play a crucial role in the prevention and treatment of metabolic dysfunction-associated steatotic liver disease (MASLD). This study aimed to use multifaceted indicators to construct machine learning models for MASLD risk prediction and to explore the core factors within these models.
Methods: MASLD risk prediction models were constructed with seven machine learning algorithms using, respectively, all variables, insulin-related variables, demographic characteristic variables, and other indicators. Subsequently, the partial dependence plot (PDP) method and SHapley Additive exPlanations (SHAP) were used to explain the roles of the important variables in the model and to filter out the optimal indicators for constructing the MASLD risk model.
Results: In the feature importance rankings of both the Random Forest (RF) model and the eXtreme Gradient Boosting (XGBoost) model constructed using all variables, homeostasis model assessment of insulin resistance (HOMA-IR) and triglyceride glucose-waist circumference (TyG-WC) were the two most important variables. The MASLD risk prediction model constructed using the 10 most important variables was superior to the previous model. The PDP and SHAP methods were further used to screen the best indicators for constructing the model (HOMA-IR, TyG-WC, age, aspartate aminotransferase (AST), and ethnicity), and the mean area under the curve (AUC) of the resulting models was 0.960.
Conclusions: HOMA-IR and TyG-WC are core factors in predicting MASLD risk. Ultimately, the study constructed the optimal MASLD risk prediction model using HOMA-IR, TyG-WC, age, AST, and ethnicity.
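The record does not say how the two core indicators are computed. As a minimal sketch, assuming the commonly used definitions (HOMA-IR = fasting insulin in µU/mL × fasting glucose in mmol/L / 22.5; TyG = ln(triglycerides in mg/dL × fasting glucose in mg/dL / 2); TyG-WC = TyG × waist circumference in cm), they could be derived as below. The function and parameter names are hypothetical, and the article itself may use different units or variants.

```python
import math


def homa_ir(fasting_insulin_uU_ml: float, fasting_glucose_mmol_l: float) -> float:
    """HOMA-IR under the assumed standard definition (insulin x glucose / 22.5)."""
    return fasting_insulin_uU_ml * fasting_glucose_mmol_l / 22.5


def tyg_index(triglycerides_mg_dl: float, fasting_glucose_mg_dl: float) -> float:
    """Triglyceride-glucose (TyG) index under the assumed definition ln(TG x FPG / 2)."""
    return math.log(triglycerides_mg_dl * fasting_glucose_mg_dl / 2)


def tyg_wc(triglycerides_mg_dl: float, fasting_glucose_mg_dl: float,
           waist_circumference_cm: float) -> float:
    """TyG-WC: the TyG index multiplied by waist circumference in cm (assumed)."""
    return tyg_index(triglycerides_mg_dl, fasting_glucose_mg_dl) * waist_circumference_cm


if __name__ == "__main__":
    # Illustrative input values only, not data from the study.
    print(homa_ir(10.0, 4.5))
    print(tyg_wc(150.0, 100.0, 90.0))
```

Note that the two formulas take glucose in different units (mmol/L for HOMA-IR, mg/dL for TyG), so a real pipeline would need to convert consistently before computing either index.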
format Article
id doaj-art-428917b1e3704551b9d75437e85f0145
institution Kabale University
issn 1664-2392
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Endocrinology
spelling doaj-art-428917b1e3704551b9d75437e85f0145; 2025-01-21T05:43:43Z; eng; Frontiers Media S.A.; Frontiers in Endocrinology; ISSN 1664-2392; 2025-01-01; volume 15; DOI 10.3389/fendo.2024.1449064; article number 1449064
Author affiliations: Department of Epidemiology and Medical Statistics, School of Public Health, Guangdong Medical University, Dongguan, Guangdong, China; Department of Medical Oncology, Central Hospital of Guangdong Nongken, Zhanjiang, Guangdong, China
title Development and validation of machine learning models for MASLD: based on multiple potential screening indicators
topic metabolic dysfunction-associated steatotic liver disease
machine learning
insulin resistance
triglyceride glucose
risk prediction model
url https://www.frontiersin.org/articles/10.3389/fendo.2024.1449064/full