Development and validation of machine learning models for MASLD: based on multiple potential screening indicators
Background: Multifaceted factors play a crucial role in the prevention and treatment of metabolic dysfunction-associated steatotic liver disease (MASLD). This study aimed to use multifaceted indicators to construct machine learning models for MASLD risk prediction and to explore the core factors within t...
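Main Authors: | Hao Chen, Jingjing Zhang, Xueqin Chen, Ling Luo, Wenjiao Dong, Yongjie Wang, Jiyu Zhou, Canjin Chen, Wenhao Wang, Wenbin Zhang, Zhiyi Zhang, Yongguang Cai, Danli Kong, Yuanlin Ding |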
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2025-01-01 |
Series: | Frontiers in Endocrinology |
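Subjects: | metabolic dysfunction-associated steatotic liver disease; machine learning; insulin resistance; triglyceride glucose; risk prediction model |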
Online Access: | https://www.frontiersin.org/articles/10.3389/fendo.2024.1449064/full |
_version_ | 1832592635445903360 |
---|---|
author | Hao Chen, Jingjing Zhang, Xueqin Chen, Ling Luo, Wenjiao Dong, Yongjie Wang, Jiyu Zhou, Canjin Chen, Wenhao Wang, Wenbin Zhang, Zhiyi Zhang, Yongguang Cai, Danli Kong, Yuanlin Ding |
author_sort | Hao Chen |
collection | DOAJ |
description | Background: Multifaceted factors play a crucial role in the prevention and treatment of metabolic dysfunction-associated steatotic liver disease (MASLD). This study aimed to use multifaceted indicators to construct machine learning models for MASLD risk prediction and to explore the core factors within these models. Methods: MASLD risk prediction models were constructed with seven machine learning algorithms using, respectively, all variables, insulin-related variables, demographic characteristic variables, and other indicators. Subsequently, the partial dependence plot (PDP) method and SHapley Additive exPlanations (SHAP) were used to explain the roles of important variables in the model and to select the optimal indicators for constructing the MASLD risk model. Results: Ranking feature importance in the Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) models constructed using all variables showed that homeostasis model assessment of insulin resistance (HOMA-IR) and triglyceride glucose-waist circumference (TyG-WC) were the two most important variables in both models. The MASLD risk prediction model constructed using the 10 most important variables was superior to the previous model. The PDP and SHAP methods were further used to screen the best indicators (including HOMA-IR, TyG-WC, age, aspartate aminotransferase (AST), and ethnicity) for constructing the model, and the mean area under the curve (AUC) of the resulting models was 0.960. Conclusions: HOMA-IR and TyG-WC are core factors in predicting MASLD risk. Ultimately, our study constructed the optimal MASLD risk prediction model using HOMA-IR, TyG-WC, age, AST, and ethnicity. |
format | Article |
id | doaj-art-428917b1e3704551b9d75437e85f0145 |
institution | Kabale University |
issn | 1664-2392 |
language | English |
publishDate | 2025-01-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Endocrinology |
spelling | doaj-art-428917b1e3704551b9d75437e85f0145; 2025-01-21T05:43:43Z; eng; Frontiers Media S.A.; Frontiers in Endocrinology; ISSN 1664-2392; 2025-01-01; vol. 15; doi 10.3389/fendo.2024.1449064; article 1449064. Affiliations: Yongguang Cai: Department of Medical Oncology, Central Hospital of Guangdong Nongken, Zhanjiang, Guangdong, China; all other authors: Department of Epidemiology and Medical Statistics, School of Public Health, Guangdong Medical University, Dongguan, Guangdong, China. |
title | Development and validation of machine learning models for MASLD: based on multiple potential screening indicators |
topic | metabolic dysfunction-associated steatotic liver disease; machine learning; insulin resistance; triglyceride glucose; risk prediction model |
url | https://www.frontiersin.org/articles/10.3389/fendo.2024.1449064/full |
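
The description names HOMA-IR and TyG-WC as the core predictors but this record does not state how they are computed. The sketch below uses the formulas commonly given in the general literature, which is an assumption here and not confirmed by this record: HOMA-IR = fasting insulin (µU/mL) × fasting glucose (mmol/L) / 22.5, TyG = ln(triglycerides (mg/dL) × fasting glucose (mg/dL) / 2), and TyG-WC = TyG × waist circumference (cm). The function and parameter names are hypothetical, and the article's own definitions and units may differ.

```python
import math


def _require_positive(**values: float) -> None:
    """Reject non-positive inputs, which are not physiologically meaningful."""
    for name, value in values.items():
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")


def homa_ir(fasting_insulin_uu_ml: float, fasting_glucose_mmol_l: float) -> float:
    """HOMA-IR under the assumed formula: insulin (uU/mL) * glucose (mmol/L) / 22.5."""
    _require_positive(
        fasting_insulin_uu_ml=fasting_insulin_uu_ml,
        fasting_glucose_mmol_l=fasting_glucose_mmol_l,
    )
    return fasting_insulin_uu_ml * fasting_glucose_mmol_l / 22.5


def tyg_index(triglycerides_mg_dl: float, fasting_glucose_mg_dl: float) -> float:
    """TyG under the assumed formula: ln(TG (mg/dL) * glucose (mg/dL) / 2)."""
    _require_positive(
        triglycerides_mg_dl=triglycerides_mg_dl,
        fasting_glucose_mg_dl=fasting_glucose_mg_dl,
    )
    return math.log(triglycerides_mg_dl * fasting_glucose_mg_dl / 2.0)


def tyg_wc(
    triglycerides_mg_dl: float,
    fasting_glucose_mg_dl: float,
    waist_circumference_cm: float,
) -> float:
    """TyG-WC under the assumed formula: TyG index * waist circumference (cm)."""
    _require_positive(waist_circumference_cm=waist_circumference_cm)
    return tyg_index(triglycerides_mg_dl, fasting_glucose_mg_dl) * waist_circumference_cm


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the study.
    print(round(homa_ir(10.0, 4.5), 3))
    print(round(tyg_wc(150.0, 100.0, 90.0), 3))
```

Note that the two indices take glucose in different units (mmol/L for HOMA-IR, mg/dL for TyG) under these assumed formulas, so a single glucose measurement has to be converted before it is used for both.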