Construction of a feature gene and machine prediction model for inflammatory bowel disease based on multichip joint analysis

Bibliographic Details
Main Authors: Yan Chaosheng, Sun Haowen, Rao Jingjing, Dai Yuanyuan, Duan Wenhui, Sheng Yingyue, Xue Yuzheng
Format: Article
Language: English
Published: BMC 2025-08-01
Series: Journal of Translational Medicine
Online Access: https://doi.org/10.1186/s12967-025-06838-z
Description
Summary:
Background: Inflammatory bowel disease (IBD) is a chronic nonspecific inflammatory disorder triggered by immune responses and genetic factors. Currently, there is no cure for IBD, and its etiology remains unclear. As a result, early detection and diagnosis of IBD pose significant challenges. Therefore, investigating biomarkers in peripheral blood is highly important, as they can assist doctors in the early identification and management of IBD.

Methods: We used a multichip joint analysis approach to explore the database thoroughly. On the basis of methods such as artificial neural networks (ANNs), machine learning techniques, and the SHAP model, we developed a diagnostic model for IBD. To select genetic features, we utilized three machine learning algorithms, namely, least absolute shrinkage and selection operator (LASSO), support vector machine (SVM), and random forest (RF), to identify differentially expressed genes. Additionally, we conducted an in-depth analysis of the enriched molecular pathways of these differentially expressed genes through Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses. Moreover, we used the SHAP model to interpret the results of the machine learning process. Finally, we examined the relationships between the differentially expressed genes and immune cells.

Results: Through machine learning, we identified four crucial biomarkers for IBD, namely, LOC389023, DUOX2, LCN2, and DEFA6. The SHAP model was used to elucidate the contribution of the differentially expressed genes to the diagnostic model. These genes were associated primarily with immune system modulation and microbial alterations. GO and KEGG pathway enrichment analyses indicated that the differentially expressed genes demonstrated associations with molecular pathways such as the antimicrobial and IL-17 signaling pathways. By performing correlation and differential analyses between differentially expressed genes and immune cells, we found that M1 macrophages exhibited stable differential changes in all four differentially expressed genes. M2 macrophages, resting mast cells, neutrophils, and activated memory CD4 T cells all showed significant differences in three of the differentially expressed genes.

Conclusion: We identified differentially expressed genes (LOC389023, DUOX2, LCN2, and DEFA6) with significant immune-related effects in IBD. Our findings suggest that machine learning algorithms outperform ANNs in the diagnosis of IBD. This research provides a theoretical foundation for the clinical diagnosis, targeted therapy, and prognostic evaluation of IBD.
ISSN: 1479-5876
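
Note on the SHAP step mentioned in the Summary: SHAP attributions are built on Shapley values, which split a model's output for one sample among its input features. The sketch below is only an illustration of that idea, not the authors' pipeline and not the SHAP library's API. It enumerates feature subsets exactly, which is feasible only for a handful of features. The scoring function `toy_score`, its `WEIGHTS`, the sample `x`, and the all-zero `baseline` are hypothetical values invented for the example and carry no biological meaning.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample `x` relative to a `baseline` input."""
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in `subset` take their observed value; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for k in range(n):
            # Shapley weight for a coalition of size k that excludes feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model over four expression-like features (made-up weights).
WEIGHTS = [0.5, 1.0, -0.25, 2.0]


def toy_score(v):
    linear = sum(w * f for w, f in zip(WEIGHTS, v))
    interaction = 0.5 * v[0] * v[1]  # shared effect of features 0 and 1
    return linear + interaction


x = [2.0, 1.0, 4.0, 0.5]         # hypothetical sample
baseline = [0.0, 0.0, 0.0, 0.0]  # hypothetical reference input

phi = shapley_values(toy_score, x, baseline)
for index, contribution in enumerate(phi):
    print(f"feature {index}: {contribution:+.3f}")
```

The attributions add up to `toy_score(x) - toy_score(baseline)`, and the interaction term is shared equally between the two features involved. For real models with many genes, exact enumeration is impractical, so approximations are typically used instead.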