Identification of hypertension gene expression biomarkers based on the DeepGCFS algorithm.

Bibliographic Details
Main Authors: Zongjin Li, Liqin Tian, Libing Bai, Zeyu Jia, Xiaoming Wu, Changxin Song
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0314319
collection DOAJ
description Hypertension is a critical risk factor for, and cause of mortality in, cardiovascular disease, and it remains a global public health issue. Understanding its mechanisms is therefore essential for treatment and prevention. Gene expression data are an important source of hypertension biomarkers, but such data have small sample sizes and high feature dimensionality, which makes biomarker identification difficult. We propose a novel deep graph clustering feature selection (DeepGCFS) algorithm to identify hypertension gene biomarkers with greater biological significance. The algorithm represents the interaction information between genes as a graph, builds a graph neural network (GNN) model, trains it with a loss function based on link prediction and self-supervised learning, and gives each gene node a feature vector that represents global information. It then applies hybrid clustering methods to detect gene modules and, finally, combines integrated feature selection methods to determine the gene biomarkers. In the experiments, all ten identified hypertension biomarkers showed significant differences, and the classification AUC reached 97.50%, higher than that of other methods in the literature. Six of the genes (PTGS2, TBXA2R, ZNF101, KCNJ2, MSRA, and CMTM5) have previously been reported to be associated with hypertension. With GSE113439 as the validation dataset, the classification AUC was 95.45%, and seven of the genes (LYSMD3, TBXA2R, KLC3, GPR171, PTGS2, MSRA, and CMTM5) were significantly different. In addition, the algorithm's gene feature vector clustering performance was better than that of the comparison methods. The proposed algorithm therefore has significant advantages in selecting potential hypertension biomarkers.
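The link-prediction training signal mentioned in the description can be illustrated with a minimal sketch. This is not the authors' loss function: it assumes a plain dot-product decoder with binary cross-entropy over observed edges and sampled non-edges, and the node ids (g1, g2, g3), embeddings, and function names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def dot(u, v) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def link_prediction_loss(emb, pos_edges, neg_edges, eps=1e-12) -> float:
    """Mean binary cross-entropy of a dot-product edge decoder.

    emb:       dict mapping node id -> embedding vector (list of floats)
    pos_edges: observed edges, treated as label 1
    neg_edges: sampled non-edges, treated as label 0
    """
    total = 0.0
    for i, j in pos_edges:
        total -= math.log(sigmoid(dot(emb[i], emb[j])) + eps)
    for i, j in neg_edges:
        total -= math.log(1.0 - sigmoid(dot(emb[i], emb[j])) + eps)
    return total / (len(pos_edges) + len(neg_edges))


# Hypothetical toy graph: g1 and g2 interact, g1 and g3 do not.
pos = [("g1", "g2")]
neg = [("g1", "g3")]

# Embeddings that agree with the graph: linked nodes point the same way.
aligned = {"g1": [2.0, 0.0], "g2": [2.0, 0.0], "g3": [-2.0, 0.0]}
# Embeddings that contradict the graph.
misaligned = {"g1": [2.0, 0.0], "g2": [-2.0, 0.0], "g3": [2.0, 0.0]}

loss_aligned = link_prediction_loss(aligned, pos, neg)
loss_misaligned = link_prediction_loss(misaligned, pos, neg)
print(f"aligned: {loss_aligned:.4f}  misaligned: {loss_misaligned:.4f}")
```

Minimizing a loss of this kind pushes the embeddings of interacting genes together and those of non-interacting genes apart, which is one common way for node vectors to absorb graph structure before clustering.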
format Article
id doaj-art-642690e83a6c4f0699eb200795343b60
institution Kabale University
issn 1932-6203
language English
publishDate 2025-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-642690e83a6c4f0699eb200795343b60; 2025-02-05T05:32:12Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2025-01-01; vol. 20, no. 1, e0314319; doi:10.1371/journal.pone.0314319
title Identification of hypertension gene expression biomarkers based on the DeepGCFS algorithm.
url https://doi.org/10.1371/journal.pone.0314319