Unlocking The Potential of Hybrid Models for Prognostic Biomarker Discovery in Oral Cancer Survival Analysis: A Retrospective Cohort Study

Bibliographic Details
Main Authors: Leila Nezamabadi Farahani, Anoshirvan Kazemnejad, Mahlagha Afrasiabi, Leili Tapak
Format: Article
Language: English
Published: Royan Institute (ACECR), Tehran 2024-12-01
Series: Cell Journal
Online Access: https://www.celljournal.org/article_724800_d41d8cd98f00b204e9800998ecf8427e.pdf
collection DOAJ
description Objective: This study aimed to develop a hybrid model for variable selection in high-dimensional survival analysis using support vector regression (SVR), in order to identify prognostic biomarkers associated with survival in oral cancer (OC) patients from gene expression data.
Materials and Methods: In this retrospective cohort study, gene expression profiles (54,613 probes) from 97 patients in the GSE41613 dataset of the GEO repository were used. First, martingale residuals were obtained from a Cox regression without covariates and used as a pseudo-survival outcome. Then, particle swarm optimization (PSO) and a genetic algorithm (GA) were each combined with SVR to select features related to the pseudo-survival outcome. The concordance index (C-index), mean absolute error (MAE), mean squared error (MSE), and R-squared were used to evaluate the performance of the models built on the selected features. Functional enrichment analysis was performed using the DAVID database, and external validation used independent datasets (GSE9844, GSE75538, GSE37991, GSE42743).
Results: The PSO-based method outperformed the GA-based method, achieving a smaller MAE (0.061) and MSE (0.005) and a higher R-squared (0.99) and C-index (0.973), and selecting 291 probes from the 1,069 screened. A protein-protein interaction (PPI) network was constructed, including 200 nodes and 120 edges. The eleven genes with the highest degree (RBM25, SMC3, PRPF40A, POLE, SRRT, BCLAF1, PDS5B, HNRNPR, JAK1, MED23, and SULT1A1) were identified as significant biomarkers associated with OC survival.
Conclusion: The PSO-based hybrid model effectively improved SVR performance in survival prediction for OC patients and identified key prognostic biomarkers. Despite its promising results and validation on independent datasets, limitations in generalizability and signs of overfitting suggest the model is not yet ready for clinical use. Further studies with larger, diverse datasets are recommended.
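The pseudo-survival outcome mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes that the residuals of a Cox model without covariates reduce to the event indicator minus the Nelson-Aalen cumulative hazard at each patient's follow-up time, and it assumes Harrell's definition of the C-index. The function names and the toy data are hypothetical, and the SVR, PSO, and GA steps are not shown.

```python
import numpy as np


def null_cox_martingale_residuals(time, event):
    """Martingale residuals of a covariate-free Cox model (assumed form):
    event indicator minus the Nelson-Aalen cumulative hazard at `time`."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    event_times = np.unique(time[event == 1])  # sorted distinct event times
    # Number of events and number still at risk at each distinct event time.
    deaths = np.array([np.sum((time == t) & (event == 1)) for t in event_times])
    at_risk = np.array([np.sum(time >= t) for t in event_times])
    # Leading 0.0 covers subjects censored before the first event.
    cum_hazard = np.concatenate(([0.0], np.cumsum(deaths / at_risk)))
    # Count of event times <= each subject's time indexes the padded hazard.
    idx = np.searchsorted(event_times, time, side="right")
    return event - cum_hazard[idx]


def harrell_c_index(time, event, risk):
    """Harrell's C: share of comparable pairs in which the subject who
    fails earlier has the higher risk score (ties in risk count as 0.5)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    risk = np.asarray(risk, dtype=float)
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if event[i] != 1:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Toy data (hypothetical): follow-up times and event indicators (1 = death).
time = [1, 2, 3, 4]
event = [1, 0, 1, 1]
residuals = null_cox_martingale_residuals(time, event)
print(residuals)
# The residuals stand in for a model's predictions here, under the
# assumption that a larger residual means higher risk.
print(harrell_c_index(time, event, residuals))
```

In this sketch the residuals sum to zero across patients, which is a quick sanity check on the pseudo-outcome before it is passed to a regression model.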
id doaj-art-7e76a426fcdb4dbdb4bf101eabccc708
institution DOAJ
issn 2228-5806
2228-5814
spelling doaj-art-7e76a426fcdb4dbdb4bf101eabccc708 2025-08-20T03:10:58Z eng
Cell Journal, vol. 26, no. 12, pp. 688-699, 2024-12-01
DOI: 10.22074/cellj.2025.2034704.1618
Article ID: 724800
Leila Nezamabadi Farahani: Department of Biostatistics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
Anoshirvan Kazemnejad: Department of Biostatistics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
Mahlagha Afrasiabi: Department of Computer, Hamedan University of Technology, Hamedan, Iran
Leili Tapak: Modeling of Noncommunicable Diseases Research Center, Institute of Health Sciences and Technologies, Hamadan University of Medical Sciences, Hamadan, Iran
topic feature selection
gene expression
genetic algorithm
mouth neoplasms
particle swarm optimization