The Use of Machine Learning to Create a Risk Score to Predict Survival in Patients with Hepatocellular Carcinoma: A TCGA Cohort Analysis

Introduction. Hepatocellular carcinoma (HCC) accounts for approximately 90% of primary liver malignancies and is currently the fourth most common cause of cancer-related death worldwide. Due to varying underlying etiologies, the prognosis of HCC differs greatly among patients. It is important to dev...

Bibliographic Details
Main Authors: Samer Tohme, Hamza O Yazdani, Amaan Rahman, Sanah Handu, Sidrah Khan, Tanner Wilson, David A Geller, Richard L Simmons, Michele Molinari, Christof Kaltenmeier
Format: Article
Language: English
Published: Wiley 2021-01-01
Series: Canadian Journal of Gastroenterology and Hepatology
Online Access: http://dx.doi.org/10.1155/2021/5212953
collection DOAJ
description Introduction. Hepatocellular carcinoma (HCC) accounts for approximately 90% of primary liver malignancies and is currently the fourth most common cause of cancer-related death worldwide. Due to varying underlying etiologies, the prognosis of HCC differs greatly among patients. It is important to develop ways to help stratify patients upon initial diagnosis to provide optimal treatment modalities and follow-up plans. The current study uses Artificial Neural Network (ANN) and Classification Tree Analysis (CTA) to create a gene signature score that can help predict survival in patients with HCC. Methods. The Cancer Genome Atlas (TCGA-LIHC) was analyzed for differentially expressed genes. Clinicopathological data were obtained from cBioPortal. ANN analysis of the 75 most significant genes predicting disease-free survival (DFS) was performed. Next, CTA results were used for creation of the scoring system. Cox regression was performed to identify the prognostic value of the scoring system. Results. A total of 363 patients diagnosed with HCC were analyzed in this study. ANN provided 15 genes with normalized importance >50%. CTA resulted in a set of three genes (NRM, STAG3, and SNHG20). Patients were then divided into 4 groups based on the CTA tree cutoff values. The Kaplan–Meier analysis showed significantly reduced DFS in groups 1, 2, and 3 (median DFS: 29.7 months, 16.1 months, and 11.7 months, p < 0.01) compared to group 0 (median not reached). Similar results were observed when overall survival (OS) was analyzed. On multivariate Cox regression, higher scores were associated with significantly shorter DFS (1 point: HR 2.57 (1.38–4.80), 2 points: 3.91 (2.11–7.24), and 3 points: 5.09 (2.70–9.58), p < 0.01). Conclusion. Long-term outcomes of patients with HCC can be predicted using a simplified scoring system based on tumor mRNA gene expression levels. This tool could assist clinicians and researchers in identifying patients at increased risk for recurrence to tailor specific treatment and follow-up strategies for individual patients.
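Note on the description: the abstract reports a median DFS for groups 1 to 3 and "median not reached" for group 0. The sketch below illustrates, under stated assumptions, how a Kaplan–Meier (product-limit) curve yields either outcome. It is not the authors' code. The function names and the two toy groups are invented for illustration and are not TCGA-LIHC data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(time)) at each event time.

    times  -- follow-up time per patient (e.g. months)
    events -- 1 if recurrence/death was observed at that time, 0 if censored
    """
    at_risk = len(times)
    observed = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # events and censorings both leave the risk set
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


def median_survival(curve):
    """First time the curve drops to 0.5 or below, or None if it never does."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical toy groups (NOT data from the study).
high_risk = kaplan_meier([2, 4, 6, 8, 10], [1, 1, 1, 1, 1])
low_risk = kaplan_meier([5, 10, 15, 20, 25], [1, 0, 0, 0, 0])

print(median_survival(high_risk))
print(median_survival(low_risk))
```

In the toy low-risk group only one of five patients has an event, so the estimated curve never falls to 50% and the median is reported as not reached (`None` here), which is the situation the abstract describes for group 0.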
format Article
id doaj-art-26c53fa847c641028b9b052b1d574e79
institution Kabale University
issn 2291-2797
language English
publishDate 2021-01-01
publisher Wiley
record_format Article
series Canadian Journal of Gastroenterology and Hepatology
spelling doaj-art-26c53fa847c641028b9b052b1d574e79 | 2025-02-03T01:21:46Z | eng | Wiley | Canadian Journal of Gastroenterology and Hepatology | 2291-2797 | 2021-01-01 | 2021 | 10.1155/2021/5212953 | The Use of Machine Learning to Create a Risk Score to Predict Survival in Patients with Hepatocellular Carcinoma: A TCGA Cohort Analysis | Samer Tohme, Hamza O Yazdani, Amaan Rahman, Sanah Handu, Sidrah Khan, Tanner Wilson, David A Geller, Richard L Simmons, Michele Molinari, Christof Kaltenmeier | Department of Surgery (listed for each of the ten authors) | http://dx.doi.org/10.1155/2021/5212953
title The Use of Machine Learning to Create a Risk Score to Predict Survival in Patients with Hepatocellular Carcinoma: A TCGA Cohort Analysis
url http://dx.doi.org/10.1155/2021/5212953