A prediction of mutations in infectious viruses using artificial intelligence

Bibliographic Details
Main Authors: Won Jong Choi, Jongkeun Park, Do Young Seong, Dae Sun Chung, Dongwan Hong
Format: Article
Language: English
Published: BioMed Central 2024-10-01
Series: Genomics & Informatics
Online Access: https://doi.org/10.1186/s44342-024-00019-y
Description
Summary: Many subtypes of SARS-CoV-2 have emerged since its early stages, with mutations showing regional and racial differences. These mutations significantly affected the infectivity and severity of the virus. This study aimed to predict the mutations that occur during the evolution of SARS-CoV-2 and to identify the key characteristics for making these predictions. We collected and organized data on the lineage, date, clade, and mutations of SARS-CoV-2 from publicly available databases and processed them for mutation prediction. We then created several training sets based on clade information and used several artificial intelligence models to predict newly emerging mutations. Models trained on mutation information alone performed poorly, whereas incorporating clade differentiation yielded high performance in machine learning models, including XGBoost (accuracy: 0.999). However, mutations fixed in the receptor-binding motif (RBM) region of Omicron decreased predictive performance. Using these models, we predicted potential mutation positions for clade 24C, which follows the recently emerged 24A and 24B clades, and identified a mutation at position Q493 in the RBM region. Our study developed effective artificial intelligence models and characteristics for predicting new mutations in continuously evolving infectious viruses.
ISSN: 2234-0742
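
The summary reports that training sets built with clade information performed better than sets using mutation information alone. The record does not describe the authors' actual pipeline, so the following is only a minimal sketch of one way such a clade-aware training table could be assembled. The function names, the feature columns, and the toy records (clades "A" to "C", mutations "m1" to "m3") are all hypothetical placeholders, not data or code from the study.

```python
from collections import defaultdict


def mutations_by_clade(records):
    """Group mutation labels by clade from (clade, mutation) pairs."""
    grouped = defaultdict(set)
    for clade, mutation in records:
        grouped[clade].add(mutation)
    return dict(grouped)


def build_training_rows(records, clade_order):
    """Emit one row per (consecutive clade pair, known mutation).

    Features: index of the earlier clade and whether the mutation
    is present in it. Label: whether the mutation is present in
    the later clade. This layout is an illustrative assumption.
    """
    grouped = mutations_by_clade(records)
    all_mutations = sorted({mutation for _, mutation in records})
    rows = []
    for i in range(len(clade_order) - 1):
        earlier = grouped.get(clade_order[i], set())
        later = grouped.get(clade_order[i + 1], set())
        for mutation in all_mutations:
            rows.append({
                "mutation": mutation,
                "clade_index": i,
                "in_earlier_clade": int(mutation in earlier),
                "label_in_later_clade": int(mutation in later),
            })
    return rows


# Hypothetical toy input: placeholder clades and mutations only.
records = [
    ("A", "m1"), ("A", "m2"),
    ("B", "m2"), ("B", "m3"),
    ("C", "m3"),
]
rows = build_training_rows(records, ["A", "B", "C"])
for row in rows:
    print(row)
```

A table like this could then be passed to a classifier such as XGBoost, which the summary names. Dropping the `clade_index` column would correspond loosely to the mutation-only setting that the summary describes as performing poorly.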