On the space of SARS-CoV-2 genetic sequence variants

The coronavirus pandemic caused by the SARS-CoV-2 virus, which humanity resisted using the latest advances in science, left behind, among other things, extensive genetic data. Every day since the end of 2019, samples of the virus genomes have been collected around the world, which makes it possible...

Bibliographic Details
Main Authors: A. Yu. Palyanov, N. V. Palyanova
Format: Article
Language: English
Published: Siberian Branch of the Russian Academy of Sciences, Federal Research Center Institute of Cytology and Genetics, The Vavilov Society of Geneticists and Breeders 2023-12-01
Series: Вавиловский журнал генетики и селекции
Subjects: coronavirus; sars-cov-2; genome; space of variants; evolution; variability
Online Access: https://vavilov.elpub.ru/jour/article/view/3984
author A. Yu. Palyanov
N. V. Palyanova
collection DOAJ
description The coronavirus pandemic caused by the SARS-CoV-2 virus, which humanity resisted using the latest advances in science, left behind, among other things, extensive genetic data. Every day since the end of 2019, samples of virus genomes have been collected around the world, which makes it possible to trace the evolution of the virus in detail from its emergence to the present. The accumulated testing statistics show that the number of confirmed cases of SARS-CoV-2 infection is at least 767.5 million (9.5 % of the current world population, excluding asymptomatic people), and the number of sequenced virus genomes is more than 15.7 million (over 2 % of the total number of infected people). These new data potentially contain information about the mechanisms of variability and spread of the virus, its interaction with the human immune system, the main parameters characterizing the mechanisms of pandemic development, and much more. In this article, we analyze the space of possible variants of SARS-CoV-2 genetic sequences both from a mathematical point of view and taking into account the biological limitations inherent in this system, known both from general biological knowledge and from the characteristics of this particular virus. We have developed software capable of loading and analyzing SARS-CoV-2 nucleotide sequences in FASTA format, determining the 5’ and 3’ UTR positions and the number and location of unidentified nucleotides (“N”), performing alignment with the reference sequence by calling an external alignment program, determining mutations, deletions and insertions, and calculating various characteristics of virus genomes with a given time step (days, weeks, months, etc.).
The data obtained indicate that, despite the apparent mathematical diversity of possible ways for the virus to change over time, the corridor of the evolutionary trajectory that the coronavirus has passed through seems to be quite narrow. It can therefore be assumed that this trajectory is determined to some extent, which gives hope that the evolution of the coronavirus can be modeled.
format Article
id doaj-art-39cc9147be684afa9989a008a3891292
institution Kabale University
issn 2500-3259
language English
publishDate 2023-12-01
publisher Siberian Branch of the Russian Academy of Sciences, Federal Research Center Institute of Cytology and Genetics, The Vavilov Society of Geneticists and Breeders
record_format Article
series Вавиловский журнал генетики и селекции
spelling doaj-art-39cc9147be684afa9989a008a3891292
2025-02-01T09:58:12Z
eng
Siberian Branch of the Russian Academy of Sciences, Federal Research Center Institute of Cytology and Genetics, The Vavilov Society of Geneticists and Breeders
Вавиловский журнал генетики и селекции
2500-3259
2023-12-01
27783985010.18699/VJGB-23-971412
On the space of SARS-CoV-2 genetic sequence variants
A. Yu. Palyanov [0]
N. V. Palyanova [1]
[0] A.P. Ershov Institute of Informatics Systems of the Siberian Branch of the Russian Academy of Sciences; Research Institute of Virology, Federal Research Center of Fundamental and Translational Medicine of the Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University
[1] Research Institute of Virology, Federal Research Center of Fundamental and Translational Medicine of the Siberian Branch of the Russian Academy of Sciences
https://vavilov.elpub.ru/jour/article/view/3984
coronavirus
sars-cov-2
genome
space of variants
evolution
variability
title On the space of SARS-CoV-2 genetic sequence variants
topic coronavirus
sars-cov-2
genome
space of variants
evolution
variability
url https://vavilov.elpub.ru/jour/article/view/3984
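The description mentions software that loads FASTA sequences, locates unidentified nucleotides ("N"), and determines mutations against a reference. The following is only a minimal sketch of those three steps, not the authors' program: the function names `read_fasta`, `n_runs` and `substitutions` are hypothetical, the demo sequences are invented toy data, and the comparison assumes the two sequences are already aligned and of equal length (the article instead calls an external alignment program, which is not reproduced here).

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks).upper()
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks).upper()


def n_runs(seq: str) -> List[Tuple[int, int]]:
    """Return (0-based start, length) for every maximal run of 'N' in seq."""
    runs, start = [], None
    for i, base in enumerate(seq):
        if base == "N":
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(seq) - start))
    return runs


def substitutions(ref: str, sample: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, ref base, sample base) for mismatches.

    Assumes ref and sample are pre-aligned and equal in length; positions
    where either sequence has 'N' or a gap '-' are skipped.
    """
    if len(ref) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(ref, sample))
        if r != s and r not in "N-" and s not in "N-"
    ]


# Toy data (invented for illustration, not real SARS-CoV-2 sequence).
demo = ">ref\nACGTACGTAC\n>sample\nACNNACTTAC\n"
records = dict(read_fasta(io.StringIO(demo)))
print(n_runs(records["sample"]))                          # [(2, 2)]
print(substitutions(records["ref"], records["sample"]))   # [(7, 'G', 'T')]
```

Skipping positions with "N" avoids counting unsequenced bases as mutations; insertions and deletions would need a real alignment step and are outside this sketch.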