Abstract
The emergence of a heavily mutated SARS-CoV-2 variant (Omicron; B.1.1.529/BA.1/BA.2) and its rapid spread globally created public health alarms. Characterizing the mutational profile of Omicron is necessary to interpret its shared or distinctive clinical phenotypes with other SARS-CoV-2 variants. We compared the mutations of Omicron with prior variants of concern (Alpha, Beta, Gamma, Delta), variants of interest (Lambda, Mu, Eta, Iota and Kappa), and ∼1500 SARS-CoV-2 lineages constituting ∼5.8 million SARS-CoV-2 genomes. Omicron’s Spike protein has 26 amino acid mutations (23 substitutions, two deletions and one insertion) that are distinct compared to other variants of concern. Whereas the substitution and deletion mutations have appeared in previous SARS-CoV-2 lineages, the insertion mutation (ins214EPE) has not been previously observed in any other SARS-CoV-2 lineage. Here, we discuss various mechanisms through which the nucleotide sequence encoding for ins214EPE could have been acquired and highlight the plausibility of template switching via either the human transcriptome or prior viral genomes. Analysis of homology of the inserted nucleotide sequence and flanking regions suggests that this template switching event could have involved the genomes of SARS-CoV-2 variants (e.g. B.1.1 strain), other human coronaviruses that infect the same host cells as SARS-CoV-2 (e.g. HCoV-OC43 or HCoV-229E), or a human transcript expressed in a host cell that was infected by the Omicron precursor. Whether ins214EPE impacts the epidemiological or clinical properties of Omicron (e.g. transmissibility) warrants further investigation. There is also a need to understand whether human host cells are being exploited by SARS-CoV-2 as an ‘evolutionary sandbox’ for inter-viral or host-virus genomic interplay to produce new SARS-CoV-2 variants.
Competing Interest Statement
AJV, PA, PJL, RS, MJMN and VS are employees of nference and have financial interests in the company. nference collaborates with bio-pharmaceutical, medical device, diagnostics and software companies, as well as public health agencies, academic medical centers and health systems, on data science initiatives unrelated to this study. These collaborations had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Funding Statement
This study was self-funded by nference. No external funding was received for this study.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All data produced in the present work are contained in the manuscript and accessible via the GISAID and NCBI resources.