Abstract
Nearly two decades after the last epidemic caused by a severe acute respiratory syndrome coronavirus (SARS-CoV), the newly emerged SARS-CoV-2 spread quickly in 2020 and precipitated an ongoing global public health crisis. The continuous accumulation of point mutations, owing to the naturally imposed genomic plasticity of SARS-CoV-2 evolutionary processes, together with viral spread over time, allows this RNA virus to gain new genetic identities, spawn novel variants and enhance its potential for immune evasion. Here, through an in-depth phylogenetic clustering analysis of more than 200,000 whole-genome sequences, we reveal previously unreported mutations and recombination breakpoints in Variants of Concern (VOC) and Variants of Interest (VOI) from Brazil, India (Beta, Eta and Kappa) and the USA (Beta, Eta and Lambda). Additionally, we identify sites with shared mutations under directional evolution in the SARS-CoV-2 Spike protein of VOC and VOI, and trace a previously undescribed correlation with viral spread in South America, India and the USA. Our analysis provides well-supported evidence of similar evolutionary pathways for such mutations in all SARS-CoV-2 variants and sub-lineages. This raises two pivotal points: (i) the co-circulation of variants and sub-lineages in close evolutionary environments, which sheds light on their trajectories toward convergent and directional evolution, and (ii) a linear perspective on the prospective vaccine efficacy against different SARS-CoV-2 strains.
Author summary
In this study, through analysis of robust and comprehensive datasets, we identify a plethora of mutations in the SARS-CoV-2 Spike cell surface protein of several variants of concern and multiple variants of interest. We trace an association of such mutations with viral spread in different countries. We further infer the presence of new SARS-CoV-2 sub-lineages and show that the vast majority of mutations identified in the SARS-CoV-2 Spike protein are under convergent evolution. If we consider every color of a Rubik’s cube’s face to represent a different mutation of a particular variant, evolutionary convergence can be achieved only when all composite pieces of a single face are of the same color and every face has one unique color. Overall, this raises two important points: we provide insight into the presence of SARS-CoV-2 variants and sub-lineages circulating in very close evolutionary environments, and our analyses can inform an outlook on the prospective vaccine efficacy against different SARS-CoV-2 strains.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work was supported by the Fundacao de Amparo a Pesquisa do Estado de Sao Paulo (FAPESP), Brazil, grants 2019/01255-9 and 2021/03684-4 (Young Investigator Program) (RD-C), and by the Conselho Nacional de Desenvolvimento Cientifico e Tecnologico (CNPq), Brazil, grant 405691/2018-1 (CTB). DRN is the recipient of an institutional scholarship from the Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior (CAPES), Brazil, grant 88887.5062234/2020-00.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study involved only viral sequences from public databases.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All data are publicly available at https://github.com/rosadanilo/SARS-CoV-2_DEPS