Abstract
Unprecedented sequencing efforts have, as of October 2020, produced nearly 200,000 genomes of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus responsible for COVID-19. Understanding the trends in SARS-CoV-2 evolution is paramount to controlling the pandemic, but analysis of this enormous dataset is a major challenge. We show that the ongoing evolution of SARS-CoV-2 over the course of the pandemic is characterized primarily by purifying selection, but a small set of sites, including spike 614 and nucleocapsid 203-204, appear to evolve under positive selection. In addition to the substitutions in the spike protein, multiple substitutions in the nucleocapsid protein appear to be important for SARS-CoV-2 adaptation to the human host. The positively selected mutations form a strongly connected network of apparent epistatic interactions and are signatures of major partitions in the SARS-CoV-2 phylogeny. These partitions show distinct spatial and temporal dynamics, with both globalization and diversification trends being apparent.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
This version of the manuscript has been revised to update the collection of SARS-CoV-2 genomes that were analyzed to identify the evolutionary regime of the virus, along with the global and regional evolutionary trends.