PT - JOURNAL ARTICLE
AU - Emma B. Hodcroft
AU - Daryl B. Domman
AU - Daniel J. Snyder
AU - Kasopefoluwa Oguntuyo
AU - Maarten Van Diest
AU - Kenneth H. Densmore
AU - Kurt C. Schwalm
AU - Jon Femling
AU - Jennifer L. Carroll
AU - Rona S. Scott
AU - Martha M. Whyte
AU - Michael D. Edwards
AU - Noah C. Hull
AU - Christopher G. Kevil
AU - John A. Vanchiere
AU - Benhur Lee
AU - Darrell L. Dinwiddie
AU - Vaughn S. Cooper
AU - Jeremy P. Kamil
TI - Emergence in late 2020 of multiple lineages of SARS-CoV-2 Spike protein variants affecting amino acid position 677
AID - 10.1101/2021.02.12.21251658
DP - 2021 Jan 01
TA - medRxiv
PG - 2021.02.12.21251658
4099 - http://medrxiv.org/content/early/2021/02/14/2021.02.12.21251658.1.short
4100 - http://medrxiv.org/content/early/2021/02/14/2021.02.12.21251658.1.full
AB - The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein (S) plays critical roles in host cell entry. Non-synonymous substitutions affecting S are not uncommon and have become fixed in a number of SARS-CoV-2 lineages. A subset of such mutations enable escape from neutralizing antibodies or are thought to enhance transmission through mechanisms such as increased affinity for the cell entry receptor, angiotensin-converting enzyme 2 (ACE2). Independent genomic surveillance programs based in New Mexico and Louisiana contemporaneously detected the rapid rise of numerous clade 20G (lineage B.1.2) infections carrying a Q677P substitution in S. The variant was first detected in the US on October 23, yet between 01 Dec 2020 and 19 Jan 2021 it rose to represent 27.8% and 11.3% of all SARS-CoV-2 genomes sequenced from Louisiana and New Mexico, respectively. Q677P cases have been detected predominantly in the south central and southwest United States; as of 03 Feb 2021, GISAID data show 499 viral sequences of this variant from the USA. 
Phylogenetic analyses revealed the independent evolution and spread of at least six distinct Q677H sub-lineages, with first collection dates ranging from mid-August to late November 2020. Four 677H clades from clade 20G (B.1.2), 20A (B.1.234), and 20B (B.1.1.220 and B.1.1.222) each contain roughly 100 or fewer sequenced cases, while a distinct pair of clade 20G clusters are represented by 754 and 298 cases, respectively. Although sampling bias and founder effects may have contributed to the rise of S:677 polymorphic variants, the proximity of this position to the polybasic cleavage site at the S1/S2 boundary is consistent with its potential functional relevance during cell entry, suggesting parallel evolution of a trait that may confer an advantage in spread or transmission. Taken together, our findings demonstrate simultaneous convergent evolution, thus providing an impetus to further evaluate S:677 polymorphisms for effects on proteolytic processing, cell tropism, and transmissibility. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This work was supported by a COVID-19 Fast Grants award from Emergent Ventures, an initiative of the Mercatus Center at George Mason University (J.P.K.); by an intramural grant and other funding from the Office of the Vice Chancellor for Research at LSU Health Sciences Center Shreveport (J.P.K., R.S.S., J.A.V.); by an Institutional Development Award from the National Institute of General Medical Sciences of the NIH under grant number P20 GM121307 (C.G.K.); by the Swiss National Science Foundation through grant number 31CA30196046 (E.B.H.); and by the U.S. National Center for Research Resources and the National Center for Advancing Translational Sciences of the National Institutes of Health through grant number UL1TR001449 (D.L.D., D.B.D.), a KL2 Mentored Career Development Award KL2R001448 (D.B.D.), and a Translational and Clinical Pilot Project Award CTSC008-11 (D.B.D.). 
We would like to thank the UNM Center for Advanced Research Computing, supported in part by the National Science Foundation, for providing the high performance computing resources used in this work. The authors have not received payment or services from any third parties for any aspect of the submitted work, other than the grant funding explicitly listed above. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: SARS-CoV-2 genome sequences were generated under IRB approved protocols STUDY00001445 (LSU Health Sciences Center), and 14-039 and 20-151 (University of New Mexico Health Sciences Center). All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. All data for this manuscript is made available via websites and in publicly accessible databases. A time-stamped version of the Nextstrain build used in this manuscript can be found here: https://nextstrain.org/groups/neherlab/ncov/S.Q677/2020-02-04, and the most recently updated version of this build can be viewed here: https://nextstrain.org/groups/neherlab/ncov/S.Q677. 