RT Journal Article
SR Electronic
T1 Analysis of 2.1 million SARS-CoV-2 genomes identifies mutations associated with transmissibility
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2021.09.07.21263228
DO 10.1101/2021.09.07.21263228
A1 Fritz Obermeyer
A1 Stephen F. Schaffner
A1 Martin Jankowiak
A1 Nikolaos Barkas
A1 Jesse D. Pyle
A1 Daniel J. Park
A1 Bronwyn L. MacInnis
A1 Jeremy Luban
A1 Pardis C. Sabeti
A1 Jacob E. Lemieux
YR 2021
UL http://medrxiv.org/content/early/2021/09/13/2021.09.07.21263228.abstract
AB Repeated emergence of SARS-CoV-2 variants with increased transmissibility necessitates rapid detection and characterization of new lineages. To address this need, we developed PyR0, a hierarchical Bayesian multinomial logistic regression model that infers relative transmissibility of all viral lineages across geographic regions, detects lineages increasing in prevalence, and identifies mutations relevant to transmissibility. Applying PyR0 to all publicly available SARS-CoV-2 genomes, we identify numerous substitutions that increase transmissibility, including previously identified spike mutations and many non-spike mutations within the nucleocapsid and nonstructural proteins. PyR0 forecasts growth of new lineages from their mutational profile, identifies viral lineages of concern as they emerge, and prioritizes mutations of biological and public health concern for functional characterization. One Sentence Summary: A Bayesian hierarchical model of all viral genomes predicts lineage transmissibility and identifies associated mutations. Competing Interest Statement: The authors have declared no competing interest. Clinical Trial: Study is based on SARS-CoV-2 genetic sequences publicly available at GISAID.org. Clinical Protocols: https://github.com/broadinstitute/pyro-cov Funding Statement: This work was sponsored by the U.S.
Centers for Disease Control and Prevention (BAA), as well as support from the Doris Duke Charitable Foundation (J.E.L.), the Howard Hughes Medical Institute (P.C.S.), and the Evergrande COVID-19 Response Fund Award from the Massachusetts Consortium on Pathogen Readiness (J.L.). Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study was conducted using data from a public database (GISAID). No IRB approval is necessary. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. All data was gathered from other public resources. Data preprocessing scripts are open source. https://gisaid.org https://github.com/CSSEGISandData/COVID-19 https://cov-lineages.org/