PT - JOURNAL ARTICLE
AU - Bernardo Gutierrez
AU - Hugo G. Castelán Sánchez
AU - Darlan da Silva Candido
AU - Ben Jackson
AU - Shay Fleishon
AU - Christopher Ruis
AU - Luis Delaye
AU - Andrew Rambaut
AU - Oliver G. Pybus
AU - Marina Escalera-Zamudio
TI - Emergence and widespread circulation of a recombinant SARS-CoV-2 lineage in North America
AID - 10.1101/2021.11.19.21266601
DP - 2021 Jan 01
TA - medRxiv
PG - 2021.11.19.21266601
4099 - http://medrxiv.org/content/early/2021/11/21/2021.11.19.21266601.short
4100 - http://medrxiv.org/content/early/2021/11/21/2021.11.19.21266601.full
AB - Genetic recombination is an important driving force of coronavirus evolution. While some degree of virus recombination has been reported during the COVID-19 pandemic, previously detected recombinant lineages of SARS-CoV-2 have shown limited circulation and been observed only in restricted areas. Prompted by reports of unusual genetic similarities among several Pango lineages detected mainly in North and Central America, we present a detailed phylogenetic analysis of four SARS-CoV-2 lineages (B.1.627, B.1.628, B.1.631 and B.1.634) in order to investigate the possibility of virus recombination among them. Two of these lineages, B.1.628 and B.1.631, are split into two distinct clusters (here named major and minor). Our phylogenetic and recombination analyses of these lineages find well-supported phylogenetic differences between the Orf1ab region and the rest of the genome (S protein and remaining reading frames). The lineages also contain several deletions in the NSP6, Orf3a and S proteins that can augment reconstruction of reliable evolutionary histories. By reconciling the deletions and phylogenetic data, we conclude that the B.1.628 major cluster originated from a recombination event between a B.1.631 major virus and a lineage B.1.634 virus.
This scenario inferred from genetic data is supported by the spatial and temporal distribution of the three lineages, which all co-circulated in the USA and Mexico during 2021, suggesting this region is where the recombination event took place. We therefore support the designation of the B.1.628 major cluster as recombinant lineage XB in the Pango nomenclature. The widespread circulation of lineage XB across multiple countries over a longer timespan than the previously designated recombinant XA lineage raises important questions regarding the role and potential effects of recombination on the evolution of SARS-CoV-2 during the ongoing COVID-19 pandemic.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This work was supported through the "Vigilancia Genomica del Virus SARS-CoV-2 en Mexico" grant from the National Council for Science and Technology-Mexico (CONACyT), by the Leverhulme Trust ECR Fellowship ECF-2019-542, the Secretariat for Higher Education, Science, Technology, and Innovation of the Republic of Ecuador, and the Oxford Martin School.
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
Viral genome sequences are publicly available from GISAID (https://www.gisaid.org). Code used for generating analyses and figures in this study is available on GitHub (https://github.com/BernardoGG/XB_lineage_investigation).