TY - JOUR
T1 - Landscape of SARS-CoV-2 genomic surveillance, public availability extent of genomic data, and epidemic shaped by variants: a global descriptive study
JF - medRxiv
DO - 10.1101/2021.09.06.21263152
SP - 2021.09.06.21263152
AU - Zhiyuan Chen
AU - Andrew S. Azman
AU - Xinhua Chen
AU - Junyi Zou
AU - Yuyang Tian
AU - Ruijia Sun
AU - Xiangyanyu Xu
AU - Yani Wu
AU - Wanying Lu
AU - Shijia Ge
AU - Zeyao Zhao
AU - Juan Yang
AU - Daniel T. Leung
AU - Daryl B. Domman
AU - Hongjie Yu
Y1 - 2021/01/01
UR - http://medrxiv.org/content/early/2021/09/08/2021.09.06.21263152.abstract
N2 - Background: Genomic surveillance has shaped our understanding of SARS-CoV-2 variants, which have proliferated globally in 2021. Characterizing global genomic surveillance, sequencing coverage, and the extent of publicly available genomic data, coupled with traditional epidemiologic data, can provide evidence to inform SARS-CoV-2 surveillance and control strategies. Methods: We collected country-specific data on SARS-CoV-2 genomic surveillance, sequencing capabilities, public genomic data, and aggregated publicly available variant data. We divided countries into three levels of genomic surveillance and sequencing availability based on predefined criteria. We downloaded the merged and deduplicated SARS-CoV-2 sequences from multiple public repositories, and used different proxies to estimate the sequencing coverage and public availability extent of genomic data, in addition to describing the global dissemination of variants. Findings: Since the start of 2021, the COVID-19 global epidemic clearly featured increasing circulation of Alpha, which was rapidly replaced by the Delta variant starting around May 2021 and reaching a global prevalence of 96.6% at the end of July 2021. SARS-CoV-2 genomic surveillance and sequencing availability varied markedly across countries, with 63 countries performing routine genomic surveillance and 79 countries with high availability of SARS-CoV-2 sequencing.
Less than 3.5% of confirmed SARS-CoV-2 infections were sequenced globally since September 2020, with the lowest sequencing coverage in the WHO regions of Eastern Mediterranean, South East Asia, and Africa. Across different variants, 28-52% of countries with explicit reporting on variants shared less than half of their variant sequences in public repositories. More than 60% of demographic and 95% of clinical data were absent in GISAID metadata accompanying sequences. Interpretation: Our findings indicated an urgent need to expand sequencing capacity for virus isolates, enhance the sharing of sequences, standardize metadata files, and establish supportive networks for countries with no sequencing capability. Evidence before this study: On September 3, 2021, we searched PubMed for articles in any language published after January 1, 2020, using the following search terms: (“COVID-19” OR “SARS-CoV-2”) AND (“Global” OR “Region”) AND (“genomic surveillance” OR “sequencing” OR “spread”). Among the 43 papers identified, few discussed the global diversity in genomic surveillance, sequencing, and public availability of genomic data, or the global spread of SARS-CoV-2 variants. A paper by Furuse used publicly available GISAID data to evaluate the SARS-CoV-2 sequencing effort by country from the perspectives of “fraction”, “timeliness”, and “openness”. A viewpoint paper by a Case Western Reserve University team discussed the impediments to genomic surveillance in several countries during the COVID-19 pandemic. A paper by Campbell and colleagues used GISAID data to present the global spread and estimated transmissibility of recently emerged SARS-CoV-2 variants. We also found several studies that reported country-level genomic surveillance and spread of variants.
To our knowledge, no research has quantitatively depicted global SARS-CoV-2 genomic surveillance, sequencing ability, and the public availability extent of genomic data. Added value of this study: This study collected country-specific data on SARS-CoV-2 genomic surveillance, sequencing capabilities, public genomic data, and aggregated publicly available variant data as of 20 August 2021. We found that genomic surveillance strategies and sequencing availability are globally diverse. Less than 3.5% of confirmed SARS-CoV-2 infections were sequenced globally since September 2020. Our analysis of publicly deposited SARS-CoV-2 sequences and officially reported numbers of variants implied that the public availability extent of genomic data is low in some countries, and more than 60% of demographic and 95% of clinical data were absent in GISAID metadata accompanying sequences. We also described the pandemic dynamics shaped by VOCs. Implications of all the available evidence: Our study provides a landscape of global sequencing coverage and the public availability extent of sequences, as well as evidence for the rapid spread of SARS-CoV-2 variants. The pervasive spread of the Alpha and Delta variants further highlights the threat of SARS-CoV-2 mutations despite the availability of vaccines in many countries. This raises an urgent need to do more work on defining the ideal sampling schemes for different purposes (e.g., identifying new variants), with an additional call to share these data in public repositories to allow for further rapid scientific discovery. Competing Interest Statement: H.Y. has received research funding from Sanofi Pasteur, GlaxoSmithKline, Yichang HEC Changjiang Pharmaceutical Company, and Shanghai Roche Pharmaceutical Company. None of this research funding is related to COVID-19. All other authors report no competing interests. Funding Statement: This study was funded by the Key Program of the National Natural Science Foundation of China (grant no.
82130093 to HJY) and the US National Institutes of Health (R01 AI135115 to DTL and ASA; KL2TR001448 to DD). Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: No ethics body is needed in this study. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. All datasets generated and analysed are available in the Article and Appendix. Any additional information is available from the lead contact upon request.
ER -