TY - JOUR
T1 - Demonstrating an approach for evaluating synthetic geospatial and temporal epidemiologic data utility: Results from analyzing >1.8 million SARS-CoV-2 tests in the United States National COVID Cohort Collaborative (N3C)
JF - medRxiv
DO - 10.1101/2021.07.06.21259051
SP - 2021.07.06.21259051
AU - Jason A. Thomas
AU - Randi E. Foraker
AU - Noa Zamstein
AU - Philip R.O. Payne
AU - Adam B. Wilcox
AU - the N3C Consortium
Y1 - 2021/01/01
UR - http://medrxiv.org/content/early/2021/07/08/2021.07.06.21259051.abstract
N2 - Objective: To evaluate whether synthetic data derived from a national COVID-19 data set could be used for geospatial and temporal epidemic analyses. Materials and Methods: Using an original data set (n=1,854,968 SARS-CoV-2 tests) and its synthetic derivative, we compared key indicators of COVID-19 community spread through analysis of aggregate and zip-code level epidemic curves, patient characteristics and outcomes, distribution of tests by zip code, and indicator counts stratified by month and zip code. Similarity between the data was statistically and qualitatively evaluated. Results: In general, synthetic data closely matched original data for epidemic curves, patient characteristics, and outcomes. Synthetic data suppressed labels of zip codes with few total tests (mean=2.9±2.4; max=16 tests; 66% reduction of unique zip codes). Epidemic curves and monthly indicator counts were similar between synthetic and original data in a random sample of the most tested (top 1%; n=171) and for all unsuppressed zip codes (n=5,819), respectively. In small sample sizes, synthetic data utility was notably decreased. Discussion: Analyses on the population-level and of densely-tested zip codes (which contained most of the data) were similar between original and synthetically-derived data sets. 
Analyses of sparsely-tested populations were less similar and had more data suppression. Conclusion: In general, synthetic data were successfully used to analyze geospatial and temporal trends. Analyses using small sample sizes or populations were limited, in part due to purposeful data label suppression, an attribute disclosure countermeasure. Users should consider data fitness for use in these cases. Competing Interest Statement: All authors have completed the ICMJE uniform disclosure form at http://www.icmje.org/downloads/coi_disclosure.docx and declare: Authors JAT, ABW, RF, and PP received financial support from the National Center for Advancing Translational Sciences, National Institutes of Health, through grant number U24TR002306 disbursed to their affiliated institutions for the submitted work; author NZ is an employee of MDClone; this manuscript underwent National Covid Cohort Collaborative (N3C) publication review described at https://covid.cd2h.org/publication-review; the institution RF and PP are affiliated with (Washington University in St. Louis) is a customer of MDClone; all authors declare no other relationships or activities that could appear to have influenced the submitted work. Funding Statement: Authors JAT, ABW, RF, and PP received financial support from the National Center for Advancing Translational Sciences, National Institutes of Health, through grant number U24TR002306 disbursed to their affiliated institutions for the submitted work. The analyses described in this publication were conducted with data or tools accessed through the NCATS N3C Data Enclave covid.cd2h.org/enclave and supported by NCATS U24 TR002306. 
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: This study was approved by the Washington University and University of Washington Internal Review Boards. The N3C data transfer to NCATS is performed under a Johns Hopkins University Reliance Protocol # IRB00249128 or individual site agreements with NIH. The N3C Data Enclave is managed under the authority of the NIH; information can be found at https://ncats.nih.gov/n3c/resources. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The entirety of code used in this analysis is contained within a single Palantir Foundry Code Workbook using a saved Spark environment to preserve required software versions and dependencies. The code workbook and source data have been stored within the National Covid Cohort Collaborative (N3C) enclave (https://covid.cd2h.org/enclave) so that they may inform and be reused in future validation work. To view National Covid Cohort Collaborative (N3C) Data Enclave & Data Access Requirements, please navigate to the N3C website. 
Data was provided from the following institutions: Stony Brook University - U24TR002306, University of Oklahoma Health Sciences Center - U54GM104938: Oklahoma Clinical and Translational Science Institute (OCTSI), West Virginia University - U54GM104942: West Virginia Clinical and Translational Science Institute (WVCTSI), University of Mississippi Medical Center - U54GM115428: Mississippi Center for Clinical and Translational Research (CCTR), University of Nebraska Medical Center - U54GM115458: Great Plains IDeA-Clinical & Translational Research, Maine Medical Center - U54GM115516: Northern New England Clinical & Translational Research (NNE-CTR) Network, Wake Forest University Health Sciences - UL1TR001420: Wake Forest Clinical and Translational Science Institute, Northwestern University at Chicago - UL1TR001422: Northwestern University Clinical and Translational Science Institute (NUCATS), University of Cincinnati - UL1TR001425: Center for Clinical and Translational Science and Training, The University of Texas Medical Branch at Galveston - UL1TR001439: The Institute for Translational Sciences, Medical University of South Carolina - UL1TR001450: South Carolina Clinical & Translational Research Institute (SCTR), University of Massachusetts Medical School Worcester - UL1TR001453: The UMass Center for Clinical and Translational Science (UMCCTS), University of Southern California - UL1TR001855: The Southern California Clinical and Translational Science Institute (SC CTSI), Columbia University Irving Medical Center - UL1TR001873: Irving Institute for Clinical and Translational Research, George Washington Children's Research Institute - UL1TR001876: Clinical and Translational Science Institute at Children's National (CTSA-CN), University of Kentucky - UL1TR001998: UK Center for Clinical and Translational Science, University of Rochester - UL1TR002001: UR Clinical & Translational Science Institute, University of Illinois at Chicago - UL1TR002003: UIC Center for Clinical 
and Translational Science, Penn State Health Milton S. Hershey Medical Center - UL1TR002014: Penn State Clinical and Translational Science Institute, The University of Michigan at Ann Arbor - UL1TR002240: Michigan Institute for Clinical and Health Research, Vanderbilt University Medical Center - UL1TR002243: Vanderbilt Institute for Clinical and Translational Research, University of Washington - UL1TR002319: Institute of Translational Health Sciences, Washington University in St. Louis - UL1TR002345: Institute of Clinical and Translational Sciences, Oregon Health & Science University - UL1TR002369: Oregon Clinical and Translational Research Institute, University of Wisconsin-Madison - UL1TR002373: UW Institute for Clinical and Translational Research, Rush University Medical Center - UL1TR002389: The Institute for Translational Medicine (ITM), The University of Chicago - UL1TR002389: The Institute for Translational Medicine (ITM), University of North Carolina at Chapel Hill - UL1TR002489: North Carolina Translational and Clinical Science Institute, University of Minnesota - UL1TR002494: Clinical and Translational Science Institute, Children's Hospital Colorado - UL1TR002535: Colorado Clinical and Translational Sciences Institute, The University of Iowa - UL1TR002537: Institute for Clinical and Translational Science, The University of Utah - UL1TR002538: Uhealth Center for Clinical and Translational Science, Tufts Medical Center - UL1TR002544: Tufts Clinical and Translational Science Institute, Duke University - UL1TR002553: Duke Clinical and Translational Science Institute, Virginia Commonwealth University - UL1TR002649: C. Kenneth and Dianne Wright Center for Clinical and Translational Research, The Ohio State University - UL1TR002733: Center for Clinical and Translational Science, The University of Miami Leonard M. 
Miller School of Medicine - UL1TR002736: University of Miami Clinical and Translational Science Institute, University of Virginia - UL1TR003015: iTHRIV Integrated Translational Health Research Institute of Virginia, Carilion Clinic - UL1TR003015: iTHRIV Integrated Translational Health Research Institute of Virginia, University of Alabama at Birmingham - UL1TR003096: Center for Clinical and Translational Science, Johns Hopkins University - UL1TR003098: Johns Hopkins Institute for Clinical and Translational Research, University of Arkansas for Medical Sciences - UL1TR003107: UAMS Translational Research Institute, Nemours - U54GM104941: Delaware CTR ACCEL Program, University Medical Center New Orleans - U54GM104940: Louisiana Clinical and Translational Science (LA CaTS) Center, University of Colorado Denver, Anschutz Medical Campus - UL1TR002535: Colorado Clinical and Translational Sciences Institute, Mayo Clinic Rochester - UL1TR002377: Mayo Clinic Center for Clinical and Translational Science (CCaTS), Tulane University - UL1TR003096: Center for Clinical and Translational Science, Loyola University Medical Center - UL1TR002389: The Institute for Translational Medicine (ITM), Advocate Health Care Network - UL1TR002389: The Institute for Translational Medicine (ITM), OCHIN - INV-018455: Bill and Melinda Gates Foundation grant to Sage Bionetworks. Additional data partners who have signed DTA and data release pending: The Rockefeller University - UL1TR001866: Center for Clinical and Translational Science, The Scripps Research Institute - UL1TR002550: Scripps Research Translational Institute, University of Texas Health Science Center at San Antonio - UL1TR002645: Institute for Integration of Medicine and Science, The University of Texas Health Science Center at Houston - UL1TR003167: Center for Clinical and Translational Sciences (CCTS), NorthShore University HealthSystem - UL1TR002389: The Institute for Translational Medicine (ITM), Yale New Haven Hospital - 
UL1TR001863: Yale Center for Clinical Investigation, Emory University - UL1TR002378: Georgia Clinical and Translational Science Alliance, Weill Medical College of Cornell University - UL1TR002384: Weill Cornell Medicine Clinical and Translational Science Center, Montefiore Medical Center - UL1TR002556: Institute for Clinical and Translational Research at Einstein and Montefiore, Medical College of Wisconsin - UL1TR001436: Clinical and Translational Science Institute of Southeast Wisconsin, University of New Mexico Health Sciences Center - UL1TR001449: University of New Mexico Clinical and Translational Science Center, George Washington University - UL1TR001876: Clinical and Translational Science Institute at Children's National (CTSA-CN), Stanford University - UL1TR003142: Spectrum: The Stanford Center for Clinical and Translational Research and Education, Regenstrief Institute - UL1TR002529: Indiana Clinical and Translational Science Institute, Cincinnati Children's Hospital Medical Center - UL1TR001425: Center for Clinical and Translational Science and Training, Boston University Medical Campus - UL1TR001430: Boston University Clinical and Translational Science Institute, The State University of New York at Buffalo - UL1TR001412: Clinical and Translational Science Institute, Aurora Health Care - UL1TR002373: Wisconsin Network For Health Research, Brown University - U54GM115677: Advance Clinical Translational Research (Advance-CTR), Rutgers, The State University of New Jersey - UL1TR003017: New Jersey Alliance for Clinical and Translational Science, Loyola University Chicago - UL1TR002389: The Institute for Translational Medicine (ITM), New York University - UL1TR001445: Langone Health's Clinical and Translational Science Institute, Children's Hospital of Philadelphia - UL1TR001878: Institute for Translational Medicine and Therapeutics, University of Kansas Medical Center - UL1TR002366: Frontiers: University of Kansas Clinical and Translational Science Institute, 
Massachusetts General Brigham - UL1TR002541: Harvard Catalyst, Icahn School of Medicine at Mount Sinai - UL1TR001433: ConduITS Institute for Translational Sciences, Ochsner Medical Center - U54GM104940: Louisiana Clinical and Translational Science (LA CaTS) Center, HonorHealth - None (Voluntary), University of California, Irvine - UL1TR001414: The UC Irvine Institute for Clinical and Translational Science (ICTS), University of California, San Diego - UL1TR001442: Altman Clinical and Translational Research Institute, University of California, Davis - UL1TR001860: UCDavis Health Clinical and Translational Science Center, University of California, San Francisco - UL1TR001872: UCSF Clinical and Translational Science Institute, University of California, Los Angeles - UL1TR001881: UCLA Clinical Translational Science Institute, University of Vermont - U54GM115516: Northern New England Clinical & Translational Research (NNE-CTR) Network, Arkansas Children's Hospital - UL1TR003107: UAMS Translational Research Institute. https://covid.cd2h.org/enclave https://covid.cd2h.org/enclave-checklist
ER -