medRxiv

Missing data matters in participatory syndromic surveillance systems: comparative evaluation of missing data methods when estimating disease burden

Kristin Baltrusaitis, Craig Dalton, Sandra Carlson, Laura F. White
doi: https://doi.org/10.1101/2021.05.11.21256420
Kristin Baltrusaitis
1 Center for Biostatistics in AIDS Research, Harvard T.H. Chan School of Public Health, Boston, United States of America
  • For correspondence: kbaltrus@sdac.harvard.edu
Craig Dalton
2 Hunter New England Population Health, Wallsend, Australia
3 Hunter Medical Research Institute, Newcastle, Australia
Sandra Carlson
2 Hunter New England Population Health, Wallsend, Australia
Laura F. White
4 Department of Biostatistics, Boston University School of Public Health, Boston University, Boston, United States of America

ABSTRACT

Introduction Traditional surveillance methods have been enhanced by the emergence of online participatory syndromic surveillance systems that collect health-related digital data. These systems have many applications, including tracking weekly prevalence of Influenza-Like Illness (ILI), predicting probable Coronavirus Disease 2019 (COVID-19) infection, and determining risk factors for ILI and COVID-19. However, not every volunteer consistently completes surveys. In this study, we assess how different missing data methods affect estimates of ILI burden using data from FluTracking, a participatory surveillance system in Australia.

Methods We estimate the incidence rate, the incidence proportion, and weekly prevalence using five missing data methods: available case, complete case, assume missing is non-ILI, multiple imputation (MI), and delta (δ) MI, which is a flexible and transparent method to impute missing data under Missing Not at Random (MNAR) assumptions. We evaluate these methods using simulated and FluTracking data.
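
The three simpler methods named above can be illustrated with a minimal sketch for weekly prevalence. The abstract does not define the methods formally, so the definitions here are assumptions: available case uses everyone who responded in a given week, complete case keeps only participants who responded every week, and the third method recodes every missing report as non-ILI. The data and the function name `weekly_prevalence` are hypothetical, not taken from FluTracking or the authors' code.

```python
# Hypothetical toy data: one row per participant, one entry per week.
# 1 = reported ILI, 0 = reported no ILI, None = survey not completed.
reports = [
    [0, 1, 0, 0],
    [0, None, None, 1],
    [1, 0, None, 0],
    [None, None, 0, 0],
    [0, 0, 1, 0],
]


def weekly_prevalence(rows, method):
    """Return the ILI proportion for each week under an assumed method definition."""
    n_weeks = len(rows[0])
    if method == "complete_case":
        # Keep only participants with a report in every week.
        rows = [r for r in rows if None not in r]
    elif method == "missing_as_non_ili":
        # Recode every missing report as "no ILI".
        rows = [[0 if v is None else v for v in r] for r in rows]
    elif method != "available_case":
        raise ValueError(f"unknown method: {method}")

    result = []
    for week in range(n_weeks):
        # Use whoever has a (possibly recoded) report in this week.
        observed = [r[week] for r in rows if r[week] is not None]
        result.append(sum(observed) / len(observed) if observed else float("nan"))
    return result


for name in ("available_case", "complete_case", "missing_as_non_ili"):
    print(name, weekly_prevalence(reports, name))
```

Even on this toy data the three methods disagree in most weeks, which is the kind of divergence the comparison is designed to measure.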

Results Our simulations show that the optimal missing data method depends on the measure of ILI burden and the underlying missingness model. Of note, the δ-MI method provides estimates of ILI burden that are similar to the true parameter under MNAR models. When we apply these methods to FluTracking, we find that the δ-MI method accurately predicts complete, end-of-season weekly prevalence estimates from real-time data.
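
To make the role of δ concrete, the following is a deliberately simplified sketch of delta-adjusted imputation for a single week, not the authors' implementation. It assumes δ is a shift on the log-odds of ILI among non-responders, imputes each missing report as a Bernoulli draw, and averages the prevalence over the imputed datasets. A full MI procedure would also draw the model parameters, condition on covariates, and pool variances with Rubin's rules. The function name `delta_mi_prevalence`, its parameters, and the data are hypothetical.

```python
import math
import random


def delta_mi_prevalence(week, delta, n_imputations=20, seed=0):
    """Pooled prevalence for one week after delta-adjusted imputation (sketch)."""
    rng = random.Random(seed)
    observed = [v for v in week if v is not None]
    n_missing = len(week) - len(observed)

    # Clamp the observed proportion away from 0 and 1 so the logit is finite.
    p_obs = min(max(sum(observed) / len(observed), 1e-6), 1 - 1e-6)

    # Assumption: non-responders' log-odds of ILI differ from responders' by delta.
    shifted = math.log(p_obs / (1 - p_obs)) + delta
    p_missing = 1 / (1 + math.exp(-shifted))

    estimates = []
    for _ in range(n_imputations):
        # Impute each missing report as a Bernoulli(p_missing) draw.
        imputed = sum(rng.random() < p_missing for _ in range(n_missing))
        estimates.append((sum(observed) + imputed) / len(week))

    # Pool the point estimates by averaging across imputations.
    return sum(estimates) / len(estimates)


# Hypothetical week: 1 = ILI, 0 = no ILI, None = missing report.
week = [0, 1, None, 0, None, 0, 1, 0]
for delta in (-2.0, 0.0, 2.0):
    print(delta, delta_mi_prevalence(week, delta))
```

In this sketch δ = 0 imputes from the responders' own proportion, which is a MAR-style assumption, while a very negative δ reproduces the "assume missing is non-ILI" estimate. Varying δ therefore gives a transparent sensitivity analysis across MNAR assumptions.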

Conclusion Missing data is an important problem in participatory surveillance systems. Here, we show that accounting for missingness using statistical approaches leads to different inferences from the data.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

The FluTracking surveillance system is funded by the Australian Government Department of Health.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

FluTracking was approved by the Hunter New England Human Research Ethics Committee on 13 April 2006 (06/04/22/4.03). On 2 October 2009, FluTracking was incorporated into routine national influenza surveillance, so ethics approval was no longer required and the approval was considered closed.

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

Contact Flutracking@flutracking.net to request data. Data requests will be actioned within resource constraints.

ABBREVIATIONS

AU: Australia
CI: Confidence Interval
COVID-19: Coronavirus Disease 2019
FNY: Flu Near You
ILI: Influenza-Like Illness
IP: Incidence Proportion
IR: Incidence Rate
MAR: Missing at Random
MCAR: Missing Completely at Random
MI: Multiple Imputation
MICE: Multivariate Imputation by Chained Equations
MMWR: Morbidity and Mortality Weekly Report
MNAR: Missing Not at Random
NRMSE: Normalized Root Mean Square Error
US: United States of America
WHO: World Health Organization
WP: Weekly Prevalence

Copyright

The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Posted May 16, 2021.

Subject Area

Infectious Diseases (except HIV/AIDS)