TY - JOUR
T1 - Algorithmic discovery of dynamic models from infectious disease data
JF - medRxiv
DO - 10.1101/19012724
SP - 19012724
AU - Jonathan Horrocks
AU - Chris T. Bauch
Y1 - 2020/01/01
UR - http://medrxiv.org/content/early/2020/03/19/19012724.abstract
N2 - Theoretical models are typically developed through a deductive process where a researcher formulates a system of dynamic equations from hypothesized mechanisms. Recent advances in algorithmic methods can discover dynamic models inductively, directly from data. Most previous research has tested these methods by rediscovering models from synthetic data generated by an already known model. Here we apply Sparse Identification of Nonlinear Dynamics (SINDy) to discover mechanistic equations for disease dynamics from case notification data for measles, chickenpox, and rubella. The discovered models provide a good qualitative fit to the observed dynamics for all three diseases. However, the SINDy chickenpox model appears to overfit the empirical data, and recovering qualitatively correct rubella dynamics requires using power spectral density in the goodness-of-fit criterion. When SINDy uses a library of second-order functions, the discovered models tend to include mass action incidence and a seasonally varying transmission rate, a common feature of existing epidemiological models for childhood infectious diseases. We also find that the SINDy measles model is capable of out-of-sample prediction of a dynamical regime shift in measles case notification data. These results demonstrate the potential for algorithmic model discovery to enrich scientific understanding by providing a complementary approach to developing theoretical models. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This research was funded by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) to CTB.
The funders had no role in the work. Data Availability: The code used to generate the results is publicly available on GitHub (https://github.com/jonathanhorrocks/SINDy-data). The infectious disease data for measles, rubella and chickenpox can be obtained from the International Infectious Disease Data Archive (http://iidda.mcmaster.ca/).
ER -