Emergence of evidence during disease outbreaks: lessons learnt from the Zika virus outbreak

Introduction: Outbreaks of infectious diseases trigger an increase in scientific research and output. Early in outbreaks, evidence is scarce, but it accumulates rapidly. We are continuously facing new disease outbreaks, including the new coronavirus (SARS-CoV-2) in December 2019. The objective of this study was to describe the accumulation of evidence during the 2013-2016 Zika virus (ZIKV) outbreak in the Pacific and the Americas related to aetiological causal questions about congenital abnormalities and Guillain-Barré syndrome. Methods: We hypothesised that the temporal sequence would follow a pre-specified order, according to study design. We assessed 1) how long it takes before findings from a specific study design appear, 2) how publication of preprints could reduce the time to publication and 3) how time to publication evolves over time. Results: We included 346 publications published between March 6, 2014 and January 1, 2019. In the 2013-2016 ZIKV outbreak, case reports, case series and basic research studies were published first. Case-control and cohort studies appeared 400-700 days after ZIKV was first detected in the region of the study origin. Delays due to the publication process were shortest at the beginning of the outbreak. Only 4.6% of the publications were available as preprints. Discussion: The accumulation of evidence over time in new causal problems generally followed a hierarchy. Preprints reduced the delay to initial publication. Our methods can be applied to new emerging infectious diseases.

1 Introduction

Outbreaks of infectious diseases trigger an increase in scientific research and output. Early in outbreaks, evidence is scarce; however, it accumulates rapidly as time progresses. Because we continuously face new disease outbreaks, as we do at the moment with the emergence of the new coronavirus (SARS-CoV-2), understanding the accumulation of evidence is vital. Here, we describe the accumulation of evidence during the Zika virus outbreak and summarize the lessons learnt.

Causality is a principal theme in epidemiological research. Establishing that an exposure causes a specific health outcome is based on evidence and may inform guidance about public health measures. The concepts and types of evidence required to conclude that an association is causal are the subject of ongoing debate. Vandenbroucke proposed a hierarchy of evidence based on the best chance for discovery and explanation of phenomena [1]. Observations published in case reports and case series, or findings in data and literature, drive discovery. Verification of these discoveries happens in observational studies and in randomized controlled trials, provided that exposures can be randomized. The value of evidence used for public health guidance is thought to follow an inverse pattern of the hierarchy for discovery; here, case reports and other anecdotal evidence are considered to provide the least certainty about an effect or association. Cohort studies, randomized controlled trials and case-control studies are considered to provide evidence with a higher certainty (Figure 1).

The Zika virus (ZIKV) outbreak in the Pacific and the Americas between 2013 and 2016 presented several aetiological causal questions. In 2013-2014, ZIKV caused an outbreak in French Polynesia [2,3]. During this period, investigators documented some severe neurological conditions, including 40 people with Guillain-Barré syndrome (GBS). GBS is usually a rare sporadic condition. Often triggered by infection, an autoimmune response affects the peripheral nerves, leading to ascending paralysis, which can be fatal if it involves the respiratory nerves [4]. At the time, the reports did not attract much attention and the investigators refrained from making a causal connection because dengue was also circulating [3]. In November 2015, the ministry of health in Brazil reported a cluster of births affected by microcephaly in the north east. Microcephaly is a birth defect, indicative of impaired brain development, which can be caused by congenital infection. At around the same time, ZIKV had been identified for the first time in Brazil [5]. In December 2015, the Pan American Health Organization announced heightened surveillance owing to an "increase of congenital anomalies, Guillain-Barré syndrome, and other [...] ZIKV circulation [7]. Retrospective assessment of the French Polynesia outbreak identified an increase in adverse congenital outcomes as well [8]. The PHEIC and the extensive outbreak catalysed the research on ZIKV. Early public health guidance about the prevention of ZIKV infection and its potential consequences was based on limited evidence, however [9].

Systematic reviews were developed to address the PHEIC recommendation for research about the causal relationships between ZIKV infection and adverse congenital outcomes, including microcephaly, and between ZIKV and autoimmune outcomes, including GBS [10]. The reviews organized the findings around a 'causality framework' with ten dimensions derived from those proposed by Bradford Hill [10,11]. An expert committee reviewed the evidence collected by these systematic reviews up to May 2016 and reached the conclusion that "the most likely explanation of the available evidence" was that ZIKV is a cause of adverse congenital outcomes and a trigger of GBS [12]. This review has been kept up to date as a living systematic review, by periodically incorporating new results [13,14]. The additional evidence has reinforced the conclusions of causality.

A temporal sequence for the emergence of evidence was already hypothesised during the planning of the systematic reviews in early 2016 (Figure 1). Acknowledging that 'astute observations' of new causes of disease often start an aetiological investigation [15], case reports and case series were eligible for inclusion in the systematic reviews. These study designs are often excluded from systematic reviews because they are the lowest level of the "hierarchy of evidence" that applies to evaluation research. Vandenbroucke proposed a reverse hierarchy for discovery in which 'anecdotal' forms of evidence are at the top [1]. Cross-sectional, case-control and retrospective follow-up studies follow because they are the quickest study designs that include a control group. Prospective cohort studies take longer to set up and RCTs only provide additional information if a treatment or vaccine is available. In addition to epidemiological studies, basic and clinical laboratory science start early in the search for causes.

medRxiv preprint doi: https://doi.org/10.1101/2020.03.16.20036806; this version posted March 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Figure 1. Hypothetical accumulation of evidence over time, by study design.

The objective of this study was to examine the body of evidence that was used to establish the causal relation between ZIKV infection and adverse outcomes. We hypothesised that the temporal sequence would follow Figure 1.

For all included studies, we retrieved the received and publication dates, the location of the study and the study design (Table 1). For epidemiological studies, we extracted the study location and the number of patients with both the exposure and the outcome according to the case definition provided in the publication. We excluded modelling studies and surveillance and outbreak reports.

We defined the publication date as the earliest date the publication was available. If the publisher's website did not state an exact date, we assigned the 'epub' date from MEDLINE via PubMed or the 'page created' date for specific online journals (EID and MMWR). We also recorded the date the manuscript was received by the publisher (received date) and the date of acceptance for publication (accepted date).

We defined the time to publication as the time between introduction of ZIKV in the region and the publication date. For basic research studies, many of which were done in countries unaffected by ZIKV, we assigned the time to publication as the time between 1 February 2016 (the PHEIC declaration) and the first available publication date. The delay resulting from the publication process (publication delay) was defined as the time between the 'received date' and the first available publication date.

We assessed 1) how long it takes before findings from a specific study design appear, 2) how publication of preprints could reduce the time to publication and 3) how time to publication evolves over time. We provide a descriptive analysis of the total time to publication and the publication delay by publication. Of these durations, we provide the median and interquartile range (IQR) by study design and over time. We compare the publication delay by three-month period (quarter).

3 Results

During the period of the first review [10] and subsequent update [13], we screened 2,847 publications. During the remaining period, between January 7, 2017 and January 1, 2019, we screened an additional 2,594 publications. Figure 3 shows the evolution of the volume of published ZIKV research over time.

[Figure 3. Axes: Year; Number/month.]

Figure 5A shows the comparison of publications published between the PHEIC and the end of the [...] was a result of a retrospective study looking back at the French Polynesia outbreak [8]. The median total time to publication was longer for more robust study designs (cohort studies, case-control studies). We see a similar pattern for epidemiological studies if we consider the data up to January 1, 2019 and consider the time to publication between the regional introduction of ZIKV and the publication date (Figure 5B).
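The summary behind this comparison can be illustrated with a minimal Python sketch. It assumes each study is reduced to a tuple of study design, first available publication date and date of regional ZIKV introduction; the function names, the tuple layout and the label "basic research" are hypothetical choices for illustration, not the authors' code. Basic research is measured from the PHEIC declaration (1 February 2016), as described in the Methods, and the quartiles use the standard library's inclusive method, which may differ slightly from the method used in the original analysis.

```python
from collections import defaultdict
from datetime import date
from statistics import quantiles

# Reference date for basic research studies (PHEIC declaration, per the Methods).
PHEIC_DECLARATION = date(2016, 2, 1)


def time_to_publication(published, design, zikv_introduced=None):
    """Days from the reference date to the first available publication date.

    Basic research is measured from the PHEIC declaration; all other designs
    are measured from the introduction of ZIKV in the study region.
    """
    start = PHEIC_DECLARATION if design == "basic research" else zikv_introduced
    if start is None:
        raise ValueError("regional introduction date is required for this design")
    return (published - start).days


def summarise_by_design(records):
    """Return {design: (Q1, median, Q3)} of time to publication in days.

    `records` is an iterable of (design, published, zikv_introduced) tuples
    (hypothetical layout). Designs with fewer than two studies are skipped
    because quartiles are not defined for a single value.
    """
    by_design = defaultdict(list)
    for design, published, introduced in records:
        by_design[design].append(time_to_publication(published, design, introduced))
    return {
        design: tuple(quantiles(days, n=4, method="inclusive"))
        for design, days in by_design.items()
        if len(days) >= 2
    }


if __name__ == "__main__":
    # Made-up dates for illustration only; these are not data from the study.
    example = [
        ("case report", date(2015, 1, 11), date(2015, 1, 1)),
        ("case report", date(2015, 1, 21), date(2015, 1, 1)),
        ("case report", date(2015, 1, 31), date(2015, 1, 1)),
    ]
    print(summarise_by_design(example))
```

Grouping by study design before summarising mirrors the comparison in Figure 5; the same function could be applied to any subset of studies, for example only those published before a given date.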


Strengths and weaknesses of the study
A strength of this study is the pre-specified hypothesis about the time to publication of aetiological research and the use of data from systematic reviews that had screened and selected studies that addressed the causal relationship between ZIKV infection and its adverse outcomes. We calculated additional measures related to the time to publication of research, including delays due to the publication process, and thus the time that could have been gained by publishing preprints.
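As an illustration of the delay measure, the following Python sketch computes the publication delay (days between the 'received date' and the first available publication date) and its median per calendar quarter. The function names and the input layout are hypothetical, and two choices are assumptions rather than the authors' stated method: quarters are assigned by publication date, and studies without a reported received date are simply skipped.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def publication_delay(received, published):
    """Days between the received date and the first available publication date.

    Returns None when the journal did not report a received date.
    """
    if received is None:
        return None
    return (published - received).days


def quarter(d):
    """Label a date with its calendar quarter, for example '2016-Q1'."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def median_delay_by_quarter(studies):
    """Median publication delay (days) per quarter of publication.

    `studies` is an iterable of (received, published) date pairs
    (hypothetical layout). Grouping by publication date is an assumption;
    studies with no received date are left out.
    """
    delays = defaultdict(list)
    for received, published in studies:
        delay = publication_delay(received, published)
        if delay is not None:
            delays[quarter(published)].append(delay)
    return {q: median(values) for q, values in sorted(delays.items())}
```

Under the further assumption that a preprint would have been posted on the day the manuscript was submitted, the publication delay of a study is also an estimate of the time a preprint could have saved.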

The limited information extracted about each study was a limitation. The time between introduction of ZIKV and the actual publication of a research study is dependent on factors both within and between study designs. There is substantial variation in the time to publication within the study designs. We did not quantify several factors that likely influence this duration, such as the size of the outbreak, the research capacity or outbreak preparedness. Small outbreaks or small population sizes limit the opportunity to enrol sufficient patients with adverse outcomes, and unless involved in multi-centre/multi-region studies, the affected regions are less likely to produce high quality epidemiological studies. The same holds true for regions with limited research capacity, such as appropriate diagnostic facilities and expertise. Outbreak preparedness likely increased over time, with funding increasing after the PHEIC, meaning that studies started relatively late in regions that were affected earliest by the outbreak. Countries that were affected later by ZIKV might have already had surveillance and diagnostic methodology in place.

The publication delay is a proxy measure, which could not be calculated for all studies; the "received date" was provided for only 59% (204/346) of studies. It is unclear whether these data are missing at random. Furthermore, the recorded publication delay could only be calculated for the journal in which a study was published. The true publication delay includes the time taken up by rejection and resubmission. The publication date also ignores dissemination of the findings at conferences or within collaborations, although in those cases the information is available only to a limited audience. The timing of ZIKV introduction is also a proxy measure: it does not capture the first actual case, but signals the moment at which the health authorities and the research community noted the introduction in that region, and thus serves as a proxy for when research starts intensifying. Phylogenetic data suggest that ZIKV was often introduced months before formal detection and notification [18].

The sequence of emergence of evidence about causality was not exactly as hypothesised (Figure 1). While case reports and case series were the first types of study to be published, findings from animal research were also published quickly. This finding might have been influenced by the more frequent use of preprints to disseminate laboratory research than clinical science [19]. In our study, the time taken to publication of case-control studies and cohort studies was similar, particularly for studies of congenital outcomes. Case-control studies are widely assumed to be quicker to organise and conduct than cohort studies [1,15]. In the ZIKV outbreak, one case-control study about GBS was published soon after the PHEIC declaration because it used data already collected from the earlier ZIKV outbreak in French Polynesia. An important consideration is the short duration of pregnancy. The cohorts that were fastest to produce results were cohorts that were already in place for other diseases (dengue, influenza) [20]. [...]

In a disease outbreak with adverse outcomes that are new or incompletely understood, the full spectrum of evidence needs to be assessed to establish causality. Early in an outbreak, we need anecdotal evidence to drive discovery and explanation [1]. Studies across the different scientific disciplines are [...]

[...] problems based on study design and timing of evidence will provide more insight into how evidence accumulates. The ZIKV outbreak in the Americas was unique in its size, making rare [...]

As the SARS-CoV-2 outbreak continues to unfold, we will apply the same methodology as discussed here to keep track of the accumulation of evidence, the delay in publication and the use of preprint publishing.

The accumulation of evidence over time in new causal problems seems to follow a hierarchy in which case reports and case series are rapidly followed by basic research. During the ZIKV outbreak, robust epidemiological studies, such as case-control studies and cohort studies, took 400-700 days to appear. Causal inference based on a wide spectrum of evidence is therefore essential for early public health guidance in emerging causal problems. Publishing preprints does reduce the delay, and this tool is underused, especially in epidemiological research.