## Abstract

Marburg virus disease is an acute haemorrhagic fever caused by Marburg virus. Marburg virus is zoonotic, maintained in nature in Egyptian fruit bats, with occasional spillover infections into humans and non-human primates. Although rare, sporadic cases and outbreaks occur in Africa, usually associated with exposure to bats in mines or caves, and sometimes with secondary human-to-human transmission. Outbreaks outside of Africa have also occurred due to importation of infected monkeys. Although all previous Marburg virus disease outbreaks have been brought under control without vaccination, there is nevertheless the potential for large outbreaks when implementation of public health measures is not possible or breaks down. Vaccines could thus be an important additional tool and development of several candidate vaccines is under way. We developed a branching process model of Marburg virus transmission and investigated the potential effects of several prophylactic and reactive vaccination strategies in settings driven primarily by multiple spillover events as well as human-to-human transmission. Our results show a low basic reproduction number of 0.81 (95% CI: 0.08–1.83) despite the high case fatality ratio reported elsewhere. Of six vaccination strategies explored, a combination of ring and targeted vaccination of high-risk groups was generally most effective, with a probability of controlling potential outbreaks of 0.88 (95% CI: 0.85 - 0.91) compared with 0.65 (95% CI: 0.60 - 0.69) for no vaccination, especially if the outbreak is driven by zoonotic spillovers and the vaccination campaign is initiated as soon as possible after onset of the first case.

## Introduction

Marburg virus disease (MVD) is an acute haemorrhagic fever caused by Marburg virus (genus *Marburgvirus*, family *Filoviridae*), affecting humans and non-human primates (1–4). Marburg virus is zoonotic, maintained in nature in Egyptian fruit bats (*Rousettus aegyptiacus*), which are found across Africa (5). Although rare, sporadic cases and outbreaks occur, usually associated with exposure in mines or caves inhabited by colonies of Egyptian fruit bats (5–13). Secondary human-to-human transmission may occur through direct exposure to blood and body fluids or contaminated surfaces.

There have been 15 recognized MVD outbreaks to date, beginning in 1967 when infected green monkeys from Uganda were imported to Germany and Yugoslavia for harvesting of their tissues for vaccine production (3,14) (Table 1). Excluding laboratory infections, the exposures of all index cases of outbreaks since then have occurred in Africa (5–13), with some outbreaks driven by recurring virus spillover from animals to humans and others primarily by human-to-human transmission. Figure 1 and Table A1 show the MVD case numbers observed during each of these outbreaks.

No licensed vaccine for MVD currently exists, although several are under development (4). All previous outbreaks were controlled when transmission chains ended either naturally or through the introduction of public health and infection control measures (7,15). Nevertheless, the 2004 outbreak in Angola, which registered 374 cases and 329 deaths (case fatality of 88 percent), illustrates the serious and explosive potential of Marburg virus. Furthermore, even in the smaller outbreaks, the high case fatality ratio could potentially be mitigated by vaccination (4).

The aims of this study were to estimate key epidemiological parameters of MVD, such as the reproduction number and serial interval, and use this information to parameterise a model that is used to assess the impact of different vaccination strategies to control outbreaks.

## Methods

### Data

We used linelist data from all except one of the previous 15 outbreaks to estimate the serial interval of MVD. The one exception was the Angola outbreak that occurred between 2004-2005; here, because no linelist was available, we used case numbers reported periodically by the World Health Organisation (reported in (16), for instance).

From the linelist data, we identified discernible infector-infectee pairs, obtained the difference between the dates of onset of each pair and fit a gamma distribution to these differences. We also used the linelist data to obtain the number of zoonotic introductions seen in each outbreak and, together with the known duration of each outbreak, calculated the rate of introductions.

### Rate of zoonotic introductions

Since several outbreaks were driven primarily by zoonotic introductions, while others were largely caused by human-to-human transmission (Figure 1), we estimated rates of zoonotic introductions to represent both these types of outbreaks.

For outbreaks dominated by contact with animals, we used data from the 1998-2000 outbreak in the Democratic Republic of the Congo (DRC) (6), which was driven by miners being infected through spillovers from bats. It was reported that 27% of infected miners from this outbreak had contact with another infected individual (6), from which we infer that 73% of all infected miners were spillover cases. Given that the outbreak lasted 2 years (with the first case identified in October 1998 and the last in September 2000 (6)), we divided the number of spillover cases by this duration to estimate the rate of introductions during the DRC outbreak. We use this as a typical rate of introductions for spillover-driven outbreaks. Other outbreaks likely involved a single spillover event subsequently driven by human-to-human transmission, typified by the large 2004 outbreak in Angola. To obtain the rate of introductions, we therefore divided the number of spillover cases by the duration of these outbreaks.
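The rate calculation described above is a simple ratio of spillover cases to outbreak duration. The sketch below illustrates it; the function name, the days of the month and the spillover count of 44 are placeholders chosen for illustration and are not figures taken from the study.

```python
from datetime import date


def introduction_rate(n_spillover_cases: int, first_onset: date, last_onset: date) -> float:
    """Zoonotic introductions per day: spillover cases divided by outbreak duration."""
    duration_days = (last_onset - first_onset).days
    if duration_days <= 0:
        raise ValueError("last_onset must be later than first_onset")
    return n_spillover_cases / duration_days


# Placeholder inputs (assumptions): an outbreak running from October 1998 to
# September 2000 with a hypothetical count of 44 spillover cases.
rate = introduction_rate(44, date(1998, 10, 1), date(2000, 9, 30))
print(f"{rate:.3f} introductions per day")
```

The same function applies to single-spillover outbreaks, where the numerator is 1 and the rate is correspondingly much lower.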

### Time from first case to interventions

For each outbreak, we estimated the date on which interventions were put in place. During earlier outbreaks this was simply when the disease was acknowledged as being dangerous and highly transmissible, prompting changes in clinical, laboratory and infection prevention and control practices (14), or when patients who showed symptoms consistent with other viral haemorrhagic fevers were treated accordingly, for instance through case isolation and barrier nursing (17). For later outbreaks we used the day on which response teams were deployed to the region of the outbreak as the intervention date (while recognizing that local control efforts were often already underway). We calculated the median time delay between onset of the first case and the beginning of interventions across all outbreaks. However, interventions during several outbreaks, including those in Angola (2004-2005) (16,18), DRC (1998-2000) (6) and Uganda (2012) (19), took place several weeks later than this median delay. Hence, as a sensitivity analysis, we took the 75th percentile of this delay to intervention and modelled this scheme.
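As an illustration of the summary statistics described above, the following sketch computes a median and a 75th percentile from a list of delays. The delay values are hypothetical, and the "inclusive" quantile method is an assumption, since the text does not say which percentile definition was used.

```python
import statistics

# Hypothetical delays (days) from onset of the first case to intervention,
# one value per outbreak; these are not the study's data.
delays_days = [7, 10, 14, 21, 21, 28, 35, 60, 90]

median_delay = statistics.median(delays_days)

# quantiles(..., n=4) returns the three quartile cut points; index 2 is the 75th percentile.
p75_delay = statistics.quantiles(delays_days, n=4, method="inclusive")[2]

print(median_delay, p75_delay)
```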

### Factors affecting outbreak size

The number of cases in each MVD outbreak is presumed to depend on several factors, including the number of zoonotic introductions, the delay from first case to intervention and the calendar year in which the outbreak occurred. We used a Poisson regression with these covariates coded as integers. Armed conflict was also noted as a possible factor for the two largest MVD outbreaks, in DRC and Angola (6,18,20). However, as conflict in both these regions had officially ended shortly before the outbreaks occurred, we also fitted a second Poisson regression model with four covariates: the three above plus the occurrence of armed conflict as a binary variable.
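For concreteness, the regression described above can be written in the standard log-linear form. The symbols below are our own shorthand (assumptions, not notation from the study): $Y_i$ is the number of cases in outbreak $i$, $I_i$ the number of introductions, $D_i$ the delay to intervention in days, $T_i$ the calendar year, and $C_i$ the binary conflict indicator included only in the second model.

```latex
Y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = \beta_0 + \beta_1 I_i + \beta_2 D_i + \beta_3 T_i + \beta_4 C_i

% Model comparison via the Akaike Information Criterion,
% with p fitted coefficients and maximised likelihood \hat{L}:
\mathrm{AIC} = 2p - 2 \log \hat{L}
```

Under this form, setting $\beta_4 = 0$ recovers the model without armed conflict, and the model with the lower AIC is preferred.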

### Branching process model

We used a branching process to model MVD transmission over time. New infections generated at any time, *t*, are governed by the force of infection *λ*_{t}, which is determined by previous case incidence *y*_{s} (*s* = 1, …, *t*-1), the serial interval distribution (denoted by *w*, its probability mass function), and the reproduction numbers *R*_{s} as (21):

$$\lambda_t = \sum_{s=1}^{t-1} R_s \, y_s \, w(t-s) \tag{1}$$

New secondary cases at time *t* are then drawn from a Poisson distribution so that:

$$y_t \sim \mathrm{Poisson}(\lambda_t) \tag{2}$$

Equation (1) shows that the reproduction number *R*_{s} is allowed to vary over time. This is used to distinguish, in any given outbreak, two phases: a first one, during which transmission is maximum (*R*_{s} = *R*_{0}, the basic reproduction number), and a second one during which intervention reduces transmission by a factor *E*, the intervention efficacy, so that:

$$R_s = \begin{cases} R_0 & \text{if } s < t_{\mathrm{int}} \\ (1 - E)\,R_0 & \text{otherwise} \end{cases} \tag{3}$$

where *t*_{int} denotes the date on which interventions begin. Intervention is defined, in this context, as the implementation of measures such as case isolation, contact tracing and barrier nursing.

The date at which interventions started reducing transmission is outbreak dependent, and was obtained through investigation of the respective, publicly available outbreak reports (3,6–12,16,17,19,22,23). Thus, in our model, the reproduction number decreases from *R*_{0} to (1 − *E*)*R*_{0} after this date.

The model also incorporates a constant rate of introductions *γ* in which newly introduced cases (presumed spill-over events) are also Poisson distributed.
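The model described in this subsection can be sketched as a short simulation: each day's force of infection is the sum over past cases of reproduction number times incidence times serial-interval mass, new cases are Poisson draws, and introductions arrive at a constant daily rate. This is a minimal illustration, not the authors' implementation; the function names, the toy serial-interval vector and the normal approximation used for large Poisson means are all assumptions.

```python
import math
import random


def poisson(lam: float, rng: random.Random) -> int:
    """Poisson draw: Knuth's method for small means, normal approximation above 30."""
    if lam <= 0:
        return 0
    if lam > 30:
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_outbreak(r0, efficacy, t_int, intro_rate, w, n_days, rng):
    """Return daily incidence. w[d - 1] is the serial-interval mass at a lag of d days."""
    incidence = [0] * (n_days + 1)
    incidence[0] = 1  # one seeded case on day 0
    for t in range(1, n_days + 1):
        lam = 0.0
        for s in range(max(0, t - len(w)), t):
            # Reproduction number drops by a factor (1 - efficacy) once interventions begin.
            r_s = r0 if s < t_int else (1.0 - efficacy) * r0
            lam += r_s * incidence[s] * w[t - s - 1]
        # Secondary cases plus zoonotic introductions, both Poisson distributed.
        incidence[t] = poisson(lam, rng) + poisson(intro_rate, rng)
    return incidence


# Toy serial-interval mass function (assumption); it sums to 1.
toy_w = [0.1, 0.2, 0.4, 0.2, 0.1]
run = simulate_outbreak(0.8, 0.5, 21, 0.003, toy_w, 365, random.Random(1))
print(sum(run), "cases in total")
```

Setting both the reproduction number and the introduction rate to zero leaves only the seeded case, which is a convenient sanity check.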

### Parameter estimation

We used an approximate Bayesian computation (ABC) framework to estimate the basic reproduction number, *R*_{0}, and the intervention efficacy, *E*, for each outbreak separately. To do this, we first determined the delay to implementation of interventions, the duration of the outbreak and the rate of introductions for each outbreak. The priors used were *R*_{0} ∼ U(0,3) and *E* ∼ U(0,1). Parameter values were retained as part of the posterior sample if the total number of cases in a simulation was within 10% of the observed value. In this way, 5000 posterior samples were retained per outbreak. Subsequently, the posterior samples for all 15 outbreaks were pooled together into a single posterior sample.
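The rejection step described above can be sketched as follows. To keep the example self-contained, it uses a deliberately simplified generation-based simulator as a stand-in for the full transmission model; that simulator, its cap on outbreak size, the number of pre-intervention generations and all function names are assumptions made for illustration.

```python
import math
import random


def poisson(lam: float, rng: random.Random) -> int:
    """Poisson draw via Knuth's method (adequate for the small means used here)."""
    if lam <= 0:
        return 0
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def toy_total_cases(r0, efficacy, rng, gens_before_intervention=2, cap=500):
    """Toy stand-in simulator: total cases from a generation-based branching process."""
    total = current = 1
    generation = 0
    while current > 0 and total < cap:
        r = r0 if generation < gens_before_intervention else (1.0 - efficacy) * r0
        current = sum(poisson(r, rng) for _ in range(current))
        total += current
        generation += 1
    return total


def abc_rejection(observed_total, n_keep, rng, tol=0.10, max_tries=200_000):
    """Keep (R0, E) draws whose simulated total is within tol of the observed total."""
    posterior, tries = [], 0
    while len(posterior) < n_keep and tries < max_tries:
        tries += 1
        r0 = rng.uniform(0.0, 3.0)        # prior R0 ~ U(0, 3)
        efficacy = rng.uniform(0.0, 1.0)  # prior E ~ U(0, 1)
        simulated = toy_total_cases(r0, efficacy, rng)
        if abs(simulated - observed_total) <= tol * observed_total:
            posterior.append((r0, efficacy))
    return posterior


posterior = abc_rejection(observed_total=10, n_keep=20, rng=random.Random(42))
print(len(posterior), "accepted parameter pairs")
```

Pooling across outbreaks then amounts to concatenating the accepted lists obtained for each outbreak's observed total.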

Along with the reproduction numbers, we also estimated the dispersion parameter, *k*, and serial interval. For *k*, we identified 18 chains of transmission from the outbreak in DRC (1998-2000) (6). We determined the number of secondary cases generated by each index case and fit a negative binomial distribution to these (see (24)). The size parameter of this distribution represents the dispersion parameter, *k* (24). For the serial interval, we identified 26 infector-infectee pairs from linelist data obtained from MVD outbreaks in DRC, Marburg, Belgrade and Uganda (2012) and fit a gamma distribution to the time period between the dates of onset for these infector-infectee pairs (25).
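Both fits described above can be approximated with moment matching, which is easy to check by hand. The sketch below uses method-of-moments estimators rather than the likelihood-based fitting cited in the text, and the input lists are hypothetical; it is meant only to show how the gamma parameters and the dispersion parameter *k* relate to the sample mean and variance.

```python
import statistics


def fit_gamma_moments(intervals):
    """Return (shape, scale) of a gamma distribution matching the sample mean and variance."""
    mean = statistics.mean(intervals)
    var = statistics.variance(intervals)
    return mean * mean / var, var / mean


def fit_negbin_k_moments(secondary_cases):
    """Moment estimate of the negative binomial dispersion k = mean^2 / (variance - mean)."""
    mean = statistics.mean(secondary_cases)
    var = statistics.variance(secondary_cases)
    if var <= mean:
        raise ValueError("no overdispersion: the moment estimate of k is undefined")
    return mean * mean / (var - mean)


# Hypothetical serial intervals (days) and secondary-case counts, not the study's data.
shape, scale = fit_gamma_moments([5, 7, 9, 10, 12, 14])
k = fit_negbin_k_moments([0, 0, 0, 1, 1, 2, 5])
print(round(shape, 2), round(scale, 2), round(k, 2))
```

Values of *k* below 1 indicate that a small number of cases generate a large share of secondary infections.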

We then modified the model to include the effects of vaccination on transmission. Vaccination reduces the reproduction number associated with each case by the vaccine efficacy (*VE*) corresponding to that case, on top of any reduction due to the intervention efficacy (*E*). Six vaccination schemes were simulated:

1. *Prophylactic targeted vaccination of high-risk groups*. This scheme involves the vaccination of healthcare workers, as well as individuals who reside near or work in mines or caves, prior to the beginning of an outbreak. We estimate that, across all MVD outbreaks, approximately 6% of cases were healthcare workers and 12% were individuals living near or working in mines or caves (this excludes the outbreak in Angola due to lack of data).
2. *Prophylactic mass vaccination*. The second scheme was prophylactic mass vaccination of the entire community prior to the outbreak.
3. *Ring vaccination*. This is the first of the reactive vaccination strategies that we simulate. A proportion of contacts are vaccinated after the date of intervention. This proportion depends on how extensively case reporting is carried out, as well as on vaccination coverage.
4. *Reactive targeted vaccination*. This scheme entails vaccination of the same high-risk groups as in scheme 1 above, but done reactively after an MVD outbreak has begun. In our model, vaccination is simulated only after the date of intervention.
5. *Reactive mass vaccination*. This is mass vaccination simulated only after intervention has begun.
6. *A combination of ring and reactive targeted vaccination schemes*.

For all six strategies, we assumed that no waning of immunity occurred after vaccination, nor depletion of the susceptible population, given the low number of cases compared to the population of each affected community.
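One way to read the statement that vaccination acts on top of the intervention efficacy is as two multiplicative reductions applied to the reproduction number of each case. The sketch below encodes that reading; the multiplicative form, the coverage-based vaccination draw and the function names are assumptions, not a description of the authors' code.

```python
import random


def case_reproduction_number(r0, efficacy, after_intervention, vaccinated, ve):
    """Reproduction number for one case after intervention and vaccination effects."""
    r = r0 * (1.0 - efficacy) if after_intervention else r0
    if vaccinated:
        r *= 1.0 - ve  # vaccine efficacy specific to this case
    return r


def is_vaccinated(coverage, rng):
    """A case belongs to the vaccinated group with probability equal to the coverage."""
    return rng.random() < coverage


rng = random.Random(0)
vaccinated = is_vaccinated(0.7, rng)
r_case = case_reproduction_number(0.8, 0.5, True, vaccinated, 0.9)
print(vaccinated, round(r_case, 3))
```

Under reactive schemes, the vaccination branch would only be reachable for cases arising after the intervention date.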

The vaccination parameters in our model were:
- *Maximum vaccine efficacy* (*VE*_{max}). The efficacy of a vesicular stomatitis virus (VSV)-based vaccine expressing the MARV glycoprotein (VSV-MARV) was found to be 100% in non-human primates (NHPs) (2). As this is unlikely to be observed in the field, we adjusted it downward to a *VE*_{max} of 90% in the base case.
- *Time to maximum efficacy*. The time from vaccination, when efficacy is 0%, to maximum efficacy was 7 days, from Phase I trials on NHPs of the MVD vaccine (2). This is reflected in our logistic curve representing the vaccination efficacy over time (Technical Appendix Figure A1).

### Vaccine coverage

We assumed that vaccination coverage would be 70% for ring vaccination, as well as for targeted vaccination of high-risk groups such as miners and healthcare workers. However, given logistical difficulties and possible vaccine shortages (26), we would expect lower coverage (assumed to be 50%) for mass vaccination strategies (both reactive and prophylactic). We increased or decreased these coverages by 20 percentage points in the sensitivity analyses.

### Time between vaccination and infection

Since to date no clinical trial of any potential MVD vaccine has taken place, we assumed that, amongst individuals who become infected, there was an average delay between vaccination and infection of 20 days (s.d. 5 days) for prophylactic and 9 days (s.d. 4 days) for reactive strategies. Data from a previous ring vaccination trial for an Ebola virus disease vaccine in Guinea showed an average delay from vaccination to subsequent infection (in those that got infected) in the rings of 5.7 days (s.d. 5.0 days) (27). We also examined this delay distribution as a sensitivity analysis.

### Logistic curve fit of vaccine efficacy

Timing is crucial for the success of vaccination strategies, particularly those that are reactive. Hence, we modelled the vaccine efficacy from vaccination to infection using a logistic curve of the form (Figure A1 in the Technical Appendix):
where *VE* is the vaccine efficacy as a function of time, *t*, in days and *A, B, C, D* and *E* are constants to be determined through a fitting procedure. A nonlinear least squares approach was used to fit this curve so that *VE* increases from 0 to *VE*_{max}, (the maximum *VE*) in 7 days (see (2)) and subsequently stays at this maximum.

In our model, each infected and vaccinated individual is assigned a vaccine efficacy, sampled from a particular region of this logistic curve. Which portion of this curve is sampled depends on the vaccination strategy employed. In general, prophylactic strategies result in higher vaccine efficacies than reactive vaccination, since vaccines are administered before the beginning of an outbreak and so efficacy is assumed to have reached a maximum. We therefore took samples of *VE* from the top of the logistic curve (Technical Appendix). On the other hand, individuals vaccinated reactively in response to an outbreak are more likely to be infected before peak *VE* has been reached (here, we assume this period to be 7 days, based on (2)). Hence, we took samples of *VE* from a larger portion of the logistic curve (mainly its slope) in these scenarios (see Technical Appendix for further details).
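The sampling idea above can be illustrated with a simple logistic curve that rises from near 0 at vaccination to near *VE*_{max} by day 7. The three-parameter form, the midpoint of 3.5 days and the steepness of 2 are assumptions chosen for illustration and are not the fitted constants from the Technical Appendix; delays are drawn from normal distributions with the means and standard deviations stated in the Methods, truncated at zero.

```python
import math
import random


def ve_logistic(t_days, ve_max=0.9, midpoint=3.5, steepness=2.0):
    """Vaccine efficacy t_days after vaccination (illustrative logistic, not the fitted curve)."""
    return ve_max / (1.0 + math.exp(-steepness * (t_days - midpoint)))


def sample_ve(mean_delay, sd_delay, rng, ve_max=0.9):
    """Draw a vaccination-to-infection delay and return the efficacy reached by then."""
    delay = max(0.0, rng.gauss(mean_delay, sd_delay))
    return ve_logistic(delay, ve_max=ve_max)


rng = random.Random(0)
prophylactic = [sample_ve(20.0, 5.0, rng) for _ in range(2000)]  # 20 days (s.d. 5)
reactive = [sample_ve(9.0, 4.0, rng) for _ in range(2000)]       # 9 days (s.d. 4)
print(round(sum(prophylactic) / 2000, 3), round(sum(reactive) / 2000, 3))
```

Because most prophylactic delays exceed 7 days, those samples cluster near the plateau, whereas reactive samples spread down the slope of the curve.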

### Forward simulations

Taking samples of *R*_{0} and *E* values from the pooled posterior, we then performed forward simulations to show the effects of these different vaccination schemes on potential outbreaks under both low and high introduction rates.

We compared the distributions of simulated case numbers after implementing the six vaccination strategies described above with the no-vaccination scheme. We also estimated the proportion of controlled outbreaks predicted under each scheme. We considered an outbreak to be controlled if, after simulating for 365 days, the force of infection (see Equation 1) was less than 0.05, considered negligibly low.
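The control criterion above reduces to counting the simulations whose final force of infection falls below 0.05. The sketch below computes that proportion together with a Wilson score interval; the choice of the Wilson interval is an assumption, since the text does not state how the confidence intervals were obtained, and the input values are hypothetical.

```python
import math


def proportion_controlled(final_foi, threshold=0.05):
    """Share of simulations whose force of infection on the final day is below threshold."""
    controlled = sum(1 for lam in final_foi if lam < threshold)
    return controlled / len(final_foi)


def wilson_interval(p_hat, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical day-365 force-of-infection values from four simulations.
final_foi = [0.0, 0.01, 0.2, 0.04]
p = proportion_controlled(final_foi)
low, high = wilson_interval(p, len(final_foi))
print(p, round(low, 2), round(high, 2))
```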

The code is available at https://github.com/GeorgeYQian/MVD-Branching-Process-Model-Repository

## Results

### Epidemiological parameters

The estimated rate of zoonotic introductions was 0.06 per day and 0.003 per day during the DRC and Angola outbreaks, respectively. Across all outbreaks, the median delay between onset of the first MVD case and beginning of interventions was 21 days.

The fitted gamma distribution of the serial intervals had a mean of 9.2 days and standard deviation of 4.4 days (Figure A2, Technical Appendix). The median value of *R*_{0} was 0.81 [95% CI: 0.08–1.83], while the median value of *R*_{t} was 0.30 [95% CI: 0.01–1.31] (Figure 2). Across all outbreaks, around half (52%) of our estimates of the intervention efficacies were larger than 50% (Figure 2). As a result, only 10% of the estimates of *R*_{t} were greater than 1. We estimated the value of *k* to be between 0.52 and 0.67 (Technical Appendix).

### Factors influencing outbreak size

The Poisson regression suggested that all factors that we investigated (number of introductions, delay to intervention, calendar year of outbreak and the occurrence of armed conflict) influenced the size of MVD outbreaks. The Akaike Information Criterion value for this model was 178 (Tables 4-6). When armed conflict was removed as a possible covariate, the Akaike Information Criterion value increased to 718 (Table 4).

### Simulations of vaccination strategies

The proportion of controlled outbreaks in the absence of any vaccination strategy was 0.91 (95% CI: 0.88 - 0.93) and 0.65 (CI: 0.60 - 0.69) when simulating low and high rates of introductions, respectively. Most vaccination strategies resulted in a significant increase in this proportion, in particular the combined ring and targeted strategy, with values of 0.99 (CI: 0.97 - 0.99) and 0.88 (CI: 0.85 - 0.91), and the prophylactic mass strategy, with values of 0.99 (CI: 0.97 - 0.99) and 0.83 (CI: 0.79 - 0.86), for low and high rates of introductions, respectively (Figure 3a).

The median number of cases in the absence of any vaccination strategy was 3 (95% CI: 3-4) and 37 (CI: 34 - 40) for low and high rates of introductions, respectively. Under the low rate of introductions scenario, this median hardly changed when vaccines were used, while there was a small but significant decrease observed for all vaccination strategies when simulating a high rate of introductions, with the exception of the ring, reactive mass and reactive targeted vaccination schemes. In particular, the median number of cases under the prophylactic targeted strategy was 28 (CI: 27-29) (Figure 3b).

### Sensitivity analyses

Varying the vaccination parameters had the following effects:

Reducing the delay between vaccination and infection to 5.73 days (s.d. 5.03 days) did not result in any significant differences from baseline values in either the proportion of controlled outbreaks or the median number of cases across all simulated outbreaks (Tables 2 and 3).

After simulating a scenario with reduced coverage (20% less than baseline values) we found that, while no significant change in the median number of cases was observed, the proportion of controlled outbreaks decreased in the combined reactive and ring vaccination approach (Tables 2 and 3). This decrease was only apparent if the rate of introductions was high. Other vaccination approaches were not impacted by reducing vaccine coverage.

With increased coverage (20% greater than baseline values), all vaccination strategies performed significantly better than the no vaccination control, with both prophylactic approaches, as well as the combination of reactive targeted and ring vaccination performing best (Tables 2 and 3).

After simulating a scheme in which the date of intervention was delayed to 90 days after onset of the first case, there was a decrease in the proportion of controlled outbreaks for reactive vaccination strategies at the higher introduction rate (Table 2). For instance, whereas 99% of outbreaks on average were controlled under a combined ring and reactive targeted vaccination scheme when interventions occurred after 21 days, this decreased to 84% with a delay of 90 days. This also translates to an increase in the median number of cases for these reactive vaccination approaches. Again taking the combined ring and reactive approach, we observed a small but significant increase from 31 (baseline) to 40 cases.

## Discussion

We combined and analysed data from each of the known MVD outbreaks to characterise key epidemiological parameters and assess the potential impact of a range of vaccination strategies. We found that the reproduction number of MVD in human populations has generally been relatively low (0.81; 95% CI: 0.08–1.83), consistent with the small and mostly self-limited nature of most outbreaks. Our estimate of *R*_{0} is lower than that calculated by Ajelli and Merler (25) (1.59; 95% CI: 1.53-1.66). This is largely because their estimate was based on data from the Angola outbreak alone, the largest recorded outbreak, whereas ours includes all of the available data. Our estimated serial interval (9.2 days with a standard deviation of 4.4 days) is comparable to the generation time estimated by Ajelli and Merler from non-human primates (25) (Table 1).

Although our estimates of the reproduction number after interventions are generally low, this is no guarantee that all future outbreaks will be limited in size. Our estimate of the dispersion parameter for *R*_{0} (0.52 - 0.67) indicates that there is the potential for superspreading events to occur. Large future outbreaks of MVD cannot be ruled out. Both the rate of introductions from the zoonotic reservoir and the speed of the response influence the final outbreak size. However, this is by no means an exhaustive list; as the Angolan outbreak showed, a large outbreak can arise from a very limited number of introductions.

There are likely many social, epidemiological and environmental factors that may influence outbreak size. For instance, the two largest MVD outbreaks recorded to date (DRC in 1998-2000 and Angola in 2004-2005) occurred in populations that had recently been affected by civil war (16,18,20), which likely resulted in fragile health systems unable to prevent or rapidly control outbreaks. A Poisson regression showed that the following factors were significant variables influencing outbreak size: number of introductions, delay to intervention, calendar year of outbreak and whether the affected community had recently been affected by armed conflict (Tables 4-6).

To help understand what role vaccines might play in controlling future outbreaks we developed a simple branching process model and parameterised it from our analyses of the epidemiology. As expected, vaccination generally increased the probability of outbreaks being controlled compared to no vaccination. Exceptions included reactive targeted vaccination when the rate of spill-over introductions is low, as well as mass vaccination strategies when only very low coverage (e.g. 30%) is achieved. Vaccination could also be expected to reduce the median outbreak size, though this reduction is often relatively small since the median number of cases for the no vaccination scheme is already low (3 when the rate of introduction is low, 37 when high).

Over the range of strategies and parameter values considered, a combined ring and targeted vaccination was generally the most effective option since it results in a high probability of control and a low median outbreak size. However, if there are few introductions, the added effect of targeted vaccination over ring vaccination alone is negligible. Nevertheless, the combined approach might still be preferred since the rate of spill-over introductions might be difficult to assess in real-time without comprehensive sequence data.

Although efforts are ongoing to develop MVD vaccines (4), the results of our study suggest that it may be difficult to carry out Phase 3 trials, since we predict that few cases will be observed in a typical outbreak, and these may well be rapidly controlled by other interventions. To counter this problem, the World Health Organisation has developed a Core Protocol approach that is designed to allow trial results to be combined across multiple outbreaks to accrue sufficient data and statistical power to assess vaccine effectiveness (26). The results of our model could be helpful in estimating how many cases and outbreaks may be necessary to include in such a longitudinal multi-outbreak study.

A major limitation of our study is the lack of data due to the infrequency of MVD outbreaks, with most being relatively small and, in many cases, with scant epidemiologic data available. This paucity of data inevitably leads to considerable uncertainty, as evidenced by the wide confidence interval of the pooled posterior distribution for the reproduction number [0.08 - 1.83]. Another limitation is that, while we calculated and used a constant rate of zoonotic introductions, in reality, the rate often varies as a function of time. For example, the outbreak in the DRC appears to have been driven by seasonal introductions into miners (6). However, due to a lack of data on zoonotic introductions into specific persons, we opted for model simplicity and chose a constant rate for each outbreak.

We chose to use a simple branching process model that does not include, for instance, depletion of the susceptible population. Depletion of susceptibles may well be an important consideration, especially under a mass vaccination strategy, but we have assumed that a low percentage of the population will be mass vaccinated. Nor did we account for any possible waning of immunity post-vaccination. At present there is a lack of data on the effectiveness of vaccination - we have informed this part of our analysis by making broad and simplistic assumptions. Several vaccines are currently in the pipeline (4) and as results emerge we can update our work accordingly. Also, our study uses a branching process model that forces one case of MVD at time *t* = 0. However, both prophylactic vaccination approaches may, indeed, prevent many outbreaks from even beginning. These are not considered in our model.

Our study shows that various vaccination strategies can be effective in controlling outbreaks of MVD, with the best approach varying with the particular epidemiologic circumstances of each outbreak. Of course, many logistical and economic factors must be considered. Given the rarity and generally small size of MVD outbreaks, prophylactic mass vaccination of large populations is unlikely to be feasible or warranted. However, as has been proposed for vaccination for Ebola virus, vaccination for relatively infrequent but dangerous emerging infectious diseases might be incorporated into comprehensive vaccination for numerous diseases, serving as a driver of broader health systems strengthening (28). The rationale for this approach would be further strengthened by development of pan-filovirus vaccines, for which research is underway (33), especially if protection is long-lasting.

## Data Availability

All data produced in the present study are available upon reasonable request to the authors.

## Ethical Approval

Ethical approval for the study was granted by the London School of Hygiene & Tropical Medicine Ethics Committee (Reference Number 26566).

## Technical Appendix

https://docs.google.com/document/d/10RI5Fw3x4jB78sej17X1jGtUSRVV-cnm0GEl98VeXUI/edit

## Acknowledgements

This research is funded by the Department of Health and Social Care using UK Aid funding and is managed by the National Institute for Health and Care Research (VEEPED: PR-OD-1017-20002). The views expressed in this publication are those of the authors and not necessarily those of the Department of Health and Social Care.

## Footnotes

Some adjustments made to text for clarity; added a link to Github repository with R files to run the branching process model.