Using simulation to assess the potential effectiveness of implementing screening at national borders during international outbreaks of influenza, SARS, Ebola virus disease and COVID-19

The effectiveness of screening travellers for signs of infection during international disease outbreaks is contentious, especially as the reduction in the risk of disease importation can be very small. Border screening typically consists of arriving individuals being thermally scanned for signs of fever and/or completing a survey to declare any possible symptoms; while more thorough testing methods exist, these would generally prove more disruptive to deploy. In this paper, we use epidemiological data and Monte Carlo simulation to calculate the potential success rate of deploying border screening for a range of diseases (including the current COVID-19 pandemic) in varying outbreak scenarios. We sidestep the issue of testing precision by assuming a perfect test is used; our outputs then represent the best-case scenario. We then use these outputs to briefly explore the types of scenarios where the implementation of border screening could prove most effective. Our model only considers screening implemented at airports, as air travel is the predominant method of international travel. Primary results showed that, in the best-case scenario, screening has the potential to detect 46.4%, 12.9% and 4.0% of travellers infected with influenza, SARS and Ebola respectively, while screening for COVID-19 could potentially detect 12.0% of infected travellers. We compare our results to those already in the published literature.


Introduction
In 1347, the international trade routes throughout medieval Europe and Asia made possible one of the most devastating pandemics recorded. The "Black Death" was an outbreak of bubonic plague (caused by the bacterium Yersinia pestis), which is currently believed to have been Asiatic in origin [1]. From Asia, it is thought to have made its way into the Crimea, where it is assumed to have infected fleas residing on rats. These rats then boarded merchant ships heading for Western Europe. The disease continued to spread, eventually becoming established in every country constituting modern Europe; the pandemic ultimately caused the death of 30-50% of the European population (representing 24-40 million deaths) [2].
Since then, international travel has become far more commonplace, with travel times a fraction of what they were. This rise in rapid international travel has brought with it new challenges with regard to safeguarding public health. The ability to travel between almost any two points on the planet within 24 hours has markedly increased the vulnerability of national populations to emerging and re-emerging infectious diseases, providing the potential for epidemics to rapidly develop into pandemics [3]-[5]. As such, when an outbreak does occur, populations look to their governments to safeguard their nation's public health and to reduce the risk of importation and local establishment of the disease. Indirect border screening, which typically involves measures such as thermal cameras or temperature tests combined with questionnaires to check travellers for signs of infection-related fever at the port of entry/departure, is one such contingency. In 2005, the World Health Organisation (WHO) recommended in its International Health Regulations [6] that all WHO States should have the capability to implement screening at international points of entry during times of outbreak, although the specific method of screening to be implemented is not specified. In the time since, thermal screening has become one of the most commonly deployed methods of reducing the risk of disease importation, being used during the SARS, H1N1 influenza and Ebola outbreaks of 2003, 2009 and 2014 respectively. Although widely used during the 2003 SARS outbreak, thermal screening failed to detect the majority of travellers infected with the disease, and as such, the effectiveness of border screening as a whole was called into question [7], [8]. The failure of screening during this outbreak was likely due to the fact that most infected travellers were still incubating the disease (infected without presenting symptoms) when tested. The effectiveness of border screening is therefore reliant on the probability that infected travellers are exhibiting detectable symptoms of infection by the time they are screened.
In this paper, we present a simple mechanistic model that represents the process of a traveller, infected with some known disease, attempting to undertake international travel and gain entry to a destination country where border screening is being enforced. The model is then run repeatedly using Monte Carlo simulation, capturing the stochastic nature of the various processes involved, to calculate the likelihood that an infected person would be presenting detectable symptoms upon arrival at the border of the destination country. From this, we infer the probability that border screening would be able to detect an infected individual and use this as a metric to evaluate the effectiveness of border screening. To avoid the issue of testing accuracy, we assume a "perfect" test is being used, so our results report an upper limit to this value. We demonstrate this method by using a Python package that implements the described model for simulated outbreaks of influenza, SARS and Ebola virus disease (Ebola). We also apply our model to the current COVID-19 outbreak to assess whether border screening is an effective tool to reduce the risk of disease importation. Lastly, using the output from our model, we briefly attempt to infer the outbreak situations in which border screening could be an effective method of reducing disease importation risk. Although studies currently exist which investigate the effectiveness of border screening for specific disease outbreaks using methods similar to our described model, we attempt to tackle the problem in generality and deliver consensus on the wider use of border screening; we also attempt to account for the disparity between the results obtained in this work and those in other published sources.

Assumptions
The assumptions we make are a simplification of reality. However, a benefit of using a mechanistic model is that users may add more complicated assumptions without too much inconvenience, allowing more realistic conditions to be studied where required. The assumptions made for this instance of our model are as follows:
• Persons who have become symptomatic before boarding their flight do not fly
• All persons travelling only take direct flights from their country of origin to the destination country
• Border screening only detects travellers that are symptomatic; prior to symptom onset they are not detectable
• All people attempting to cross the border are screened
• Screening does not produce false negatives
• The number of infected persons remains constant throughout the simulation (transmission and death are neglected)
• Where prior population information is not known, we assume that the time of infection likelihood for each traveller, T_exp, is uniformly distributed over a given interval prior to boarding their flight
• The incubation period distribution of the disease of interest, T_inc, is known or well approximated
• Screening detects all infected persons who have become symptomatic
• Infected people do not attempt to "game" the system by concealing their symptomology
medRxiv preprint doi: https://doi.org/10.1101/2020.07.10.20150664; this version posted July 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Methods
The model used here is as parsimonious as possible. Each model run simulates a person attempting to travel from country A on a direct flight to country B. At some time prior to their flight, this person has become infected with the disease of interest. Should they be sufficiently well at the time of boarding, i.e. not displaying symptoms, they will then take the flight and try to cross country B's border.
Monte Carlo simulation is then used to execute this model a large number of times (n) to account for the stochastic nature of each of these events. In each simulated journey, each person is randomly allocated three values: the time between exposure to infection and getting on the flight (t_exp), the time from exposure to displaying symptoms, i.e. the incubation period (t_inc), and the flight time from country A to country B (t_flight).
These times are sampled from distributions T_exp, T_inc and T_flight. The distributions are either previously known or approximated from prior epidemiological and travel data. For each person, the model then tests how far they will progress towards crossing the border into country B according to the following:
• If the incubation time is less than the time from infection to that person's flight, they are deemed to have become symptomatic before boarding their flight and therefore do not travel (either by not being well enough to fly, or by being picked up at exit screening); they exit the model and are recorded as a non-flier
• Otherwise, if their incubation time is less than the length of time from infection until that person's flight lands in country B, they are deemed to have become symptomatic in transit. They will therefore be detectable at country B's border; by our assumptions, that person is detected on arrival and exits the model, being recorded as a border detection
• Otherwise, their incubation time has exceeded the time taken for them to become infected, fly from country A to country B, and cross the border; they then exit the model and are recorded as a non-detected case.
We visualise this below:

Figure 1: Depiction of how individuals in the model are evaluated. Note that person alpha becomes symptomatic prior to their flight, hence becoming a non-flier, and person beta becomes symptomatic in transit and will thus be detected on arrival. Person gamma does not become symptomatic by the time they arrive, and therefore enters country B as an undetected case.
A pseudo-code breakdown of the algorithm is included in the supplementary text. We can then use Monte Carlo simulation to approximate the screening success rate.
A Python package implementing the above model (which has been used to calculate the presented values in the next section) has been produced by the author and made openly available online [9].
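The three-way classification described above can be sketched in a few lines of Python. This is a minimal illustration written for this text, not the code of the package cited in [9]: the function names, the gamma incubation distribution and its parameters are illustrative assumptions, and the success rate is assumed here to be detections as a share of the travellers who boarded.

```python
import random
from collections import Counter


def classify_traveller(t_exp, t_inc, t_flight):
    """Apply the three model rules to one infected traveller (times in hours)."""
    if t_inc < t_exp:
        return "non_flier"         # symptomatic before boarding
    if t_inc < t_exp + t_flight:
        return "border_detection"  # became symptomatic in transit
    return "non_detected"          # still incubating on arrival


def simulate(n, sample_exp, sample_inc, sample_flight, seed=None):
    """Run n simulated journeys and count the three outcomes."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n):
        outcome = classify_traveller(
            sample_exp(rng), sample_inc(rng), sample_flight(rng)
        )
        counts[outcome] += 1
    return counts


def success_rate(counts):
    """Detections as a share of travellers who boarded (assumed definition)."""
    fliers = counts["border_detection"] + counts["non_detected"]
    return counts["border_detection"] / fliers if fliers else float("nan")


counts = simulate(
    n=100_000,
    sample_exp=lambda rng: rng.uniform(0, 72),           # 0-72 h exposure window
    sample_inc=lambda rng: rng.gammavariate(2.0, 24.0),  # hypothetical incubation, hours
    sample_flight=lambda rng: rng.uniform(3, 5),         # short-haul flight
    seed=1,
)
print(dict(counts), round(success_rate(counts), 3))
```

Passing the three samplers as functions keeps the sketch agnostic about the distributions, so a different disease or exposure scenario only requires different sampler arguments.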

Simulated outbreaks
We now apply the model described above to simulated outbreaks of Ebola, SARS and influenza. It is worth recalling that our assumptions (that all travellers are screened, and that border screening detects all symptomatic travellers) reflect the ideal situation. As such, our results represent the best possible outcome from implementing border screening.
For each disease, we have 9 different scenarios made up of combinations of three differing inputs for the exposure scenario and three differing inputs for the distribution of flight times. Longer exposure windows could be considered, but in this model system, after 2 weeks the vast majority of individuals would not be able to fly. Our exposure scenarios are described by:
• A uniform distribution between 0 and 72 hours before flight (emulating a scenario where travellers who are infected could only have been infected very recently, such as during a business trip)
• A uniform distribution between 0 and 168 hours before flight (emulating a scenario where travellers who are infected may have become infected any time over the past week, such as during a short break or conference)
• A uniform distribution between 0 and 336 hours before flight (emulating a scenario where travellers who are infected may have become infected any time over the past 2 weeks, such as during a long trip or where travellers are residents of country A)
Inputs for flight time are described by:
• A uniform distribution between 3 and 5 hours (emulating a scenario where the disease outbreak has originated/established in a country connected by short haul flights)
• A uniform distribution between 9 and 11 hours (emulating a scenario where the disease outbreak has originated/established in a country connected by medium haul flights)
• A uniform distribution between 15 and 17 hours (emulating a scenario where the disease outbreak has originated/established in a country connected by long haul flights)
Shorter flight times were not included as screening was even less effective. Each scenario was simulated using 100,000 Monte Carlo runs.
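The nine scenarios listed above are simply the Cartesian product of the three exposure windows and the three flight-time ranges. A minimal sketch of how such a grid could be enumerated is given below; the dictionary keys and variable names are illustrative assumptions, not identifiers from the published package.

```python
from itertools import product

# Bounds (in hours) of the uniform distributions listed in the text.
EXPOSURE_WINDOWS = {"0-3 days": (0, 72), "0-7 days": (0, 168), "0-14 days": (0, 336)}
FLIGHT_TIMES = {"short haul": (3, 5), "medium haul": (9, 11), "long haul": (15, 17)}
N_RUNS = 100_000  # Monte Carlo runs per scenario

# One entry per (exposure window, flight time) combination: 3 x 3 = 9 scenarios.
scenarios = [
    {
        "exposure": exposure,
        "flight": flight,
        "exp_range": EXPOSURE_WINDOWS[exposure],
        "flight_range": FLIGHT_TIMES[flight],
        "n": N_RUNS,
    }
    for exposure, flight in product(EXPOSURE_WINDOWS, FLIGHT_TIMES)
]

for s in scenarios:
    print(s["exposure"], "|", s["flight"], "|", s["exp_range"], s["flight_range"])
```

Each scenario would then be run once per disease, with the incubation distribution supplied separately.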
Figure 2 (caption truncated in extraction): ... [10], SARS [11] and Influenza [12]
These results show a very low success rate across all the scenarios for Ebola and SARS, which only approaches 15% in the most advantageous situations for screening. This finding is in line with previous work [8], [13].
In contrast, the calculated success rates for influenza are higher and suggest that there may be some benefit to screening for this disease. The difference here is driven by the markedly shorter mean incubation period for influenza compared to that of Ebola and SARS. This is explored in greater detail in the "Analysis of model" section.

COVID-19
Since the initial report of an unidentified pneumonia-causing disease in the city of Wuhan, China, COVID-19 has gone on to infect over 7.29 million people worldwide, resulting in the deaths of over 413,000 people [14]. China's government sought to contain the outbreak early on by closing all major transportation hubs (such as airports and train stations) in Hubei, the province containing Wuhan, which was the epicentre of the outbreak and where the greatest number of cases had then been detected. Since then, however, with the continued progression of the outbreak, it has become evident that this was not sufficient to completely stop the spread.
We apply our model to a scenario in the early stages of the pandemic where air transport was still available, to assess whether the border screening of passengers arriving from other countries where COVID-19 had become established would present an effective method of reducing the number of imported cases. Our calculated values are presented below:

Figure 3: Calculated success rate of screening travellers infected with COVID-19. Parameterisations for incubation distribution taken from published literature [15] (see supplementary text).
These results indicate that the success rates for screening for COVID-19 are similar to the calculated values for SARS, so we may also draw similar conclusions. In the best of all these scenarios, we may expect to detect 12% of infected persons attempting to cross the border, and at worst, less than 1%.
In the real world, where screening is unlikely to be 100% effective, these numbers are likely to be smaller still.

Analysis of model
Within the construction of our model, the success rate of border screening is inextricably linked to the characteristics of the incubation period distribution. This makes sense. Border screening is only effective in situations where the number of people detected at the border approaches the total number of infected people who board the outbound flight.
Consider a scenario where a disease causes a person to become symptomatic exactly one hour after exposure. Then, according to our assumption that persons who become symptomatic prior to their flight do not fly, infected travellers who do make it onto the plane will have at most an hour before becoming symptomatic. As all flights in our chosen scenarios take at least 3 hours, we would expect all travellers who do fly to display detectable symptoms upon arrival. In contrast, if the disease had an exceedingly long incubation period, such that persons become symptomatic exactly one month after infection, then in none of our scenarios would we expect border screening to detect any infected travellers. Of course, in reality, the distribution of incubation periods is likely to fall somewhere between these two extremes. From this we can deduce that, keeping all other factors fixed, border screening is likely to be more effective for diseases whose incubation period is of the same order as flight times and/or whose incubation distribution is more positively skewed. The graph below shows that the detection rate for each of the scenarios increases as the length of the incubation period decreases (Ebola, SARS and then influenza):

Figure 4: Plot of success rate of screening against disease (here, a proxy for average incubation period)
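The two limiting cases in the thought experiment above (an incubation period of exactly one hour and of exactly one month) can be checked directly. The sketch below is illustrative only: the function name and default arguments are assumptions, and the success rate is again taken as detections among travellers who boarded.

```python
import random


def screening_success(incubation_hours, window=(0, 72), flight=(3, 5), n=20_000, seed=0):
    """Share of boarding travellers who are symptomatic on arrival,
    for a fixed (degenerate) incubation period given in hours."""
    rng = random.Random(seed)
    detected = boarded = 0
    for _ in range(n):
        t_exp = rng.uniform(*window)     # hours between infection and flight
        t_flight = rng.uniform(*flight)  # flight duration in hours
        if incubation_hours < t_exp:
            continue                     # symptomatic before boarding: does not fly
        boarded += 1
        if incubation_hours < t_exp + t_flight:
            detected += 1                # symptomatic in transit
    return detected / boarded if boarded else float("nan")


one_hour = screening_success(1.0)         # every traveller who boards is detected
one_month = screening_success(30 * 24.0)  # no traveller is detected
print(one_hour, one_month)
```

With a one-hour incubation period the function returns 1.0, and with a one-month incubation period it returns 0.0, matching the two extremes described in the text.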
Within our model we have also included a distribution for the exposure window, which essentially encapsulates the movement of persons prior to their flight. As infection is typically an unobserved event, it would be very difficult to obtain enough data to allow a meaningful attempt to describe the likelihood of the time of infection prior to boarding a flight. Aside from the lack of data, this would be a highly complex function that would probably comprise multitudes of hidden variables. In lieu of this, we made the simplifying assumption that this likelihood is uniformly distributed. This choice represents a situation where the populations are evenly mixing prior to their flight, which is a common assumption to make. If we plot the border screening success rate against the different exposure window scenarios, we obtain Figure 5.

Figure 5: Plot of screening success rate against infection window distribution
We immediately notice that the success rate for identifying cases of influenza is constant across all three scenarios, while the success rate for SARS remains essentially constant between the second and third scenarios. This indicates that, beyond a certain exposure window length, only a fixed proportion of travellers who board their flights will be displaying symptoms by the time they arrive at country B's border; this proportion corresponds to the probability that the randomly assigned incubation period is less than the sum of the exposure and flight times (whose distribution is the convolution of the exposure and flight time distributions), given that the traveller was able to board. For each scenario, this depends only upon the incubation period, such that the longer the average incubation period, the lower the success rate; this explains why influenza so quickly reaches its fixed point while Ebola takes much longer to reach this equilibrium. This is in fact an intuitive result: increasing the exposure window only really means that infected persons intending to travel have, on average, longer to wait until their flight, in turn implying a higher chance that they will develop symptoms prior to boarding. However, aside from requiring all persons intending to undertake international travel to be quarantined for some time prior to departure, it is highly unlikely that authorities could do anything to reduce the likelihood of persons becoming infected prior to boarding their flight.
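The plateau described above can also be motivated analytically. The following is a sketch under simplifying assumptions that are not stated in this form in the text: the exposure time is uniform on [0, W], the flight time is fixed at f, the incubation period is continuous and independent of both with survival function S(u) = P(t_inc > u) and finite mean \mu, and the success rate R is the probability of detection given that the traveller boarded.

```latex
% Assumptions (illustrative): t_exp ~ Uniform(0, W), fixed flight time f,
% S(u) = P(t_inc > u), \mu = \int_0^\infty S(u)\,du < \infty,
% R = P(detected at border | traveller boarded).
\begin{align}
P(\text{board}) &= P(t_{inc} \ge t_{exp})
  = \frac{1}{W}\int_0^W S(u)\,du, \\
P(\text{detected}) &= P(t_{exp} \le t_{inc} < t_{exp} + f)
  = \frac{1}{W}\int_0^W \bigl[S(u) - S(u+f)\bigr]\,du, \\
R(W, f) &= \frac{\int_0^W \bigl[S(u) - S(u+f)\bigr]\,du}{\int_0^W S(u)\,du}
  \;\xrightarrow[W \to \infty]{}\; \frac{1}{\mu}\int_0^f S(u)\,du \;\le\; \frac{f}{\mu}.
\end{align}
```

Under these assumptions the limiting value depends only on the incubation distribution and the flight time, which would be consistent with the fixed proportion seen once the exposure window is long relative to the incubation period. When f is short compared with typical incubation periods, S(u) is close to 1 on [0, f], so R is approximately f/\mu: roughly linear in flight time and inversely proportional to the mean incubation period.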
The last distribution implemented in our model is used to represent the range of flight times from the infected group's country of origin to the destination country. This is again something over which there is little direct control. Looking at the output of our model, though, we can see that this variation implies a clear pattern. For the limited results we have, there appears to be some linearity in the relationship between border screening success and the average flight time. This again makes sense: the longer each infected person is on the plane, the longer they have to become symptomatic. In effect, the flight time is a micro-quarantine. The gradient for each disease shows some sign of convergence, which may be related to the convergence to the fixed detection ratio seen when varying the exposure window distribution.
It should also be noted that our model, as discussed, does not specifically address the issues of exit screening. This was to again allow our model to deal only with the best-case scenario (sick people not flying is equivalent to saying that exit screening is 100% effective).

Discussion
From this model, we see that the success rate of screening infected travellers is positively correlated with travel time and inversely correlated with the average incubation period. The implication of this is that screening is best utilised during outbreaks of diseases which have a short incubation period.
Results also suggest that screening is proportionally more effective when the travellers are subject to a longer exposure window, as this permits each infected traveller more time (on average) to incubate prior to boarding their flight, but this is simply an effect of prevalence rates.
Much literature already exists which explores the effectiveness of implementing screening for various diseases. As such, we take the opportunity to compare our results against those already published. In an editorial published in 2014 [13], an approach similar to the one described in this paper (including the assumption that screening detects all symptomatic cases) was used to calculate the expected detection rate of Ebola-infected travellers who had been randomly infected at some time before travelling. The linearity in the results presented implies that, for every hour travelled, one may expect an additional 0.76% of infected travellers who managed to board their flights to be detected at country B's border. If we assume linearity in our results, we find that, by our model, every extra hour travelled increases the rate of detection by 0.2505%, 0.0633% and 0.0028% for the 0-14 day, 0-7 day and 0-3 day exposure windows respectively, representing a significant difference. This difference most likely stems from the distribution of incubation periods, which in [13] is quoted as having a mean of 9.1 days with a SD of 7.3, with no indication as to how this is distributed, compared to our distribution (taken from [16]), which has a mean of 12.5 days and a SD of 4.35.
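A per-hour gradient of the kind quoted above can be obtained by fitting a straight line to the success rates at the mean flight times of the three flight scenarios (4, 10 and 16 hours). The sketch below uses ordinary least squares as one plausible way of doing this; the text does not state which fitting method was used, the function name is hypothetical, and the success rates in the example are synthetic values chosen to be exactly linear, not results from this paper.

```python
def slope_per_hour(flight_hours, success_rates):
    """Ordinary least-squares slope of success rate against mean flight time."""
    n = len(flight_hours)
    if n != len(success_rates) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(flight_hours) / n
    mean_y = sum(success_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in flight_hours)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(flight_hours, success_rates))
    return sxy / sxx


# Mean flight times of the short, medium and long haul scenarios (hours).
mean_flight_hours = [4, 10, 16]
# Synthetic, exactly linear success rates in percent; NOT results from the paper.
synthetic_rates = [2.0, 5.0, 8.0]
slope = slope_per_hour(mean_flight_hours, synthetic_rates)
print(slope)  # percentage points of success rate gained per extra flight hour
```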
We now consider another paper, by Pitman et al. [17], which uses similar modelling to assess the effectiveness of implementing border screening for both influenza and SARS. Here, the authors only consider scenarios where travellers have become infected at a fixed time of 1, 2 or 3 days before their flight, so we may only meaningfully compare this to our calculated values for SARS and influenza in the scenario where we use UNIF(0, 72) as the distribution of time of infection. Our calculated success rate for SARS largely agrees with these published values (noting that the referenced paper uses continents to represent average flight time, from which we infer Europe: 2-4 hours (1% success rate), Middle East: 4-6 hours (3% success rate), Africa and North America: 5-7 hours (4% success rate), and East Asia: 11-13 hours (6% success rate)). In contrast, across the given scenarios, there is some discrepancy between the success rates of screening for influenza (Europe: 4%, Middle East: 10%, Africa: 12%, North America: 13%, and East Asia: 17%). Given the variability between different strains of flu, this could very well be due to the authors using some other distribution to describe the incubation period (an explicit description of which is not included in the article; though the method of derivation is referenced, the data used for this derivation are not available). Relatedly, another article attempts to infer the success rate of screening implemented in New South Wales, Australia, during the 2009 H1N1 influenza outbreak [18]. The authors estimate a success rate of 6.67%. That study did not use modelling, but instead took the ratio of cases successfully detected whilst trying to cross the border to the number of H1N1 cases detected in Australia that were deemed to have acquired the disease abroad. As these figures stem from real data, the differences between the success rates reported in that paper and ours could originate from a number of sources.
In particular, the sensitivity of screening to detect symptomatic cases is likely to be less than 100%. Also notable is the small number of cases in the study; 45 influenza cases (thus, only 3 detected by screening).
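To give a sense of how much uncertainty a sample of 45 cases carries, the sketch below computes a Wilson score interval for the proportion 3/45. This calculation is an illustration added here and is not taken from [18]; the function name is hypothetical, and the choice of the Wilson interval (rather than another binomial interval) is an assumption.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# 3 cases detected by screening out of 45 imported influenza cases.
low, high = wilson_interval(3, 45)
print(f"{3 / 45:.2%} detected; approx. 95% interval {low:.1%} to {high:.1%}")
```

The resulting interval runs from roughly 2% to roughly 18%, so the point estimate of 6.67% is compatible with a wide range of underlying detection rates.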
Lastly, we consider a more recent publication which assesses the effectiveness of implementing border screening at an early stage of the COVID-19 outbreak [19]. In that paper, the authors estimated that screening for COVID-19 had a potential success rate of 46%. This again represents a sizeable discrepancy with our results. The discrepancy seems to stem from a combination of three features of that model: it allows infected travellers to be asymptomatic (and thus not detectable under any circumstances), it assumes imperfect exit screening, and it includes an additional traveller state, the only one in which severe symptoms become observable. By considering this advanced stage of disease, the model allows for a (on average) longer window of time from the point of infection during which travellers may successfully board their flight. Some of those who board will in fact be symptomatic (as permitted by the error in exit screening) and will therefore be detected by entry screening. Although that model allows for imperfect exit screening (where we assumed that no symptomatic travellers boarded their flight, the authors allow a probability that some still manage to travel), we do not believe this has a large enough impact to explain the vast divergence in results. The authors report that roughly 17% of cases are asymptomatic (further supported by subsequent publications [20]) and this would therefore account for the majority of the variation. But while a recent literature review has found that asymptomatic cases may make up anywhere from 5% to 80% of total infections [21], the transmissibility of COVID-19 from asymptomatic cases to susceptible persons is unclear. If transmission from asymptomatic cases is a rare event, then, as they would be unlikely to cause further infections, one could argue that it would be better simply to discount asymptomatic cases from further consideration.

Conclusion
In this paper, we have presented a model which uses Monte Carlo simulation to estimate the upper limit of the success rate of implementing border screening (of any form) to detect travellers infected with various diseases. While similar work has previously been produced for specific disease outbreaks, the model presented here can be taken forward and applied to any disease for which the distribution of incubation periods is either well known or approximated, and can therefore be used to assess the effectiveness of screening in many other situations. In addition, the described model takes inputs describing typical flight time, and allows users to give some detail about travellers' risk of infection prior to their flight (although, as highlighted in the model analysis section, this may be somewhat difficult to describe accurately). This provides a general tool which can be used to assess the effectiveness of implementing border screening as a contingency method across a broad spectrum of scenarios.
The model is mechanistic in nature, opting to use simulation rather than distribution theory to obtain success rates. This makes the model much more flexible, allowing additional events to be accommodated easily. The model is presented as a Python package with a permissive license so that others may use this work and adapt it to their own circumstances.
We have also examined the effectiveness of implementing screening during outbreaks of four diseases (including COVID-19) across scenarios where the outbreak has established in countries of varying distance away. Notably, the output of our model indicated exceptionally low rates of success across all diseases and scenarios. This would therefore indicate that other methods to reduce risk of importation are needed. We then compared our model output to values reported in existing literature, attempting to account for any discrepancies.
Using our model may remove the need to reproduce significant amounts of work for marginally different disease scenarios, while providing public health teams with a flexible tool to assess the impact of implementing screening which can quickly be adjusted and run with minimal effort.