Population heterogeneity is a critical factor of the kinetics of the COVID-19 epidemics

The novel coronavirus pandemic has attracted extensive attention in political and scholarly domains. Its potentially lasting prospects and its economic and social consequences call for a better understanding of its nature. The widespread expectation that a large portion of the population must be infected or vaccinated before the COVID-19 epidemic is contained relies on assuming a homogeneous population. In reality, people differ in their propensity to catch the infection and spread it further. Here, we incorporate population heterogeneity into the Kermack-McKendrick SIR compartmental model and show that the cost of the pandemic may be much lower than usually assumed. We also indicate the crucial role of correctly planning lockdown interventions. We find that an efficient lockdown strategy may reduce the cost of the epidemic to as low as several percent in a heterogeneous population. That level is comparable to prevalences found in serological surveys. We expect that our study will be followed by more extensive data-driven research on epidemiological dynamics in heterogeneous populations.


Introduction
Because of the novelty and urgency of the situation, epidemiological models inform decision-making [1,7-13] in addressing the COVID-19 pandemic. Those models indicate high contagiousness of the virus and raise concerns that the majority of the population will be infected (if not vaccinated). The basic reproduction number $R_0$ of the pandemic at its beginning was estimated to be around 3 [9,14-16], which implies that a share $1 - 1/R_0$, i.e., about 67 percent, of the population must be infected or vaccinated before the infection may be controlled without lockdown measures. This conclusion has affected mitigation policies in many countries; it has also contributed to expectations of recurrent waves of the epidemic.
Those models, however, ignore varying social engagement, epidemic awareness, and hygiene preparedness that, along with other factors, contribute to the varying propensity to contract the disease and spread it to others. Various reports suggest that 10-20 percent of cases may be responsible for 80 percent of COVID-19 transmissions [17][18][19]. These findings show that while the majority of people may barely contribute to the spread of the epidemic, thanks to either limited social engagement or higher alertness and better hygiene, a few others may become superspreaders who infect dozens of people. An essential practical conclusion from these findings was a call to aim mitigation policies at superspreaders in order to reduce the basic reproduction number (the average number of secondary infections per initially infected person) and contain the spread of the infection. Another aspect of heterogeneity, however, may require revising that conclusion and reassessing the prospects of the pandemic and mitigation policies. Population heterogeneity is an essential player in the kinetics of the epidemic: when the minority that contributes most to the spread of the virus contracts the disease and develops immunity, the outbreak may abruptly come to an end before the expected majority is affected. Differential contagiousness also matters for how to manage lockdown policies and for whether to expect recurrent waves of the epidemic after the lifting of social isolation measures or after Autumn cooling. Furthermore, population heterogeneity may shorten the course of the outbreaks, because those with higher social engagement will also be the first to catch the infection.

Results
In Figure [...] reproduction number of three, the epidemic could have been checked after 67 percent of the population gets infected. In reality, that threshold may be surpassed thanks to the momentum gained by the spread of the infection. In the heterogeneous cases, the total numbers infected are also substantial (27.6 and 14.1 percent, respectively) but much lower than in the homogeneous case. The peak levels of the infected population are also much higher (56.1 percent) in the homogeneous population than in the heterogeneous ones (12.1 and 5.8 percent). Also note that the more heterogeneous the population, the earlier the peak of the epidemic. The intuition behind this observation is that the faster infection (and recovery) of the superspreaders accelerates the epidemic in its early phase while slowing it down in a later phase.
In panes b-d, we present results for three timing options for the lockdown that lasts over 28 days and reduces the spread of the virus by 90 percent. When started too early (day 30, pane b), the lockdown leaves too many people susceptible to the virus and facilitates a substantial second wave. The total infected population is nearly the same as in the no-lockdown variant for the homogeneous population but considerably lower for the heterogeneous populations.
With better timing of the lockdown, the long-term costs of the epidemic are much lower. The lockdown presented in pane c (starting on day 39) is optimal for the more heterogeneous population, which then experiences no second wave (and the total number infected is minimal at 4.9 percent). That lockdown, however, is still too early for the less heterogeneous population, where a moderate second epidemic wave develops and leads to a total of 15.8 percent infected (a substantially higher cost than the minimal cost of 11 percent associated with the lockdown starting on day 44). In the homogeneous case, the second wave is even higher, and almost everybody (93.3 percent), again, gets infected. Only a later lockdown that starts on day 55 (pane d) produces the optimal result for the homogeneous population (68.8 percent infected).
While the first wave of the epidemics in the heterogeneous cases is earlier and more compressed as compared to the homogeneous case, the second wave, on the contrary, is later and more stretched out. Even if beneficial in terms of a lower peak, an extended small second wave may misguide the policymaker about the long-term efficiency of the lockdown measures.
Heterogeneous scenarios show much lower long-term costs of the epidemic and peak levels of the infected as compared to the traditional homogeneous case. If the lockdown had been more selective, better protecting the non-spreading population, those numbers could have been even smaller. Indeed, the epidemic could, in principle, be contained after most of the superspreaders were infected (bringing total infected populations down to about 1. 5 [...]

Discussion
Expectations that about 70 percent of people may be infected before containing the pandemic were implicitly based on assuming population homogeneity. Contrary to those expectations, we show that the population heterogeneity may bring that threshold level down to as few as 14 percent with a similar basic reproduction number. Population heterogeneity, it appears, may even outweigh the vaccination in its importance as a factor checking the spread of the disease. We urgently need to fully understand the extent and nature of how people differ in susceptibility to the infection and the ability to spread it and appreciate that in our decision-making.
In the long run, a lower number of people infected means fewer casualties of the virus. In the short run, however, lockdown policies around the world take the capacity of the healthcare system into account too. In that context, it is notable that population heterogeneity also reduces the peak levels of the infected population (from 56 percent, as in the homogeneous case, to 6-12 percent).
Lockdown, when well-scheduled, is capable of substantially reducing the cost of the outbreak. The timing of the lockdown is crucial in all scenarios. When prematurely implemented, the lockdown leaves too large a portion of the population susceptible to the infection, which results in the second wave of the epidemic. In such cases, the epidemic may gain momentum and eventually lead to nearly the same total number of infected persons as in the case of no lockdown. The second wave appears to stretch over a more extended period in the heterogeneous cases, which may misguide policymakers in their assessment of the efficiency of the lockdown. Too late a lockdown, however, is also inefficient, because it allows for many avoidable infections.
In the optimal lockdown strategy, one should wait until the susceptible proportions fall to levels at which the instantaneous reproduction number reaches unity. After reaching that threshold, the lockdown measures should be implemented with the maximal possible strength to cool off the epidemic's momentum and halt the further spread of the virus. To design such an optimal policy response, however, it is essential to understand the kinetics of the epidemic well and assess the threshold correctly. That includes accounting for the role of population heterogeneity.
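The threshold rule can be illustrated with a short sketch. It is a hedged illustration, not the paper's own formula: it assumes population types $j$ with communicability $k_j$, scaled so that $k_j$ equals the number of secondary infections caused by one infected person of type $j$ in a fully susceptible population, and it assumes that a new outbreak seeds each type in proportion to $k_j$ times its remaining susceptible share. The function names `instantaneous_r` and `lockdown_due` are hypothetical.

```python
def instantaneous_r(k, n, s):
    """Reproduction number of a new outbreak under the assumptions above.

    k[j]: communicability of type j
    n[j]: initial population share of type j
    s[j]: share of the population that is of type j and still susceptible
    Assumed form: sum_j k_j^2 * s_j / sum_j k_j * n_j.
    """
    numerator = sum(kj * kj * sj for kj, sj in zip(k, s))
    denominator = sum(kj * nj for kj, nj in zip(k, n))
    return numerator / denominator


def lockdown_due(k, n, s):
    """True once the instantaneous reproduction number has fallen to unity."""
    return instantaneous_r(k, n, s) <= 1.0


# Homogeneous population with a basic reproduction number of 3:
# the rule triggers once the susceptible share is at or below one third.
print(lockdown_due([3.0], [1.0], [0.5]))  # False
print(lockdown_due([3.0], [1.0], [0.3]))  # True
```

In the homogeneous case this reduces to the familiar condition that about 67 percent of the population is no longer susceptible; in a heterogeneous population the condition can be met much earlier, because the high-$k$ types are depleted first.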

SIR-type epidemiological models and their extensions play an important role in designing policy responses to the COVID-19 and other epidemics. Our results imply that we should further extend those models to include different predispositions to catch and spread the infection.
With an optimal lockdown strategy, the total number of infected people may be reduced to as low as five percent in the heterogeneous population. Notably, such a level of prevalence is of the same magnitude as was found in serological surveys 6 [...]
[...] will also be less affected by lockdown measures. That may contribute to the selectivity of the first type and a reduction of the eventual cost of the epidemic. On the other hand, closing down public workplaces and introducing strict social distancing, isolation, or public hygiene measures may affect superspreaders more while having limited effect, if any, on socially less engaged and initially better prepared people. That may form lockdown selectivity of the second type and increase the cost of the epidemic. Which scenario develops in reality urgently needs to be examined as countries move into the post-lockdown phase.
Long-term effects of the population heterogeneity reported here also call for revisiting the policy recommendations with respect to superspreaders. The usual recommendation is to maximize lockdown efficiency among them. Yet, we indicate that such a policy may delay but not prevent the second wave of the epidemic and may unnecessarily spread the infection further into the non-spreader population. We need to address this issue in designing social isolation policies.
The extent and type of population heterogeneity depend on many factors that need to be studied. Those include demographic factors such as age and sex, kinship structures and relations, household sizes and roles within them. Factors of heterogeneity also include biological predispositions, behavioral patterns (that, in turn, may depend on demographic circumstances, such as the presence of persons vulnerable to the disease in the household or kinship networks), educational, occupational, and income differentials, and others. A better understanding of these relations is instrumental in combating both the current emergency and other communicable diseases. To address those issues, however, we need representative and comparable statistics on how we differ in the odds of catching and then spreading the virus. Such data are barely available, and we call on statistical and healthcare agencies to urgently fill the gap in data on population heterogeneity.

Supplementary Materials and Methods
The heterogeneous SIR model
We incorporate population heterogeneity into the discrete version of the Kermack [...] We assume a symmetric model where both the propensity of catching the virus and the propensity of spreading it are proportional to the communicability parameter $k$. Hence, we model new infections as follows: [...] In modeling the course of recovery, we trace the duration of the infection period for the infected people and assume every infected person recovers in time $\tau$ after getting infected. That is, the number of people recovering in the period $t$ equals: [...]
We neglect R-S transitions from recovered to the susceptible population because such transitions have not yet been reported to play a substantial role in the COVID-19 epidemics. We also neglect the fatality of the disease because we intend to highlight the primary effects of population heterogeneity upon the overall course of the epidemic.
Introducing R-S transitions, mortality, and more realistic demographics should pose no difficulty in future research.
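A minimal simulation sketch of a model of this kind follows. The functional form is an assumption and is not necessarily identical to the paper's numbered equations: new infections of type $j$ on a given day are taken as proportional to $k_j$, to the susceptible share of that type, and to the $k$-weighted sum of currently infected people, and every infected person recovers exactly $\tau$ days after infection. The function name `simulate`, the parameter `beta`, the default `tau=10`, and the example communicability values are hypothetical; the default seed of 0.01 percent follows the figure description.

```python
from collections import deque


def simulate(shares, k, r0=3.0, tau=10, days=300, seed=1e-4, lockdown=None):
    """Return (total share ever infected, peak share simultaneously infected).

    shares[j] -- initial population share of type j (shares sum to 1)
    k[j]      -- communicability of type j (only the relative scale matters)
    tau       -- days a person stays infectious (hypothetical value)
    lockdown  -- optional (start_day, length_days, q); spread is cut by q
    """
    # Scale transmission so the population-average reproduction number is r0.
    beta = r0 / (tau * sum(kj * kj * sj for kj, sj in zip(k, shares)))
    # Seed the initial infections in proportion to k_j * shares_j.
    weights = [kj * sj for kj, sj in zip(k, shares)]
    infected = [seed * w / sum(weights) for w in weights]
    susceptible = [sj - ij for sj, ij in zip(shares, infected)]
    # Per type: new infections of the last tau days, oldest first.
    history = [deque([0.0] * (tau - 1) + [ij], maxlen=tau) for ij in infected]
    peak = sum(infected)
    for day in range(days):
        in_lockdown = (lockdown is not None
                       and lockdown[0] <= day < lockdown[0] + lockdown[1])
        scale = 1.0 - lockdown[2] if in_lockdown else 1.0
        # Infection pressure from everyone currently infected, weighted by k.
        force = scale * beta * sum(kj * ij for kj, ij in zip(k, infected))
        for j in range(len(shares)):
            new = min(susceptible[j], k[j] * susceptible[j] * force)
            recovered = history[j][0]   # infected exactly tau days ago
            history[j].append(new)      # maxlen drops the oldest entry
            susceptible[j] -= new
            infected[j] += new - recovered
        peak = max(peak, sum(infected))
    return sum(shares) - sum(susceptible), peak


# Homogeneous population versus a hypothetical two-type population.
hom_total, hom_peak = simulate([1.0], [1.0])
het_total, het_peak = simulate([0.9, 0.1], [1.0, 10.0])
print(f"homogeneous: total {hom_total:.3f}, peak {hom_peak:.3f}")
print(f"heterogeneous: total {het_total:.3f}, peak {het_peak:.3f}")
```

Both runs share the same basic reproduction number, so any difference in the totals comes only from the composition of the population. A lockdown can be explored by passing, for example, `lockdown=(30, 28, 0.9)` for a 28-day intervention that starts on day 30 and cuts the spread by 90 percent.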
Assuming the entire original population is susceptible, the number of secondary infections per one initially infected person of type $j$ over the communicability period $\tau$ may be found from (3): [...] here, [...]
[...] that halt the new infections completely. Eqs. (5) and (10) lead to the following closed-form solution for the evolution of the susceptible population of type [...] The higher the communicability parameter $k$, the faster the fall of the susceptible population in (11). That creates compositional change in the remaining susceptible population, a change that suppresses the communicability-weighted susceptible population $J(t)$ and checks the spread of the epidemic.
In generating and interpreting results of simulation scenarios, it is useful to relate the model parameters to the commonly used basic reproduction number $R_0$. To establish the relation, assume the initial distribution of infected people follows the model relation (3) and is proportional to the weighted populations of each type: [...] That is, the basic reproduction number is the weighted average of the communicability parameter with weights equal to the weighted susceptible populations of each type.
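The weighted-average relation stated above can be illustrated numerically. The sketch below is an illustration under a stated assumption rather than the paper's equation (12): it assumes $k_j$ is scaled so that it equals the number of secondary infections caused by one infected person of type $j$ in a fully susceptible population. The function name and the example values are hypothetical.

```python
def basic_reproduction_number(k, n):
    """Average of k_j with weights k_j * n_j.

    k[j]: communicability of type j
    n[j]: susceptible population (or share) of type j at the start
    """
    weights = [kj * nj for kj, nj in zip(k, n)]
    return sum(kj * wj for kj, wj in zip(k, weights)) / sum(weights)


# A single type reproduces its own k; mixing types pulls the average
# towards the high-k types because they carry larger weights.
print(basic_reproduction_number([3.0], [1.0]))
print(basic_reproduction_number([1.0, 10.0], [0.9, 0.1]))
```

Because the weights grow with $k_j$, a small group with high communicability can dominate the basic reproduction number even when it is a small share of the population.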
At an advanced phase of the epidemic, substantial portions of the population move to infected or recovered compartments, and the instantaneous reproduction number of a new outbreak decreases to: [...], a new outbreak may be contained without a lockdown. [...] subpopulation is set at such a level that the population-average basic reproduction number (12) equals 3. We form the heterogeneous populations in such a way that 10 or 20 percent of infected people are responsible for 80 percent of the further spread of the epidemic, similar to what was reported in the literature [17][18][19]. Assuming that $x$ percent of infected are responsible for $y$ percent of transmissions, the communicability parameter of non-spreaders ($k_1$) and superspreaders may be found from (12) as: [...], in which case our model turns to the conventional SIR model. In Fig. S1, we present sizes of the three population compartments in four selected simulations for the homogeneous population: no lockdown intervention (pane (a)); lockdown reducing the [...] similarly timed (pane (c)) leads to near-optimal results. Indeed, such an 'optimality' of the lockdown ignores infection fatality and healthcare systems' capacity that has become a concern in many countries.
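The two-type calibration described above, in which a share $x$ of the infected is responsible for a share $y$ of transmissions, can be sketched as follows. This is a hedged reconstruction from the verbal description, not the paper's own expression: it assumes $k_j$ equals the number of secondary infections per infected person of type $j$, that the basic reproduction number is the average of $k_j$ over the initially infected, and that $x$ and $y$ are given as fractions rather than percentages. The function name `calibrate_two_types` is hypothetical.

```python
def calibrate_two_types(x, y, r0=3.0):
    """Return (k1, k2) for non-spreaders and superspreaders.

    x  -- fraction of the infected who are superspreaders (0 < x < 1)
    y  -- fraction of transmissions they cause (0 <= y <= 1)
    r0 -- target basic reproduction number, (1 - x) * k1 + x * k2
    """
    if not (0.0 < x < 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("x must lie in (0, 1) and y in [0, 1]")
    k2 = y * r0 / x                    # superspreaders carry the share y
    k1 = (1.0 - y) * r0 / (1.0 - x)    # non-spreaders carry the rest
    return k1, k2


# 20 percent of the infected causing 80 percent of transmissions.
k1, k2 = calibrate_two_types(0.2, 0.8)
print(k1, k2)
```

Under these assumptions, setting $y = x$ gives $k_1 = k_2 = R_0$, that is, a homogeneous population.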
In Fig. S2 [...] Table S1. Even with only half of the people being non-spreaders, and with no lockdown, the long-term cost of the epidemic and the peak number of the infected people decrease by more than 30 percent as compared to the homogeneous population. In an extreme case where 99.9 percent of people are non-spreaders, the long-term cost of the epidemic is only about five percent without any policy intervention. The peak level of the infected population also falls dramatically as the population heterogeneity increases.
[...] (pane (c)), and the optimal lockdown reducing the spread by 99.1 percent that starts at $t = 57$ (pane (d)). All lockdowns last for 14 days. Vertical axis: population size starting with 1000 original population. Horizontal axis: time in days from the original infection of 0.01 percent of people.

$q$ is the strength parameter of the lockdown; $I$ is the eventual proportion of the population infected throughout the course of the epidemic.
medRxiv preprint doi: https://doi.org/10.1101/2020.06.25.20140442; this version posted June 26, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.