Reaching collective immunity for COVID-19: an estimate with a heterogeneous model based on the data for Italy

Background. At the current stage of COVID-19 pandemic, forecasts become particularly important regarding the possibility that the total incidence could reach the level where the disease stops spreading because a considerable portion of the population has become immune and collective immunity could be reached. Such forecasts are valuable because the currently undertaken restrictive measures prevent mass morbidity but do not result in the development of a robust collective immunity. Thus, in the absence of efficient vaccines and medical treatments, lifting restrictive measures carries the risk that a second wave of the epidemic could occur. Methods. We developed a heterogeneous model of COVID-19 dynamics. The model accounted for the differences in the infection risk across subpopulations, particularly the age-depended susceptibility to the disease. Based on this model, an equation for the minimal number of infections was calculated as a condition for the epidemic to start declining. The basic reproductive number of 2.5 was used for the disease spread without restrictions. The model was applied to COVID-19 data from Italy. Findings. We found that the heterogeneous model of epidemic dynamics yielded a lower proportion, compared to a homogeneous model, for the minimal incidence needed for the epidemic to stop. When applied to the data for Italy, the model yielded a more optimistic assessment of the minimum total incidence needed to reach collective immunity: 43% versus 60% estimated with a homogeneous model. Interpretation. Because of the high heterogeneity of COVID-19 infection risk across the different age groups, with a higher susceptibility for the elderly, homogeneous models overestimate the level of collective immunity needed for the disease to stop spreading. This inaccuracy can be corrected by the homogeneous model introduced here. To improve the estimate even further additional factors should be considered that contribute to heterogeneity, including social and professional activity, gender and individual resistance to the pathogen.


Methods
We developed a heterogeneous model of COVID-19 dynamics. The model accounted for the differences in the infection risk across subpopulations, particularly the age-depended susceptibility to the disease. Based on this model, an equation for the minimal number of infections was calculated as a condition for the epidemic to start declining. The basic reproductive number of 2.5 was used for the disease spread without restrictions. The model was applied to COVID-19 data from Italy.

Findings
We found that the heterogeneous model of epidemic dynamics yielded a lower proportion, compared to a homogeneous model, for the minimal incidence needed for the epidemic to stop. When applied to the data for Italy, the model yielded a more optimistic assessment of the minimum total incidence needed to reach collective immunity: 43% versus 60% estimated with a homogeneous model.

Interpretation
Because of the high heterogeneity of COVID-19 infection risk across the different age groups, with a higher susceptibility for the elderly, homogeneous models overestimate the level of collective immunity needed for the disease to stop spreading. This inaccuracy can be corrected by the homogeneous model introduced here. To improve the estimate even further additional factors should be considered that contribute to heterogeneity, including social and professional activity, gender and individual resistance to the pathogen.

Introduction
Most countries in the world are currently severely affected by the COVID-19 epidemic 1 . The undertaken anti-epidemic restrictions have been effective, but there remains the risk of the epidemic second wave after the restrictions are lifted and people return to their normal way of life. Indeed, the potential vaccines are only being explored 2-4 , so mass vaccination of the population will not start any time soon 5 . In these conditions, epidemic dynamics depend on the basic reproductive rate (that can be lowered by anti-epidemic measures) and the number of susceptible-exposed-infectious-recovered cases 6,7 .
As the number of infections increases (slowly with restrictive measures), a population immunity state can be reached where repeated occurrences of infection only lead to dilapidated chains of successive infections, not mass infections [8][9][10] . It is important to be able to forecast such an event when deciding to lift the anti-epidemic restrictions 11,12 .
Here we developed a mathematical model for assessing the minimum incidence of COVID-19 needed to reach collective immunity, which would assure that the epidemic cannot restart the cessation of quarantine measures. The key feature of our model is that it is heterogeneous, that is it accounts for the differences in the infection risk across affected subpopulations. This is important for COVID-19 because the severity of this disease strongly depends on age 1,13,14 . As such, our model offers more precise estimates compared to the commonly used homogeneous models.

Evidence before this study
We conducted a search of MedRxiv, PubMed, and BioRxiv for articles published in English from inception to May 9, 2020, with the keywords "COVID-19", "SARS-CoV-2","reproduction number", "R0", "age", "homogeneous model", and "heterogeneous model". While this search yielded several useful references regarding COVID-19 modeling, the basic reproduction number of this disease, and age-related heterogeneity, we did not find an approach similar to ours to modeling COVID-19 dynamics and estimating the total incidence and population immunity. Therefore, our study provides a novel model that estimates these metrics accurately and with a minimal number of model parameters.

Added value of this study
Our report the results obtained with a heterogeneous model, which is different from the commonly used homogeneous models of the epidemic dynamics. This is especially important for COVID-19, whose risks are strikingly age-dependent. With these modelling results, we contribute to the demanding issue of lifting anti-epidemic restrictions, the measure that could result in the second wave of the epidemic because of an insufficient level of collective immunity. In addition to assessing the minimum collective immunity level, our report provides a framework for incorporating additional sources of heterogeneity and for monitoring the epidemic dynamics for different subpopulations. As such, our model is an important tool that could be useful for making crucial decisions, which eventually will result in saving human lives.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Implications of all the available evidence
Criteria for lifting restrictions (when and how) are critical for the overall success of anti-epidemic measures. Our contribution to this issue is the assessment of the level of collective immunity that assures that the disease would nor restart when the restrictions are lifted. The assessment was made using a heterogeneous model that accounted for the age-related differences in COVID-19 infection risk. Future research should consider additional sources of heterogeneity, such as heterogeneity associated with social and professional activity, gender and individual resistance to the pathogen.

Methods
We reasoned our model the following way. The simplest Kermack-McKendrick susceptibleinfected-recovered (SIR) model of an epidemic 15 assumes that each member of the population can be in one of three states: susceptible, infected, recovered (or immune), and the probability of transition between the states is the same for all members of the population. The epidemic dynamics is then described by the differential equations: (1) where S and I are susceptible and infected subpopulations, respectively. The term R 0 is the basic reproductive number, that is the average number of people infected by one infected person with 100% susceptibility. Furthermore, I  is the recovery intensity for the infected people, 1/β is the average duration of the disease, and This representation can be further improved by the susceptible-exposed-infectiousrecovered (SEIR) model that accounts for the fact that there is a period when after being infected a person does not infect the others 6,7 . Additionally, not only the contagiousness but also the probability of recovery (or death) depends on the time elapsed since the infection. To introduce these factors, let J (t) be the intensity of the emergence of new cases, 0 J R SI  = .
Then the system dynamics are described by the equations: Anti-epidemic activities like social isolation can be factored in by considering R 0 to be timedependent rather than constant. Then, morbidity decreases if the recovery flow is higher than the infection flow: . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 25, 2020. ; Taking 0 = 2.5 as an estimate for the basic reproductive number, the susceptible subpopulation, S, should be less than 40% for the epidemic to decline. In other words, 60% of the population should contract the disease.
This assessment, however, overestimates the expected morbidity because it does not take into account the presence of low and high-risk groups. Let h be an individual risk of infection, so that the average value of h for the population is 1 (

t S h t h c h T J h t T dT p h dh S h t hf t dt
The condition for the epidemic decline is then: Here we propose using this heterogeneous-model approach for the assessment of COVID-19 population immunity level. Italy is taken as an example.

Results
Since the risk of COVID-19 is highly dependent on age 13,14 , we considered an epidemic model where the individual risk of infection was a function of age. At the initial stage of an epidemic the proportion of infected persons is low. Therefore, individual risk, h, is proportional to the number of infections in each age group, and the distribution of h can be estimated from the distribution of morbidity by age.  Table 1 presents this assessment for the data from Italy. We calculated h as being proportional to morbidity; the average h for the entire population is 1. It can be seen from the last column of the table that people over 80 have the risk that is 2.5 times higher than the average, while the risk in children and adolescents is 15 times lower than the average.
Next, for the point where the epidemic starts to decline, we can assess the proportions of immune subpopulations by age using the assumptions that

Ft
. These results are presented in Table 2. Notably, the distribution is highly nonuniform, and the average proportion of people with immunity is approximately 43% instead of 60%, as estimated with the homogeneous model.

Discussion
Here we looked into the issue of assessing the level of collective immunity [16][17][18][19] , which would be safe for lifting anti-epidemic restrictions. We assumed that the basic reproductive number would be approximately 2.5 without restrictions [20][21][22] . Our assumptions were based on a heterogeneous model which, unlike the commonly used homogeneous models, accounted for the age-related differences in infection risk. We used the COVID-19 data from Italy to generate the estimates. The heterogeneous model yielded the minimum level of 43% for collective immunity to stop the epidemic, which is different from the 60% level predicted by a homogeneous model. This difference is related to the highly uneven morbidity across the age groups, with the elderly being the most affected. The estimated 43% correspond to all forms of the infectious process development, including both symptomatic and asymptomatic cases. Given the estimated proportion of manifested cases of 50% of the total, we estimate the minimum proportion of . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 25, 2020. ; https://doi.org/10.1101/2020.05.24.20112045 doi: medRxiv preprint Italians who have to suffer coronavirus infection before the nation can return to the normal lifestyle as 20%.
For a better assessment of the epidemic dynamics, additional sources of heterogeneity should be considered 23,24 , including heterogeneity by social and professional activity, heterogeneity associated with sex and heterogeneity associated with individual resistance to the pathogen. These sources of heterogeneity could lower the minimum collective immunity even further. For example, for Italy this number could be lower than 20%.