Detailed reconstruction of the Iranian COVID-19 epidemic reveals high attack rates of SARS-CoV-2 in several provinces

Since the first cases of COVID-19 were reported in Qom, Iran, almost 19 months ago, the transmission dynamics across the country and the health burden of COVID-19 has remained largely unknown due to the scarcity of epidemiological analyses and lack of provincial data on the number of COVID-19 cases and deaths. For the first time, we reconstruct the epidemic trajectory across the country and assess the level of under-reporting in infections and deaths using province-level age-stratified weekly all-cause mortality data. Our estimates suggest that as of 2021-09-17, only 48% (95% confidence interval 43-55%) of COVID-19 deaths in Iran have been reported. We find that in the most affected provinces such as Qazvin, Qom, and East Azerbaijan approximately 0.4% of the population have died of COVID-19 so far. We also find significant heterogeneity in the estimated attack rates across the country with 11 provinces reaching close to or higher than 100% attack rates. Despite a relatively young age structure in Iran, our analysis reveals that the infection fatality rate gradually increased over time in several provinces and reached levels that are comparable some of the high-income countries with a larger percentage of older adults, suggesting that limited access to medical services, coupled with undercounting of COVID-19-related deaths, can have a significant impact on COVID-19 fatalities. These results also show that despite several waves of infection and high attack rates in many provinces with largely unmitigated epidemics, herd immunity through natural infection has not been achieved.


Introduction
Iran was among the first countries outside mainland China to report a large outbreak of SARS-CoV-2 in February/March 2020 and also acted as a major source for its spread in several countries in the Middle East and elsewhere [1]. It was also one of the first countries to experience a second wave of infection after the relaxation of non-pharmaceutical interventions (NPIs) in July 2020 [2,3]. Since then, the country experienced three additional waves, two of which were with the Alpha and Delta variants of SARS-CoV-2, in May and September 2021. Despite a recent increase in the number of daily administered vaccine doses, Iran had a very slow start on its immunisation programme with only ~3% of the population being fully vaccinated by the end of July 2021, during the surge in cases with the Delta variant [4].
Assessing the true burden of COVID-19 requires a comprehensive surveillance system to monitor outbreaks and a large capacity for diagnostic testing. However, this is particularly challenging for many low-and middle-income countries due to limited diagnostic capacities and access to healthcare [5,6]. An alternative approach to measuring the burden of COVID-19 is to monitor changes in all-cause mortality trends with respect to baseline levels based on historic trends (i.e., measuring excess mortality). While excess mortality during the current pandemic may not exactly match the true number of COVID-19-related deaths due to various reasons such as disruptions in treatment for other fatal diseases and socio-economic disparities in health care in different regions of a country, several works have shown that a significant portion of excess deaths in many countries are directly attributable to COVID-19 [7][8][9][10][11][12]. This enables us to use excess mortality as a proxy for estimating the number of infections and fatalities from COVID-19 and correct for underreporting of cases and deaths that may arise due to limited testing from suspected cases and uncertainty in the number of fatalities attributable to COVID-19 [13,14]. Further under-estimation of the true burden of COVID-19 is possible due to under-registration of all-cause mortality data from civil registration and vital statistics systems, particularly in low-and middle-income countries [15].
Since Iran's Ministry of Health and Medical Education (MoHME) stopped releasing province-level data on the number of confirmed COVID-19 cases and deaths from 2020-03-22, the transmission dynamics across the country has remained largely unknown until now. Using the newly updated province-level age-stratified weekly all-cause mortality data from the National Organization for Civil Registration (NOCR) of Iran, we provide the first detailed reconstruction of the Iranian epidemic with the aim to estimate the COVID-19 infection and fatality rates across the country.

Results and Discussion
We obtain age-stratified weekly all-cause mortality data from NOCR and apply a statistical model to calculate the weekly excess mortality for all 31 provinces and across 17 age-groups during the Iranian epidemic (see Methods section). Figure 1A shows 5 distinct peaks in excess mortality with a temporal trend which is strongly associated with the nationwide reported COVID-19 deaths over time. There were significant levels of excess mortality from the first week of February 2020 suggesting that the Iranian epidemic likely started at least a month before that. This is in agreement with previous phylogenetic and epidemiological studies suggesting the epidemic started in late December/early January [2,16]. Furthermore, during the first two waves of the Iranian epidemic, the excess mortality was roughly 2.5 times higher than reported COVID-19 fatalities. However, this dropped to 2.1 at the later stages suggesting that the testing capacity of the country to record infections and deaths gradually improved over time, but not to the extent to fully capture the majority of COVID-19-related deaths. This is also in agreement with previous studies showing elevated levels of under-reporting of COVID-19 fatalities in Iran, particularly in the early stages of the epidemic, using quarterly data from winter 2019/20 to summer 2020 [2,17]. We also find that the cumulative nationwide excess deaths by 2021-09-17 is 240,870 (95% CI: 210,560 -271,790) which is more than twice the 116,000 reported COVID-19 deaths at the time. Figure 1B shows the intensity of excess mortality as quantified by the number of excess deaths per 100,000 persons in each province over time. During the first wave in spring 2020, only a few provinces showed significant levels of excess mortality with Gilan, Qom, Mazandaran, and Golestan among the hardest-hit provinces. This is likely because these provinces usually attract a . CC-BY 4.0 International license It is made available under a who has granted medRxiv a license to display the preprint in perpetuity.
is the author/funder, (which was not certified by peer review) The copyright holder for this preprint this version posted October 7, 2021. ; https://doi.org/10.1101/2021.10.04.21264540 doi: medRxiv preprint large number of tourists and pilgrims around the New Year's holiday in February/March which likely led to large superspreading events and uncontrolled outbreaks due to limited NPIs before the first national lockdown on 2020-03-05 [2]. As the epidemic progressed, the timing of the peak excess mortality between different provinces were more simultaneous ( Figure 1B). Most notably, during the third wave in October/November 2020 and the fifth wave in August/September 2021 with the Delta variant, almost all provinces experienced significant number of fatalities around the same time. This indicates that the effective reproduction number, R t , across the country was so high that local measures were not enough to control the surge in cases and deaths.
We also carry out a similar analysis to estimate the temporal pattern of COVID-19 fatalities for different age-groups (Figure 2A). We find that majority of excess deaths is concentrated in the older age categories (>55) and that the third and fifth epidemic waves had the largest impact on younger age groups. Figure 2B shows the per capita excess mortality at a week with the highest excess deaths during each of the five waves across all age-groups. The per capita fatality rates were highest during the fifth wave for all age groups compared to previous waves with the exception of people aged 75 or above which is likely due to higher vaccination coverage in those age-groups. While the exact number of vaccinated individuals per age-group in Iran is not known, given that the immunisation programme for individuals aged >75 started in May 2021 (first phase priority groups for vaccination in Iran are frontline healthcare workers as well as those aged above 80; see [4] for more information), and that Iran fully vaccinated roughly 5 times the population size of individuals aged >75 by 2021-09-07, it is likely that the drop in fatality of people aged >75 during the fifth wave is due to their high vaccination coverage. Figure 2B also shows a clear loglinear relationship between per capita excess mortality and age, a signature characteristic of COVID-19 fatalities [18], indicating that the vast majority of excess deaths in all age-groups are directly related to COVID-19. Figure S1 shows a wide range of variation in the temporal pattern of per capita excess mortality per province from the week starting on 2019-12-28 to 2021-09-17. We find that this pattern is also strongly associated with the weekly confirmed and suspected COVID-19 hospital admissions . CC-BY 4.0 International license It is made available under a who has granted medRxiv a license to display the preprint in perpetuity.
is the author/funder, (which was not certified by peer review) The copyright holder for this preprint this version posted October 7, 2021. ; https://doi.org/10.1101/2021.10.04.21264540 doi: medRxiv preprint from 2020-02-20 to 2021-09-12 ( Figure S2). By comparing the ratio of confirmed daily COVID-19 hospital admissions to deaths at the national level, we find that there are roughly 7 confirmed hospital admissions for every confirmed death ( Figure 3A) which is in agreement with known COVID-19 fatality rates in hospitalised patients [19]. While there is wider variation at the province level, we find the same ratio for the number of weekly suspected hospital admissions to excess deaths ( Figure 3B) further indicating that the difference between excess mortality and confirmed COVID-19 deaths reflect the same gap between suspected and confirmed hospital admissions due to under-reporting of COVID-19 numbers. Figure S1 and Figure S2 also show that while some provinces such as Ardabil, Alborz, Semnan, Golestan, Hamedan, and Mazandaran experienced five distinct epidemic peaks, others like Bushehr, Qazvin, Gilan, Kohgiluyeh and Boyer Ahmad, and Hormozgan experienced sustained periods with elevated mortality rates throughout the epidemic. Furthermore, even though the excess mortality during the fifth wave with the Delta variant is on a decline in most provinces, some still exhibit sustained levels of excess mortality with Sistan and Baluchistan province showing a 3-fold increase in excess mortality compared to previous waves. We also find that majority of excess deaths occurred in provinces with greater population size such as Tehran, Razavi Khorasan, Isfahan, East Azerbaijan, and Khuzestan provinces ( Table 1).
Assuming that lack of testing is the primary reason behind the discrepancy between excess mortality and official COVID-19 deaths [7,20,21], we attribute all deaths above baseline during the pandemic to COVID-19 (hereafter called the 'standard approach'). This enables us to estimate a population fatality rate (PFR), infection fatality rate (IFR), and attack rate per age group per province ( Table 1). We find that Qazvin, Qom, and East Azerbaijan had the highest per capita mortality rate with an estimated PFR of 0.403% (95%CI: 0.280-0.526%), 0.399% (95%CI: 0.293-0.505%), and 0.391% (95%CI: 0.311-0.473%), respectively. Furthermore, we find that while the estimated IFR in most provinces during the early stages of the epidemic is close to the populationweighted estimate which is based on the province-specific demographic distributions by sex and age, the estimated IFR increases by a factor of 2-3 during the later stages (Table S1). In particular, . CC-BY 4.0 International license It is made available under a who has granted medRxiv a license to display the preprint in perpetuity.
is the author/funder, (which was not certified by peer review) The copyright holder for this preprint this version posted October 7, 2021. ; we find that Tehran and Isfahan, two of largest and most densely crowded provinces of Iran, had the highest IFR with 1.300% (95%CI: 0.883-3.230%) and 0.934% (95%CI: 0.598-3.420%), respectively, which are comparable to IFR estimates in New York city and some regions of Brazil [22,23]. This suggests population-weighted IFR estimates can be extremely biased as they may not correctly account for heterogeneities in the attack rates for different age-groups and the impact of socioeconomic disparities such as disadvantages in health system capacity which are known to influence IFR estimates, especially for lower-income areas [8,24].
Using the estimated IFRs, we calculate the overall attack rate on 2021-09-17 for each province (Figure 4). Our results are striking as they show very high attack rates across the country with ≥100% attack rates in 11 provinces. In most regions, the majority of infections took place during the second and third wave ( Table 2) The wide confidence interval on the inferred attack rates in Table 1 represent the uncertainty in the estimated excess mortality, particularly in younger age-groups. For these age-groups, we generally observe much fewer COVID-19 deaths. As a result, given a 500-1,000-fold difference between the age-stratified IFR in <30 and >80 age-groups, missing one COVID-19-related death in the younger age-groups is equivalent to missing nearly a thousand more infections in the elderly population [29]. This effect is more pronounced during the first wave where fewer deaths took . CC-BY 4.0 International license It is made available under a who has granted medRxiv a license to display the preprint in perpetuity.
is the author/funder, (which was not certified by peer review) The copyright holder for this preprint this version posted October 7, 2021. ; https://doi.org/10.1101/2021.10.04.21264540 doi: medRxiv preprint place in the younger age-groups, hence further pushing the upper bound for estimated IFRs upwards (see Table S1).
We also estimate attack rates using a more 'conservative approach' whereby, rather than considering only positive excess deaths to be directly attributable to COVID-19, we add all the positive and 'negative' excess deaths per age-group over time to account for potential weekly variations in excess mortality produced by periods of mortality deficit (also known as harvesting).
While Table S2 shows that the estimated attack rates in most provinces using this method are very low (with the exception of Qazvin, Ilam, and Kurdistan which still have ≥100% attack rates), they correspond to extremely high IFRs which are incompatible with COVID-19 IFR estimates across the world [30]. By comparing the conservative and standard approaches at the national level ( Figure S3), we find that while the estimated PFR for individuals aged 40 and above is similar for both approaches, the conservative approach finds fewer deaths in the younger age-groups.
This makes the inferred IFRs much higher for the conservative approach. Furthermore, the discrepancy between the estimated attack rates using the two approaches is more pronounced in provinces with higher levels of variation in baseline mortality such as Sistan and Baluchistan, Kohgiluyeh and Boyer Ahmad, South Khorasan, and Hormozgan ( Figure S4). In particular, we find that for several provinces the conservative approach may predict zero deaths in <20 age-groups (data not shown) which may further bias the IFR estimates upwards.
Prior to the emergence of the variants of concern, the basic reproduction number of the SARS- and comparative studies have shown that re-infection with SARS-CoV-2 is possible after 3-6 months, and that older age-groups have a much lower protection against reinfection [35,36].
Previous studies on human coronaviruses and SARS-CoV have also shown that protective immunity typically starts declining after 5-8 months post infection [37][38][39]. Our findings of high attack rates in several provinces show that herd immunity through natural infection has not been achieved in the population even after nearly 20 months since the start of the Iranian epidemic. This is likely to due to substantial reduction in protection against repeat infection over time either due to waning immunity, increased chance of re-infection with variants of concern, or a combination of both. We note that while systematic biases such as identifying the baseline level of all-cause mortality and overestimation of deaths attributable to COVID-19, particularly in the younger age-groups, can contribute to increased attack rates in some provinces, other factors such as heterogeneities in immune protection, limited access to health care in municipalities with low socioeconomic status, and uneven adoption of NPIs in different regions can also contribute to varied attack rates across the country.

Methods
We collect the weekly time-series data on all-cause mortality per province per age group from NOCR. We use the data from the beginning of year 1394 to the end of summer 1398 in Solar Hijri calendar (SH) (from 2015-03-15 to 2019-09-22 in Common Era) to calculate background mortality.
We exclude mortality data during autumn 2019 (from 2019-09-23 to 2019-12-21) from the calculation of background mortality as we have previously reported elevated mortality rates across several provinces, unrelated to the COVID-19 pandemic, which can bias our estimates for the expected mortality [17]. We calculate the expected mortality using a linear regression model previously developed to track excess mortality across more than 100 countries and territories around the world [7]. To estimate the IFR, we first calculate the age distribution of COVID-19 fatalities for each province based on the estimated excess mortality, D i , for a given age group, i, during the pandemic period. By weighting the known age-stratified IFR estimates for COVID-19 [29], IFR i , by the fraction of the population in each age-group, ∑ , we can account for the non-. CC-BY 4.0 International license It is made available under a who has granted medRxiv a license to display the preprint in perpetuity.
is the author/funder, (which was not certified by peer review) The copyright holder for this preprint this version posted October 7, 2021. ; https://doi.org/10.1101/2021.10.04.21264540 doi: medRxiv preprint uniform attack rate per age group and find an average province-level IFR, 〈 〉, which we then use to estimate the attack rates. In other words, for a given province, the estimated IFR is given

Data Availability
All the data used for the analysis are available online on our GitHub repository (github.com/mg878/Iran_WeeklyMortality). For regular updates on excess mortality in Iran, you can visit: https://github.com/akarlinsky/world_mortality. . CC-BY 4.0 International license It is made available under a who has granted medRxiv a license to display the preprint in perpetuity.
is the author/funder, (which was not certified by peer review) The copyright holder for this preprint this version posted October 7, 2021.  . CC-BY 4.0 International license It is made available under a who has granted medRxiv a license to display the preprint in perpetuity.
is the author/funder, (which was not certified by peer review) The copyright holder for this preprint this version posted October 7, 2021. ; https://doi.org/10.1101/2021.10.04.21264540 doi: medRxiv preprint Tables   Table 1: The overall excess mortality, PFR, estimated IFR (accounting for the age-specific excess mortality), population-weighted IFR (based on age distribution of each province), and percentage of the population ever exposed to SARS-CoV-2 in each province as of 2021-09-17.
* Based on the estimates from Ghafari et al. [17]. ** Year 1394 Solar Hijri is excluded from the calculation of background mortality for this province due to a record-high excess mortality [17].
. CC-BY 4.0 International license It is made available under a who has granted medRxiv a license to display the preprint in perpetuity.
is the author/funder, (which was not certified by peer review) The copyright holder for this preprint this version posted October 7, 2021. ; Table 2: The overall attack rate (in percentage) by the end of May 2020 (wave 1), August 2020 (wave 2), January 2021 (wave 3), June 2021 (wave 4), and the week ending on 17 September 2021 (wave 5).
. CC-BY 4.0 International license It is made available under a who has granted medRxiv a license to display the preprint in perpetuity.
is the author/funder, (which was not certified by peer review) The copyright holder for this preprint this version posted October 7, 2021.     . CC-BY 4.0 International license It is made available under a who has granted medRxiv a license to display the preprint in perpetuity.
is the author/funder, (which was not certified by peer review) The copyright holder for this preprint this version posted October 7, 2021. ; https://doi.org/10.1101/2021.10.04.21264540 doi: medRxiv preprint Table S2: The overall excess mortality, PFR, estimated IFR (accounting for the age-specific excess mortality), population-weighted IFR (based on age distribution of each province), and percentage of the population ever exposed to SARS-CoV-2 in each province as of 2021-09-17 using the conservative approach to calculating attack rates (i.e., counting deaths above and below baseline mortality).
. CC-BY 4.0 International license It is made available under a who has granted medRxiv a license to display the preprint in perpetuity.
is the author/funder, (which was not certified by peer review) The copyright holder for this preprint this version posted October 7, 2021. ; https://doi.org/10.1101/2021.10.04.21264540 doi: medRxiv preprint Figure S1: One-month average of weekly excess mortality per 100,000 persons per province over time. Shaded area shows the 95% confidence intervals. Figure S2: One-month average of weekly excess mortality (blue), confirmed (orange) and suspected (blue) COVID-19 weekly hospital admissions per 100,000 persons per province over time. Figure S3: Nationwide % PFR per age-group using the standard (blue) and conservative (black) approaches to calculating the attack rates. Figure S4: Weekly excess mortality over time using the standard (blue) and conservative (black) approaches to calculating the attack rates.