Forecasting the impact of the first wave of the COVID-19 pandemic on hospital demand and deaths for the USA and European Economic Area countries

Summary

Background: Hospitals need to plan for the surge in demand in each state or region in the United States and the European Economic Area (EEA) due to the COVID-19 pandemic. Planners need forecasts of the most likely trajectory in the coming weeks and will want to plan for the higher values in the range of those forecasts. To date, forecasts of what is most likely to occur in the weeks ahead are not available for states in the USA or for all countries in the EEA.

Methods: This study used data on confirmed COVID-19 deaths by day from local and national government websites and WHO. Data on hospital capacity and utilisation and observed COVID-19 utilisation data from select locations were obtained from publicly available sources and direct contributions of data from select local governments. We developed a mixed effects non-linear regression framework to estimate the trajectory of the cumulative and daily death rate as a function of the implementation of social distancing measures, supported by additional evidence from mobile phone data. An extended mixture model was used in data-rich settings to capture asymmetric daily death patterns. Health service needs were forecast using a micro-simulation model that estimates hospital admissions, ICU admissions, length of stay, and ventilator need using available data on clinical practices in COVID-19 patients. We assumed that those jurisdictions that have not implemented school closures, non-essential business closures, and stay-at-home orders will do so within twenty-one days.

Findings: Compared to licensed capacity and average annual occupancy rates, excess demand in the USA from COVID-19 at the estimated peak of the epidemic (the end of the second week of April) is predicted to be 9,079 (95% UI 253-61,937) total beds and 9,356 (3,526-29,714) ICU beds. At the peak of the epidemic, ventilator use is predicted to be 16,545 (8,083-41,991). The corresponding numbers for EEA countries are 120,080 (119,183-121,107), 32,291 (32,157-32,425) and 28,973 (28,868-29,085) at a peak of April 6. The date of peak daily deaths varies from March 30 through May 12 by state in the USA and March 27 through May 4 by country in the EEA. We estimate that through the end of July, there will be 60,308 (34,063-140,381) deaths from COVID-19 in the USA and 143,088 (101,131-253,163) deaths in the EEA. Deaths from COVID-19 are estimated to drop below 0.3 per million between May 4 and June 29 by state in the USA and between May 4 and July 13 by country in the EEA. Timing of the peak need for hospital resource requirements varies considerably across states in the USA and across regions of Europe.

Interpretation: In addition to a large number of deaths from COVID-19, the epidemic will place a load on health system resources well beyond the current capacity of hospitals in the USA and EEA to manage, especially for ICU care and ventilator use. These estimates can help inform the development and implementation of strategies to mitigate this gap, including reducing non-COVID-19 demand for services and temporarily increasing system capacity. The estimated excess demand on hospital systems is predicated on the enactment of social distancing measures within three weeks in all locations that have not done so already and maintenance of these measures throughout the epidemic, emphasising the importance of implementing, enforcing, and maintaining these measures to mitigate hospital system overload and prevent deaths.

This preprint (which was not certified by peer review) is made available under a CC-BY 4.0 International license. The copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
This version posted April 26, 2020. https://doi.org/10.1101/2020

[...] Netherlands, Germany, Belgium, Canada, and Switzerland. COVID-19 is not only causing mortality but is also putting considerable stress on health systems, with large case numbers and many patients needing critical care including mechanical ventilation. Estimates of the potential magnitude of COVID-19 patient volume, particularly at the local peak of the epidemic, are urgently needed for USA and European hospitals still early in the epidemic to effectively manage the rising case load and provide the highest quality of care possible.

COVID-19 scenarios and forecasts have largely been based on mathematical compartmental models that capture the probability of moving between susceptible, exposed, and infected states, and then to a recovered state or death (SEIR models). Many SEIR or SIR models have been published or posted online.[3-20] In general, these models assume random mixing between all individuals in a given population. While results of these models are sensitive to starting assumptions and thus differ between models considerably, they generally suggest that given current estimates of the basic reproductive rate (the number of cases caused by each case in a susceptible population), 25% to 90% of the population could eventually become infected unless mitigation measures are put in place and maintained.[6,20] Based on reported case-fatality rates, these projections imply that there would be millions of deaths in the USA and Europe due to COVID-19. Individual behavioural responses and government-mandated social distancing (school closures, non-essential service closures, and shelter-in-place orders), however, can dramatically influence the course of the epidemic. As of April 14, 2020, for Wuhan City in China, and also for at least 12 additional regions in Italy (Liguria, Lombardia, Marche, Lazio, Campania), Spain (Community of Madrid, Castile and Leon, Catalonia, Navarre), and the USA (King County, Snohomish County), strict social distancing has led to the peak of the first wave of the epidemic, implying that the effective reproduction number (R effective) has dropped below unity in these settings. Planning tools based on SEIR models provide high-level information across populations. Few of these planning models have forecasted peaks in deaths or cases and subsequent declines. Using reported case numbers and models based on those for health service planning is also not ideal because of widely varying COVID-19 testing rates and strategies. For example, countries such as Germany, Iceland, and South Korea have undertaken widespread testing, while in the USA and elsewhere, limited test availability has led to largely restricting testing, particularly early in the epidemic, to those with more severe disease or those who are at risk of serious complications.

An alternative strategy is to focus on modelling the empirically observed COVID-19 population death rate curves, which directly reflect both the transmission of the virus and the infection-fatality rates in each specific community. Deaths are likely more accurately reported than cases in settings with limited testing capacity, where tests are usually prioritised for the more severely ill patients. Hospital service need is likely to be highly correlated with deaths, given predictable disease progression probabilities by age for severe cases. In this study, we use statistical modelling to implement this approach and derive state-specific and country-specific forecasts with uncertainty for deaths and for health service resource needs and compare these to available resources in the USA and countries in the European Economic Area (EEA). This model is regularly updated to incorporate new data for the location of interest as well as data from other locations.

The modelling approach in this study is divided into four components: (i) identification and processing of COVID-19 data; (ii) statistical model estimation for population death rates as a function of time since the death rate exceeds a threshold in a location; (iii) predicting time to exceed a given population death threshold in locations early in the pandemic; and (iv) modelling health service utilisation as a function of deaths. Additional information on the determination of hospital resource utilisation and capacity is provided in Appendix A; details on curve fitting methods, quantification of uncertainty, and a full specification of the statistical model are available in Appendix B. This study complies with the Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER) statement.[21]

Data identification and processing

Local government, national government, and WHO websites, and third-party aggregators[22-26] were used to identify data on confirmed COVID-19 deaths by day of death at the first administrative level (state or province, hereafter "admin 1"). Data on licensed bed and ICU capacity and average annual utilisation by location were obtained from a variety of sources for most countries to estimate baseline capacities; observed COVID-19 utilisation data were obtained for a range of countries and USA states providing information on inpatient and ICU use or were imputed from available resources (Appendix A). Other parameters were sourced from the scientific literature and an analysis of available patient-level data. Age-specific data on the relative population death rate by age are available from China,[28] Italy,[29] South Korea,[30] the USA,[31,32] Netherlands,[33] Sweden,[34] and Germany[23] and show a strong relationship with age (Figure 1).

Using the average observed relationship between the population death rate and age, data from different locations can be standardised to a common age structure using indirect standardisation (Appendix B). For the estimation of statistical models for the population death rate, only admin 1 locations with an observed death rate greater than 0.31 per million (exp(-15)) were used. This threshold was selected by testing which threshold minimised the variance of the slope of the death rate across locations in subsequent days.

[...] where the function Ψ is the Gaussian error function (written explicitly above), p controls the maximum cumulative death rate at each location, t is the time since death rate exceeded exp(-15), β (beta) is a location-specific inflection point (time at which rate of increase of the daily death rate is maximum), and α (alpha) is a location-specific growth parameter. Other sigmoidal functional forms (alternatives to Ψ) were considered but did not fit the data as well. Data were fit to the log of the death rate in the available data, using an optimisation framework described in Appendix B. For data-rich cases, we also developed a linear curve-fitting extension, where after a Gaussian curve in daily death is obtained, we fit the data to a weighted combination (with constraints on weights) of such curves propagated forward and backward in time. The resulting models can capture more complex behaviour in the data.
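As a minimal sketch of the curve described in the preceding paragraph, the code below assumes the parametrisation D(t) = (p/2)(1 + erf(α(t − β))), which is one form consistent with the stated roles of p, α, and β; the exact expression is not reproduced in this text, so this form is an assumption. The function names and the parameter values in the usage note are hypothetical and chosen only for illustration.

```python
import math

# Inclusion threshold quoted in the text: exp(-15), roughly 0.31 deaths per million.
THRESHOLD = math.exp(-15)


def cumulative_death_rate(t, alpha, beta, p):
    """Assumed form: D(t) = (p / 2) * (1 + erf(alpha * (t - beta))).

    t     -- days since the death rate exceeded THRESHOLD
    alpha -- location-specific growth parameter
    beta  -- location-specific inflection point (days)
    p     -- maximum cumulative death rate reached as t grows large
    """
    return 0.5 * p * (1.0 + math.erf(alpha * (t - beta)))


def daily_death_rate(t, alpha, beta, p):
    """Time derivative of the assumed curve: a Gaussian bump centred on beta."""
    u = alpha * (t - beta)
    return p * alpha / math.sqrt(math.pi) * math.exp(-u * u)
```

With hypothetical values alpha = 0.1, beta = 30, and p = 1e-4, `cumulative_death_rate(30, 0.1, 30, 1e-4)` returns half of p, and the daily rate is highest at t = beta. Under this assumed form the daily death curve is symmetric about its peak, which is why a weighted combination of shifted curves is needed to represent asymmetric or prolonged peaks.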

An ensemble of three models was used to produce the estimates. In all models, we parametrised the time-axis shift parameter beta to depend on a covariate based on time from when the initial ln(death rate) exceeds exp(-15) [...] Catalonia, Navarre), and the USA (King County, Snohomish County), we fit mixed effects models to get the mean and variance of the relationship between the social distancing covariates and the peak time, and used this information to build priors for location-specific estimates.

We use hospitalisation data to generate additional short-term predicted deaths (pseudo-data). On average, the time between hospitalisation and death is 8 days. Using location-specific hospitalisation data from locations with more than 10 deaths, we estimate the ratio of cumulative deaths to cumulative hospitalisations up to 8 days in the past. We use this ratio to generate pseudo-data for 8 days, and incorporate this pseudo-data into the CurveFit model. Details are given in Section 11 of Appendix B.

For locations with fewer than 18 days of data, we use the following analysis. For each type of model (based on definition of the covariate), we considered both "short-range" and "long-range" variants, to explain existing data and forecast long-term trends, respectively. In the former case, covariate multipliers could deviate from those estimated using peaked locations, while in the latter, the joint model fit from peaked locations had a larger impact on the final covariate multiplier. The two remaining parameters (not modelled using covariates) were allowed to vary among locations to fit location-specific data. Uncertainty for every model was obtained using the predictive validity framework that analyses errors in predicting out-of-sample observations.
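The hospitalisation-based pseudo-data step described above could be sketched as follows. This is one possible reading, not the procedure of Appendix B: it assumes the ratio is taken between the latest cumulative deaths and cumulative hospitalisations 8 days earlier, and that each projected day applies that ratio to the hospitalisations observed 8 days before it. The function name `pseudo_deaths` and its arguments are hypothetical.

```python
LAG_DAYS = 8  # average time from hospitalisation to death, as stated in the text


def pseudo_deaths(cum_hosp, cum_deaths, lag=LAG_DAYS, min_deaths=10):
    """Project `lag` further days of cumulative deaths from hospitalisations.

    cum_hosp and cum_deaths are equal-length daily cumulative series for one
    location. Returns an empty list when the series is too short or has too
    few deaths to estimate a ratio.
    """
    if len(cum_hosp) != len(cum_deaths):
        raise ValueError("series must have the same length")
    n = len(cum_deaths)
    if n <= lag or cum_deaths[-1] <= min_deaths:
        return []
    lagged_hosp = cum_hosp[n - 1 - lag]
    if lagged_hosp <= 0:
        return []
    # Assumed ratio: deaths to date over hospitalisations `lag` days ago.
    ratio = cum_deaths[-1] / lagged_hosp
    # Deaths on future day k are assumed to follow hospitalisations on day k - lag.
    return [ratio * cum_hosp[n - 1 - lag + k] for k in range(1, lag + 1)]
```

The projected values would then be appended to the observed deaths before curve fitting, giving the fit some information about the next 8 days that the death series alone does not yet contain.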
Using these methods, we obtain model realisations using draws, for both short- and long-term models across the forecast horizon. We then obtain forecasts that linearly interpolate between short-term and long-term models, with the next few days closely following short-term models and long-term forecasts following long-term models. Finally, we ensemble these draws across the model types (based on the definition of the social distancing covariate).

For locations with 18 or more days of data, we first fit a long-term model, borrowing strength from peaked locations and obtaining location-specific representative daily deaths Gaussian curves. We then fit a linear combination of 13 of the inferred Gaussian curves from the long-term model, placed two days apart (12 days back from the inferred peak to 12 days forward of the inferred peak). We then ensemble across draws for different model types. See Appendix B (section 11) for full details.

The dataset age-standardised to the age-structure of California is shown in Figure 2.

Time to threshold death rate

All states except Wyoming have deaths greater than 0.31 per million (exp(-15)) and more than 2 deaths and were included in the model estimation along with data on 66 other admin 1 locations.

For other USA states or locations in the EEA, we estimated the expected time from the current case count to reach the threshold level for the population death rate model. Using the observed distribution of the time from each level of case count to the threshold death rate for all admin 1 locations with data, we estimated this distribution. We used the mean and standard deviation of days from a given case count to the future threshold death rate to develop the probability distribution for the day each state will cross over the threshold death rate, and then we applied the death rate epidemic curve after crossing the threshold.

Hospital service utilisation microsimulation model

From the projected death rates, we estimated hospital service utilisation using an individual-level microsimulation model; additional details are provided in Appendix A. We simulated deaths by age using the average age pattern (Figure 1). For each simulated death, we estimated the date of admission using the median length of stay for deaths from available data (six days). Simulated individuals requiring admission who were discharged alive were generated using the location-specific ratios of admissions to deaths; where location-specific ratios were not available we used the EEA pooled estimate for other EEA countries and the USA pooled estimate for other USA states. An age pattern of the ratio was based on available data (Appendix A). The age-specific fraction of admissions requiring ICU care was based on data from the USA. The fraction of ICU admissions requiring invasive ventilation was estimated as 85%. To determine daily bed and ICU occupancy and ventilator use, we applied median lengths of stay of eight days for those not requiring ICU care and discharged alive and 20 days for those admissions with ICU care, with 13 of those days in the ICU.[39]

Role of the funding source

The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report. The authors had access to the data in the study and the final responsibility to submit the paper.
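To make the hospital utilisation step of the Methods more concrete, the sketch below converts projected daily deaths into expected bed, ICU, and ventilator occupancy. It is a deterministic expected-value simplification, not the individual-level microsimulation of Appendix A. The lengths of stay and the 85% ventilation fraction come from the text; everything else is an assumption made for illustration: age is ignored, survivors are admitted on the same day as the patients who later die, ICU days start at admission, patients who die are counted in general beds only, and `admissions_per_death` and `icu_fraction` are hypothetical inputs.

```python
DEATH_LOS = 6         # days from admission to death (text)
WARD_LOS = 8          # stay for survivors without ICU care (text)
ICU_TOTAL_LOS = 20    # total stay for admissions with ICU care (text)
ICU_DAYS = 13         # days of that stay spent in the ICU (text)
VENT_FRACTION = 0.85  # share of ICU admissions needing invasive ventilation (text)


def expected_occupancy(daily_deaths, admissions_per_death, icu_fraction):
    """Return (beds, icu, ventilators): expected occupancy per day.

    daily_deaths         -- projected deaths per day
    admissions_per_death -- hypothetical ratio of admissions to deaths
    icu_fraction         -- hypothetical share of surviving admissions needing ICU
    """
    horizon = len(daily_deaths) + ICU_TOTAL_LOS
    beds = [0.0] * horizon
    icu = [0.0] * horizon
    for death_day, deaths in enumerate(daily_deaths):
        admit_day = death_day - DEATH_LOS  # days before day 0 are dropped
        survivors = deaths * (admissions_per_death - 1.0)
        icu_survivors = survivors * icu_fraction
        ward_survivors = survivors - icu_survivors
        for day in range(horizon):
            offset = day - admit_day
            if 0 <= offset < DEATH_LOS:
                beds[day] += deaths
            if 0 <= offset < WARD_LOS:
                beds[day] += ward_survivors
            if 0 <= offset < ICU_TOTAL_LOS:
                beds[day] += icu_survivors  # total beds include ICU patients
            if 0 <= offset < ICU_DAYS:
                icu[day] += icu_survivors
    ventilators = [VENT_FRACTION * x for x in icu]
    return beds, icu, ventilators
```

Because admissions precede deaths by several days and survivors stay longer than those who die, peak occupancy in this sketch does not coincide with the day of peak deaths.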

By aggregating forecasts across locations, we determined the overall trajectory of expected health-care needs in different categories and deaths, as shown in Figure 3 for the USA (panel A) and for EEA countries (panel B). These figures highlight the earlier beginning of the epidemic in EEA countries compared to the USA. The USA projected peak was reached on April 15 with almost 3,500 deaths daily. In the EEA the peak was on April 6 with more than 4,000 deaths daily but with a flatter peak, reflecting the considerable variability in the timing of the epidemic by country. [...] following suit by the beginning of April. Other countries such as the UK, Germany, and Sweden are at the peak or are approaching the peak. In the USA, states with earlier peaks include Washington, Nevada, Arizona, Montana and Florida. States at the peak or just approaching the peak include Texas, California and parts of New England. States in the middle of the country, including North Dakota, South Dakota, Iowa and Wyoming, are expected to peak later.

Figure 5 shows total excess demand for the USA (panel A) and EEA countries (panel B) overall. In the USA, peak excess demand for hospitalisation above usual capacity was estimated as 9,079 (95% UI 253-61,937); ICU bed excess demand was 9,356 (3,526-29,714). We estimated that EEA countries experienced a peak excess demand above usual capacity for total beds of 28,270 (0 to 126,788) at peak; the ICU bed shortfall was 16,090 (15,211). Excess demand is concentrated in particular countries and USA states as shown in Figure 6, which shows the percentage excess demand for ICU beds by location: in the USA (panel A), excess demand for ICU beds is concentrated in New York, New Jersey, Connecticut, Wyoming, Michigan, Rhode Island, and Massachusetts; in the EEA (panel B), ICU excess demand above usual capacity is particularly high in Sweden, Spain, Northern Ireland, Italy, France, and Belgium. We have not been able to estimate current ventilator capacity; however, the number of ventilators per person implied by the peak (Figure 3) also suggests potentially large gaps in availability of ventilators.

Figure 7 shows the expected cumulative death numbers with 95% uncertainty intervals for the USA (panel A) and EEA (panel B). In the USA, the average forecast suggests 60,308 deaths, but the range is large, from 34,063 to 140,381 deaths. The figure shows that uncertainty widens markedly as the peak of the epidemic approaches, given that the exact timing of the peak is uncertain. [...] Massachusetts, Wyoming, Louisiana, and Michigan.

Figure 9 shows the date by location by which projected daily deaths drop below 0.3 per million. As expected, there is a strong correlation between the timing of the peak daily death and when the daily death rate will drop below this threshold. In Europe, countries where this will happen later include the UK, Norway, Denmark, Sweden and the Netherlands. In the USA, states that will not cross this threshold until the end of May include South Dakota, North Dakota, Iowa, Oklahoma, Arkansas and Utah.

Results for each location are accessible through a visualisation tool at http://covid19.healthdata.org/projections. The estimates presented in this tool will be continually updated as new data are incorporated and ultimately will supersede the results in this paper. Summary information on cumulative deaths, the date of peak demand, the peak demand, peak excess demand, and aggregate demand are provided for each location in Table 1.
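The excess demand figures reported above compare projected demand with licensed capacity net of average annual occupancy. A minimal sketch of that comparison is given below; the exact calculation is specified in Appendix A, so the formula here (available capacity equals licensed capacity times one minus the average occupancy rate, with excess floored at zero) is an assumption, as are the function names and the numbers in the usage note.

```python
def excess_demand(peak_demand, licensed_capacity, avg_occupancy_rate):
    """Beds needed beyond what is usually free in one location.

    Assumes available capacity = licensed_capacity * (1 - avg_occupancy_rate).
    Spare capacity is reported as zero excess rather than a negative number.
    """
    available = licensed_capacity * (1.0 - avg_occupancy_rate)
    return max(0.0, peak_demand - available)


def total_excess_demand(locations):
    """Sum location-level excess; spare beds elsewhere do not offset a shortfall.

    locations -- iterable of (peak_demand, licensed_capacity, avg_occupancy_rate)
    """
    return sum(excess_demand(d, c, r) for d, c, r in locations)
```

For example, a hypothetical location with 1,000 licensed beds, 60% average occupancy, and a projected peak need of 1,200 COVID-19 beds has 400 beds available and an excess demand of 800. Summing location-level excess, rather than netting demand against capacity across all locations, reflects the fact that patients and beds are not freely transferable between states or countries.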

This study has generated estimates of predicted health service utilisation and deaths due to COVID-19 by day through the end of July for all USA states and EEA countries, assuming that social distancing efforts will continue until deaths reach a very low level. The analysis shows large gaps between need for hospital services and usual capacity, especially for inpatient and ICU beds. A similar or perhaps even greater gap for ventilators is also likely, but detailed state or country data on ventilator capacity are not available to directly estimate that gap. Uncertainty in the time course of the epidemic, its duration, and the peak of utilisation and deaths is large, particularly for locations that are early in the epidemic and where there are few deaths. Given this, it is critical to update these projections as the pandemic progresses and new data are collected. Uncertainty will also be reduced as we gain more knowledge about the epidemic peak and subsequent decline in daily deaths across more than 13 locations. A critical determinant of the size of the peak is when aggressive measures for social distancing are implemented in each state, region, or country and for how long they are maintained. Delays in implementing government-mandated social distancing, and the relaxing of such policies, will have an important effect on the resource gaps that health systems will be required to manage.

Our estimates of excess demand show that hospital systems have already faced or will face difficult choices to continue providing high-quality care to their patients in need. This model was first developed for use by the UW Medicine system in Washington state, and the practical experience of that system provides insight into how it has been useful for planning purposes. From the perspective of planning for the UW Medicine system, these projections immediately made apparent the need to rapidly build available capacity. Strategies to do so included suspending elective and non-urgent surgeries and procedures, while supporting surge planning efforts and reconfiguration of medical/surgical and ICU beds across the system. These targets also supported a proactive discussion regarding the potential shift from current standards of care to crisis standards of care, with the goal to do the most good for the greatest number in the setting of limited resources.

There are a variety of options available to deal with the situation, some of which have already been implemented or are being implemented. One option is to reduce non-COVID-19 patient use. In the USA and in many EEA countries, local, state, or national governments have cancelled elective procedures[40-45] and many, but not all, hospitals have complied. This decision has significant financial implications for USA health systems, however, as elective procedures are a major source of revenue.[46] Also, aggressive social distancing policies not only reduce the transmission of COVID-19 but will likely have the added benefit of reducing health-care utilisation due to other causes such as injuries.[47] Reducing non-COVID-19 demand alone will not be sufficient, and strategies to increase capacity are also clearly needed. This includes setting up additional beds by repurposing unused operating rooms, pre- and post-recovery rooms, procedural areas, medical and nursing staff quarters, and hallways.

Currently, one of the largest constraints on effective care may be the lack of ventilators. One supplement to ventilator capacity is using anesthesia machines freed up by deferring or cancelling elective surgeries. Other options go beyond the capacity or control of specific hospitals. The use of mobile military resources has the potential to address some capacity limitations, particularly in the USA given the differently timed epidemics across states. Other [...]

[...] continuing to revise the model as new data are available, providing an updated forecast for health service providers, governments, and the public. In some regions that have peaked, such as regions in Italy like Liguria, or New York, the duration of the peak is much longer than in other places such as Madrid. The mixture model we use accommodates this longer peak but it remains unclear why some communities have the prolonged peak and others do not. The prolonged peak leads to substantially increased total mortality. There is also marked variation across locations in how steeply the epidemic curve rises, captured by the alpha parameter in our model. Understanding why some locations have an epidemic like New York and others like Washington [...]

Table 1. Summary information on deaths, peak demand, peak excess demand, and aggregate demand, by location