Variation among states in rate of coronavirus spread

The coronavirus disease COVID-19 has been spreading rapidly across the USA since early March, but at a decreasing rate, where the rate r is defined as the exponential rate of increase. I modeled the way the rate of increase y = ln(e^r - 1) has declined through time in each of the 51 states (including DC), with the goal of determining how quickly the rate has declined, whether the decline has changed, and whether states differ. A piecewise linear regression with a single break point was used. This model can identify whether there was a change in the rate of decline, when the change happened, and which states have shown the greatest improvement in reducing the spread of COVID-19. The piecewise model identified a significant breakpoint on 24 Mar for all states combined, and all states had nearly the same breakpoint. Prior to 24 Mar, the average change in y was -0.013 per day, meaning a reduction in the rate of spread from 23.5% per day to 19.5% per day; after 24 Mar, the average change in y was -0.070 per day, a reduction from 19.5% per day to 7.5% per day. Prior to 24 Mar there was no significant variation among states in the decline in y, but after 24 Mar there was substantial variation. Montana, Idaho, and Vermont showed the greatest improvement, while Nebraska, South Dakota, and Iowa showed the least. The improvement, as measured by the reduction after 24 Mar, correlated with neither case density nor state population. The next question is whether it correlates with differences among states in the health measures taken to combat the spread.


The number of COVID-19 infections has increased steadily since early March in every state in the USA. The density of infections (cases per capita) varies substantially among states, but of more pertinent interest is the rate at which the number of infections increases. Various public health measures have been taken to slow that increase, and eventually it will be necessary to assess how well different measures worked. Such assessments have been done to test for the effect of [...]

Daily counts of the cumulative total of COVID-19 cases per state were collected from weather.com (weather.com 2020). There is a stable URL for each state that I could fetch with curl (the Unix command for capturing web text) using an automated script. Each day's web presentation was complete, including daily counts back to mid-February. A URL for each of the 51 states (including DC) had to be copied and saved, but once stored, capturing all states' information was a fully automated process. The text came as a long HTML document with JavaScript data arrays giving the numbers buried within. I wrote a C++ program to extract those arrays and move them into tables in the R programming language. Many individual records were checked to confirm the data were captured correctly. Counts were cross-checked against data from a New York Times GitHub site (NY Times 2020) and were essentially (but not exactly) identical. Analyses were done on case records through 21 Apr 2020, [...] have used the number of active cases, also subtracting all recoveries, but that information was not available.
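The extraction step described above, pulling numeric JavaScript arrays out of captured HTML, can be sketched as follows. This is only an illustration in Python, not the C++ program used for the analysis. The page layout, the variable name `cases`, and the function `extract_js_arrays` are assumptions, since the actual structure of the weather.com pages is not given here.

```python
import json
import re


def extract_js_arrays(html):
    """Return {name: list} for simple arrays assigned in inline JavaScript.

    Assumes (hypothetically) that the data appear as `var name = [ ... ];`
    with array contents that are also valid JSON.
    """
    pattern = re.compile(r"(?:var|let|const)\s+(\w+)\s*=\s*(\[[^\]]*\])")
    arrays = {}
    for name, body in pattern.findall(html):
        try:
            arrays[name] = json.loads(body)
        except json.JSONDecodeError:
            continue  # skip arrays that are not plain JSON
    return arrays


# Hypothetical fragment standing in for one captured state page.
sample = "<script>var cases = [1, 5, 12, 30];</script>"
print(extract_js_arrays(sample))
```

A real page would probably need a more specific pattern, and a spot check of individual records against the source remains necessary, as was done here.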

medRxiv preprint, this version posted May 3, 2020. doi: https://doi.org/10.1101/2020.04.27.20081752. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Let the cumulative number of cases on day $t$ be $N_t$, so the rate constant of population growth $r$ is defined from $\ln N_t = r + \ln N_{t-1}$.
If $r$ is constant through time, growth is exponential, but I do not make this assumption. The [...] and $\ln C_t - \ln N_t = \ln(e^r - 1)$.
Define $y = \ln(e^r - 1)$ as the response variable in a model of the rate of spread of COVID-19. It has a roughly Gaussian distribution, and it has been changing linearly through time over the past 6 weeks in the US. Note that if $r \approx 0.2$ or less, $e^r \approx 1 + r$, so $y \approx \ln r$, and $r$ is approximately the fractional daily increase. When $r \gg 0.2$, $y$ still increases monotonically with $r$, and it is a better choice because of its symmetrical distribution. [...] following analysis is all about that second derivative, or how the rate of increase changes. If growth were exponential, the rate would not change, i.e. the second derivative of $\ln N_t$ would be zero.
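As a rough illustration of the response variable, the sketch below computes y from a series of cumulative counts and converts it back to r and to a fractional daily increase. It assumes that C_t is the number of new cases between day t and day t+1, which is one reading consistent with ln C_t − ln N_t = ln(e^r − 1). The function names and the example counts are hypothetical.

```python
import math


def y_from_cumulative(counts):
    """Return y_t = ln(C_t) - ln(N_t), with C_t = N_{t+1} - N_t (assumed)."""
    ys = []
    for n_now, n_next in zip(counts, counts[1:]):
        new_cases = n_next - n_now
        if n_now > 0 and new_cases > 0:
            ys.append(math.log(new_cases) - math.log(n_now))
        else:
            ys.append(None)  # y is undefined with no cases or no growth
    return ys


def r_from_y(y):
    """Invert y = ln(e^r - 1), giving r = ln(1 + e^y)."""
    return math.log1p(math.exp(y))


def daily_fraction_from_y(y):
    """Fractional daily increase e^r - 1, which equals e^y."""
    return math.exp(y)


# Hypothetical cumulative counts growing by 20% per day.
counts = [100, 120, 144]
ys = y_from_cumulative(counts)
print([round(daily_fraction_from_y(y), 3) for y in ys])
print([round(r_from_y(y), 4) for y in ys])
```

Because e^y is the fractional daily increase, a linear change in y through time corresponds to a proportional change in the daily percentage growth.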

As everyone watching knows, the rate of spread of COVID-19 has been declining, and the model I create here fits that decline as a linear response to time.

The piecewise component of the regression adds the feature that the decline in the rate of spread [...] data, and the model will report a rigorous test of whether or not there is a break, i.e. whether the slope changes.
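A minimal sketch of a single-break piecewise fit follows. It assumes an ordinary least-squares grid search over candidate break days, which is a simpler stand-in for the fit with credible intervals reported here, and it provides no formal test for the break. All function names, the synthetic series, and the day index of the break are hypothetical.

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fit_at_break(t, y, c):
    """Least-squares fit of y = a + b1*min(t-c, 0) + b2*max(t-c, 0) for fixed c."""
    X = [[1.0, min(ti - c, 0.0), max(ti - c, 0.0)] for ti in t]
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(3)]
    a, b1, b2 = solve3(XtX, Xty)
    sse = sum((yi - (a * r[0] + b1 * r[1] + b2 * r[2])) ** 2 for r, yi in zip(X, y))
    return sse, a, b1, b2


def fit_piecewise(t, y, candidates):
    """Pick the candidate break with the smallest residual sum of squares."""
    best = min((fit_at_break(t, y, c) + (c,) for c in candidates), key=lambda f: f[0])
    sse, a, b1, b2, c = best
    return {"break": c, "level_at_break": a, "slope_before": b1,
            "slope_after": b2, "sse": sse}


# Synthetic, noise-free series: slope -0.013 before day 20 and -0.070 after.
t = list(range(41))
y = [-1.63 + (-0.013 if ti < 20 else -0.070) * (ti - 20) for ti in t]
fit = fit_piecewise(t, y, candidates=range(5, 36))
print(fit["break"], round(fit["slope_before"], 3), round(fit["slope_after"], 3))
```

Candidate breaks are kept away from the ends of the series so that each segment has enough points for its slope to be estimated.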

I used a multi-level hierarchical model, also known as a mixed-effects model, in which the 51 states were random effects. This produces an estimate of how the rate of COVID-19 spread has changed through time in every state, but has the benefit of simultaneously using all the states together. This [...]

Increase in total cases. The number of COVID-19 cases increased steadily but at a consistently declining rate (Fig. 1). That is, growth was less than exponential. [...] 19.5 pct. to 7.5 pct. per day in the two weeks after.
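The link between a slope in y and the daily percentage increase can be checked with a little arithmetic. The sketch below assumes that y is the log of the fractional daily increase and that it changes linearly over a 14-day window. It applies the average post-break slope of −0.070 per day to a starting rate of 19.5 pct. per day. The function name is hypothetical.

```python
import math


def project_daily_fraction(start_fraction, slope_per_day, days):
    """Fractional daily increase after `days`, if y = ln(fraction) changes linearly."""
    y_start = math.log(start_fraction)
    return math.exp(y_start + slope_per_day * days)


# Two weeks at the average post-break slope, starting from 19.5 pct. per day.
after = project_daily_fraction(0.195, -0.070, 14)
print(f"{after:.1%}")
```

The result lands a little above 7 pct. per day, in line with the reported decline to about 7.5 pct.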

There was essentially no variation across states in the day on which the slope changed: in all 51 states it was either 24 Mar or 25 Mar, and the credible intervals for all 51 states overlapped. Likewise, the slope prior to the break did not vary significantly among states; all 51 credible intervals overlapped, and the slope was always between −0.16 and −0.010.
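The overlap criterion used for these comparisons can be written down directly. The sketch below treats two states as statistically distinct when their 95% credible intervals do not overlap. The state labels and interval values are hypothetical and are not taken from Table 1.

```python
from itertools import combinations


def overlaps(a, b):
    """True if closed intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


def distinct_pairs(intervals):
    """Return pairs of labels whose intervals do not overlap."""
    items = sorted(intervals.items())
    return [(s1, s2) for (s1, i1), (s2, i2) in combinations(items, 2)
            if not overlaps(i1, i2)]


# Hypothetical post-break slope intervals for three states.
post_break = {
    "A": (-0.110, -0.090),
    "B": (-0.095, -0.070),
    "C": (-0.050, -0.030),
}
pairs = distinct_pairs(post_break)
print(pairs)
```

An empty list from `distinct_pairs` corresponds to the pre-break situation, in which every interval overlapped every other.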

There was, however, statistically significant variation among states in the slope after 24 Mar. [...] There was no correlation between the slope prior to 24 Mar and the slope after that day (Supplemental Fig. S1). This is expected given the lack of variation prior to 24 Mar.

Improvement in phase 2 and case density. There was no significant correlation between the improvement in the rate of spread, as measured by the slope after 24 Mar in each state, and the case density (cases per million) on 24 Mar (Fig. 3). The slope was more negative (better improvement) in states with a higher density of cases, but the regression was not significant (Fig. 3).

Improvement in phase 2 and population density. There was no correlation between the improvement in the rate of spread, as measured by the slope after 24 Mar in each state, and the population size of a state. The regression slope was slightly positive but non-significant ($p = 0.63$, $r^2 \approx 0.01$).
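The kind of check reported here, regressing the post-break slope on a state-level covariate and reading off the regression slope and r², can be sketched as below. The data are hypothetical and the function name is an assumption. A p-value would additionally require a t distribution, for example from a statistics library, and is omitted.

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares line y = intercept + slope*x, plus r squared."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


# Hypothetical covariate (e.g. log population) and post-break slopes.
covariate = [1, 2, 3, 4, 5]
post_slope = [-0.07, -0.09, -0.06, -0.08, -0.07]
slope, intercept, r2 = linear_fit_r2(covariate, post_slope)
print(round(slope, 4), round(r2, 3))
```

With these made-up values the fitted slope is slightly positive and r² is close to zero, the same qualitative pattern as the result described above.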

Alternative models. Three-phase piecewise regression identified one break matching the sharp shift of the two-phase model, plus a later break that was accompanied by no change in slope.

Determining whether state differences in reducing the rate of COVID-19 spread can be attributed to control measures requires information I have not gathered yet. In addition, I suggest that a more precise answer might come from county-level variation, and I intend to repeat this model using county case records. In the meantime, the estimates of improvement shown in Figure 3 and Table 1, and available for download via the Supplement, were created without a priori knowledge about [...]


Figure 2
[...] as 24 Mar, and slopes are given before and after that break (with 95% credible intervals in parentheses). The slopes and the break were estimated using piecewise regression.

Table 1. Columns: State; Slope before 24 Mar (95% CI); Slope after 24 Mar (95% CI).

Figure S1. Improvement in the rate at which COVID-19 spread in each state prior to 24 Mar (horizontal axis) versus after 24 Mar (vertical axis). Thin vertical bars show 95 pct. credible intervals on the second slope estimates; pairs of states whose vertical bars do not overlap are inferred to be statistically distinct. The horizontal axis has a much narrower range, since states barely differed; horizontal credible bars are omitted because every one would extend outside the range of the figure. There was no significant correlation between the two slopes.

Figure S1
