## ABSTRACT

We apply the logistic model to the four waves of COVID-19 taking place in South Africa over the period 3 January 2020 through 14 January 2022. We show that this model provides an excellent fit to the time history of three of the four waves. We then derive a theoretical correlation between the growth rate of each wave and its duration, and demonstrate that it is well obeyed by the South African data.

We then turn to the data for the United States. As shown by Roberts (2020a, 2020b), the logistic model provides only a marginal fit to the early data. Here we break the data into six “waves,” and treat each one separately. Five of the six can be analyzed, and we present full results. We then ask if these data provide a way to predict the length of the ongoing Omicron wave in the US (commonly called “wave 4,” but the sixth wave as we have broken the data up). Comparison of these data to those from South Africa, and internal evaluation of the US data, suggest that this current wave will peak about 18 January 2022, and will be substantially over by about 11 February 2022. The total number of infected persons by the time that the Omicron wave is completely over is projected to be between 22 and 24 million.

## 1. Introduction

Since the spring of 2020 a pandemic caused by a novel coronavirus (SARS-CoV-2) has spread across the world. Epidemiologists have struggled to describe this pandemic adequately and to predict its future course. This is a complex undertaking, involving the mathematics of epidemiology and a huge amount of uncertain data that serve as input to the models. In this paper we examine the ability of a simple epidemiological model, the *logistic model*, to describe the course of the pandemic in the Republic of South Africa and the United States. After showing that it is adequate to the task for South Africa, we derive a theoretical correlation between the growth rate of each wave of the pandemic and its duration, and show that it is well obeyed by the South African data. Finally, we discuss the use of this correlation to predict the course of the fourth (Omicron-driven) wave in the United States, and show that it should die out very soon.

The data are from the World Health Organization (WHO 2022).

## 2. The Logistic Model

### 2.1. Derivation

The exposition in this section is from Roberts (2020a).

A simple model for the evolution of a pandemic is based on the logistic differential equation. This describes a pandemic that begins with a small number *f*_{0} of infected individuals, and subsequently spreads through a population. The motivation for this model is as follows. If the population of infected individuals as a function of time is *f*(*t*), simple exponential growth with growth rate *r* is determined by the differential equation

$$\frac{df}{dt} = r f,$$

with the growing exponential solution

$$f(t) = f_0\, e^{rt},$$

where *f*_{0} = *f*(0). This is what happens with an infinite pool of subjects. However, for a finite pool of subjects, as the population of infected individuals grows the number of subjects available to be infected gets smaller. This is taken into account by modifying the exponential differential equation to become

$$\frac{df}{dt} = r f \left(1 - \frac{f}{K}\right), \tag{1}$$

where *K* is the total available pool of individuals. The solution of this equation^{2} is

$$f(t) = \frac{K f_0\, e^{rt}}{K + f_0 \left(e^{rt} - 1\right)}, \tag{2}$$

which satisfies the required limits *f*(0) = *f*_{0} and *f*(∞) = *K*. The time course described by Equation 2 is the familiar “S curve” used to describe bacterial growth and other phenomena (see Figure 3).
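The limits quoted above, and the fact that the closed-form solution satisfies the logistic differential equation, can be checked numerically. The following is a minimal Python sketch (the paper's own calculations were done in Mathematica). The function names and the parameter values are illustrative assumptions and are not fitted to any data.

```python
import math

def logistic(t, f0, K, r):
    """Cumulative cases f(t) from the closed-form logistic solution."""
    e = math.exp(r * t)
    return K * f0 * e / (K + f0 * (e - 1.0))

def logistic_rhs(f, K, r):
    """Right-hand side of the logistic differential equation, r f (1 - f/K)."""
    return r * f * (1.0 - f / K)

# Illustrative parameter values only (assumed, not taken from the paper).
f0, K, r = 100.0, 1.0e6, 0.1

# Limits: f(0) = f0, and f approaches K at late times.
start = logistic(0.0, f0, K, r)
late = logistic(400.0, f0, K, r)

# Central finite difference of f(t), compared with r f (1 - f/K) at one time.
t, h = 80.0, 1.0e-4
slope = (logistic(t + h, f0, K, r) - logistic(t - h, f0, K, r)) / (2.0 * h)
rhs = logistic_rhs(logistic(t, f0, K, r), K, r)

print(start, late, slope, rhs)
```

The finite-difference slope and the right-hand side agree to within the accuracy of the difference scheme, which is what one expects if the closed form solves the equation.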

An analysis of the total number of cases as a function of time *f*(*t*) is just one way to compare model and data. Instead we can examine the number of new cases per day as a function of time; this is the time derivative of *f*(*t*).^{3} This is easily found by substituting Equation 2 into Equation 1,

$$\frac{df}{dt} = \frac{r K f_0 \left(K - f_0\right) e^{rt}}{\left[K + f_0 \left(e^{rt} - 1\right)\right]^{2}}. \tag{3}$$

In either case, for the logistic model the three parameters to be adjusted are *f*_{0}, *K*, and *r*. For waves with a substantial ongoing level of daily infections, the model is augmented by adding a baseline level *N*_{0} that is also solved for from the data,

$$\frac{df}{dt} = N_0 + \frac{r K f_0 \left(K - f_0\right) e^{rt}}{\left[K + f_0 \left(e^{rt} - 1\right)\right]^{2}}. \tag{4}$$
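As an illustration of the daily-cases form of the model, the sketch below evaluates the logistic derivative with an optional constant baseline. It assumes that the baseline *N*_{0} enters as a simple additive constant; the function name `daily_cases` and all numerical values are hypothetical.

```python
import math

def daily_cases(t, f0, K, r, N0=0.0):
    """Daily new cases: assumed additive baseline N0 plus the logistic derivative."""
    e = math.exp(r * t)
    return N0 + r * K * f0 * (K - f0) * e / (K + f0 * (e - 1.0)) ** 2

# Illustrative values only (assumed, not fitted to any data set).
f0, K, r, N0 = 100.0, 1.0e6, 0.1, 500.0

# Time at which the logistic term is largest, where f0 * exp(r t) = K - f0.
t_peak = math.log((K - f0) / f0) / r
peak_value = daily_cases(t_peak, f0, K, r, N0)  # baseline plus r*K/4
far_tail = daily_cases(400.0, f0, K, r, N0)     # relaxes back to the baseline

print(t_peak, peak_value, far_tail)
```

With these assumptions the curve rises from the baseline to a maximum of *N*_{0} + *rK*/4 and then returns to the baseline, so *N*_{0} shifts the curve vertically without changing its width.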

### 2.2. COVID data for the United States and the Republic of South Africa

The plots in this section show the total and daily distributions of the number of cases of COVID-19 in the United States and in South Africa (WHO 2022). Day 1 is 03 January 2020.

### 2.3. Fit to the COVID-19 Pandemic in the Republic of South Africa

When applied to the ongoing COVID-19 pandemic in the United States the logistic model does only a fair job of accounting for the actual history of the total number of infected individuals as a function of time in the first wave of COVID-19, which took place in the first half of 2020 (see Roberts 2020a). It fails even more spectacularly for later waves (Roberts 2020b).

In this section we apply the logistic model to the four waves in South Africa. Solutions were found by numerical minimization of either the sum of the squares of the differences between model and data (least squares fit, or “LSQ”), or the sum of the absolute values of the differences (“L1”). These solutions were compared to plots of the seven-day moving average of the data for each wave, and discordant solutions were discarded.^{4}
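The two objective functions and the smoothing step can be sketched as follows. This is not the procedure used for the fits reported here (those were done in Mathematica); it is a minimal Python illustration on synthetic, noise-free data. The parameter values, the centered form of the moving average, and the deliberately crude one-parameter grid search are all assumptions made for the example.

```python
import math

def daily_model(t, f0, K, r):
    """Logistic daily-cases curve (time derivative of the cumulative logistic)."""
    e = math.exp(r * t)
    return r * K * f0 * (K - f0) * e / (K + f0 * (e - 1.0)) ** 2

def lsq_loss(params, days, cases):
    """Sum of squared differences between model and data ("LSQ")."""
    return sum((daily_model(t, *params) - y) ** 2 for t, y in zip(days, cases))

def l1_loss(params, days, cases):
    """Sum of absolute differences between model and data ("L1")."""
    return sum(abs(daily_model(t, *params) - y) for t, y in zip(days, cases))

def moving_average(values, window=7):
    """Centered moving average (assumed form); the window shrinks at the ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Synthetic "data" generated from known, hypothetical parameters.
true_f0, true_K, true_r = 50.0, 2.0e5, 0.12
days = list(range(150))
cases = [daily_model(t, true_f0, true_K, true_r) for t in days]

# Crude search: scan r on a grid with f0 and K held at their true values.
# A real fit would minimize over all parameters simultaneously.
grid = [0.05 + 0.001 * i for i in range(151)]
r_lsq = min(grid, key=lambda r: lsq_loss((true_f0, true_K, r), days, cases))
r_l1 = min(grid, key=lambda r: l1_loss((true_f0, true_K, r), days, cases))
smoothed = moving_average(cases)

print(r_lsq, r_l1, len(smoothed))
```

On noise-free data both objectives recover the same growth rate. With real data the two generally differ, because the L1 objective gives less weight to outlying days than the LSQ objective does.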

### 2.4. RSA Wave 1

### 2.5. RSA Wave 2

### 2.6. RSA Wave 3

### 2.7. RSA Wave 4

## 3. Correlation of the Growth Rate and Duration for the COVID-19 Waves in the Republic of South Africa

It is natural that the growth rate and the duration of the waves of a pandemic should be correlated, higher growth rates leading to shorter durations. It can be shown (see the Appendix) that for a logistic model the daily cases distribution has a full-width-half-maximum (FWHM) given by

$$\mathrm{FWHM} = \frac{2}{r}\ln\!\left(3 + 2\sqrt{2}\right) \approx \frac{3.53}{r}. \tag{5}$$

Note that this result is independent of *f*_{0} and *K*. The data and the prediction of Eq. 5 are compared in Fig. 7, where we see that the agreement is excellent. The full-width-half-maxima were estimated from curves of seven-day moving averages of the daily data. The values of *r* for each wave were found by averaging those of the LSQ and L1 fits.
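The independence of the width from *f*_{0} and *K* can be checked numerically. The sketch below compares the closed form FWHM = (2/*r*) ln(3 + 2√2) ≈ 3.53/*r*, the standard result for the derivative of a logistic curve, with a width measured directly from the curve. The function names and test parameters are illustrative assumptions.

```python
import math

def daily_model(t, f0, K, r):
    """Logistic daily-cases curve."""
    e = math.exp(r * t)
    return r * K * f0 * (K - f0) * e / (K + f0 * (e - 1.0)) ** 2

def fwhm_formula(r):
    """Closed-form full width at half maximum, (2/r) ln(3 + 2 sqrt(2))."""
    return 2.0 * math.log(3.0 + 2.0 * math.sqrt(2.0)) / r

def fwhm_numeric(f0, K, r, step=0.001):
    """Step outward from the peak until the curve drops to half its maximum."""
    t_peak = math.log((K - f0) / f0) / r
    half = 0.5 * daily_model(t_peak, f0, K, r)
    left = right = t_peak
    while daily_model(left, f0, K, r) > half:
        left -= step
    while daily_model(right, f0, K, r) > half:
        right += step
    return right - left

# Two very different (hypothetical) waves with the same growth rate.
width_a = fwhm_numeric(100.0, 1.0e6, 0.1)
width_b = fwhm_numeric(10.0, 5.0e4, 0.1)
print(width_a, width_b, fwhm_formula(0.1))
```

Both measured widths match the closed form to within the step size, even though the two curves differ in height by more than an order of magnitude.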

## 4. Six Waves in the United States

In this section we apply the logistic model to five of the waves of COVID-19 in the United States, using the techniques described above. The fourth wave (around day 450 on Fig. 1) is omitted because we found it impossible to determine its full-width-half-maximum in a reliable way.

### 4.1. USA Wave 1

### 4.2. USA Wave 2

### 4.3. USA Wave 3

### 4.4. USA Wave 5

## 5. Predictions for United States Wave 6

### 5.1. Comparison of Theory and Data

Fig. 14 shows the correlation between growth and duration of the various waves in the United States. The curve is the theoretical prediction of Eq. 5, identical to the one in Fig. 7. The agreement is not as good as for the South African data, but this is to be expected because measurement of the FWHM of the data is very uncertain for two of the waves (examine Figs. 9 & 11).

### 5.2. Prediction for the Omicron Wave in the United States

#### 5.2.1. Predictions from the Data

In Figs. 15 & 16 we show the predictions for wave 6 that follow from the L1 and LSQ solutions. The two provide roughly equivalent fits to the data (see Figs. 12 & 13), so we show the predictions made from each to exhibit some of the uncertainty. Examining the plot of the moving average of the wave 6 daily cases data we find a peak at day 95 and a FWHM of 25 days. From Fig. 15 we see that the total number of cases by the end of wave 6 is predicted to be 24 million.

We rather arbitrarily define the date at which the wave will be substantially over to be at one full-width-half-maximum past the peak, when the daily number of cases will be about 11% of its maximum. Thus using these data we expect that to be about day 120, which is 10 February 2022.
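The 11% figure follows from the shape of the logistic daily curve alone. Relative to its peak, the curve can be written as sech²(*x*/2) with *x* = *r*(*t* − *t*_peak), and *r* times the FWHM is the pure number 2 ln(3 + 2√2), so the growth rate drops out. A short Python check, in which the variable names are illustrative:

```python
import math

def relative_height(x):
    """Logistic daily curve relative to its peak; x = r * (t - t_peak)."""
    return 1.0 / math.cosh(0.5 * x) ** 2

# r times the FWHM is a pure number, so the result holds for any growth rate.
r_times_fwhm = 2.0 * math.log(3.0 + 2.0 * math.sqrt(2.0))

at_half_width = relative_height(0.5 * r_times_fwhm)  # edge of the FWHM
at_full_width = relative_height(r_times_fwhm)        # one FWHM past the peak

print(at_half_width, at_full_width)
```

The first value is one half, as it must be at the edge of the FWHM. The second is 1/9, or about 11%, which is the quoted level one full width past the peak.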

#### 5.2.2. Predictions from the Theory

We can use Eqs. 8 & 10 and the values of *f*_{0}, *r*, and *K* to find the expected date of the peak of wave 6 and its FWHM. Adopting the LSQ fit to this wave (*f*_{0} = 71.5, *K* = 2.38 × 10^{7}, *r* = 0.133 days^{−1}) this leads to an expected day of peak daily cases at 95.6 days and an expected FWHM of 26.5 days, and thus a date of substantial diminution of day 122 (12 February 2022). For the L1 solution (*f*_{0} = 51.5, *K* = 2.20 × 10^{7}, *r* = 0.138 days^{−1}) these are 93.9 days and 25.6 days, leading to a date of day 120 (10 February 2022). These are in good accord with the estimates obtained above. Thus this approach suggests that wave 6 will be substantially over by day 121, or 11 February 2022.
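The arithmetic in this paragraph can be reproduced directly from the quoted fit parameters. The sketch below assumes the standard logistic results for the peak time, *t*_peak = (1/*r*) ln[(*K* − *f*_{0})/*f*_{0}], and for the width, FWHM = (2/*r*) ln(3 + 2√2); the function names are illustrative.

```python
import math

def peak_day(f0, K, r):
    """Day of maximum daily cases for a logistic wave."""
    return math.log((K - f0) / f0) / r

def fwhm(r):
    """Full width at half maximum of the daily-cases curve, in days."""
    return 2.0 * math.log(3.0 + 2.0 * math.sqrt(2.0)) / r

# Fit parameters (f0, K, r) for wave 6 as quoted in the text.
fits = {
    "LSQ": (71.5, 2.38e7, 0.133),
    "L1": (51.5, 2.20e7, 0.138),
}

results = {}
for name, (f0, K, r) in fits.items():
    t_peak = peak_day(f0, K, r)
    width = fwhm(r)
    # "Substantially over" is taken as one FWHM past the peak.
    results[name] = (t_peak, width, t_peak + width)
    print(f"{name}: peak day {t_peak:.1f}, FWHM {width:.1f} days, "
          f"substantially over by day {t_peak + width:.1f}")
```

Because the parameters are quoted to only three significant figures, the computed values can differ from those in the text in the last decimal place.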

## 6. Conclusions

In this paper we have used the logistic model for epidemics to describe the COVID-19 outbreaks in the Republic of South Africa and in the United States. We find a universal analytic relationship between the growth rates and the durations of each wave, and this is closely followed by the (very clean) data from South Africa. When applied to the United States, the data are messier, but a tentative prediction is possible for the expected duration of the current Omicron wave in the US – it should be substantially over by about day 121, or about 11 February 2022. By the full end of the Omicron wave the total number of infected persons in the US is projected to be between 22 and 24 million.

## 7. Acknowledgements

We thank Brian Boyle, Mary Roberts, and Bob Sauer for their helpful comments and suggestions.

## 9. Appendix: Derivation of Equation 5

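The differential distribution for the logistic function is^{5}

$$\frac{df}{dt} = \frac{r K f_0 \left(K - f_0\right) e^{rt}}{\left[K + f_0 \left(e^{rt} - 1\right)\right]^{2}}, \tag{6}$$

and its derivative is

$$\frac{d^{2}f}{dt^{2}} = \frac{r^{2} K f_0 \left(K - f_0\right) e^{rt} \left[K - f_0 \left(e^{rt} + 1\right)\right]}{\left[K + f_0 \left(e^{rt} - 1\right)\right]^{3}}. \tag{7}$$

This is zero at

$$t_{\mathrm{peak}} = \frac{1}{r}\ln\!\left(\frac{K - f_0}{f_0}\right), \tag{8}$$

and *df*/*dt* at this maximum is

$$\left.\frac{df}{dt}\right|_{\max} = \frac{rK}{4}. \tag{9}$$

Setting *df*/*dt* equal to half of this value and solving for the locations yields a complicated expression that reduces to

$$\mathrm{FWHM} = \frac{2}{r}\ln\!\left(3 + 2\sqrt{2}\right) \approx \frac{3.53}{r}. \tag{10}$$

Note that neither *f*_{0} nor *K* enters this result.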
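The intermediate algebra can be shortened with a substitution. The following worked step is one possible route to the result, not necessarily the one followed in the original calculation; the shorthand *x* and the subscripts ± are introduced here for illustration only.

```latex
% Shorthand (assumed): x = e^{r (t - t_peak)},  t_peak = (1/r) ln[(K - f_0)/f_0]
\frac{df/dt}{(df/dt)_{\max}} = \frac{4x}{(1+x)^{2}} = \frac{1}{2}
\;\Longrightarrow\; x^{2} - 6x + 1 = 0
\;\Longrightarrow\; x_{\pm} = 3 \pm 2\sqrt{2},
\qquad
\mathrm{FWHM} = t_{+} - t_{-} = \frac{1}{r}\ln\frac{x_{+}}{x_{-}}
              = \frac{2}{r}\ln\!\left(3 + 2\sqrt{2}\right) \approx \frac{3.53}{r}.
```

Since *x*₊*x*₋ = 1, the two half-maximum points lie symmetrically about the peak, and both *f*_{0} and *K* cancel out of the ratio before the quadratic is solved.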

## Footnotes

roberts{at}brandeis.edu

Substantial additional explanatory material was added and several minor corrections made. A quantitative definition of the "end" of the Omicron wave was added, leading to a date of 11 February 2022. Otherwise the conclusions are unaltered.

^{2}This is a version of the Bernoulli differential equation *f*′ = *f*(1 − *f*).

^{3}Fitting the daily numbers is statistically the preferred procedure as the data points are independent, unlike those for daily totals.

^{4}In a numerical minimization of a non-linear function of multiple variables there is always a possibility of erroneous solutions.

^{5}All the calculations were done with Mathematica.