COVID-19 Trend and Forecast in India: A Joinpoint Regression Analysis

This paper analyses the trend in daily reported confirmed cases of COVID-19 in India using joinpoint regression analysis. The analysis reveals that the nation-wide lockdown and its subsequent extensions have had little impact on the progress of the COVID-19 epidemic in the country, and there is no empirical evidence to suggest that the relaxations under the third and the fourth phases of the lockdown have resulted in a spike in the reported confirmed cases of COVID-19. The analysis also suggests that, if the current trend continues in the immediate future, the daily reported confirmed cases of COVID-19 in the country are likely to increase to around 21 thousand by 15 June 2020, whereas the total number of confirmed cases of COVID-19 will increase to around 422 thousand.


Background
The total number of reported confirmed cases of COVID-19 in India crossed the 100 thousand mark on 19 May 2020 according to the database maintained by the World Health Organization. The first confirmed COVID-19 case in India was reported on 30 January 2020, but no confirmed COVID-19 case was reported during 4 February 2020 through 1 March 2020. By 15 March 2020, more than 100 confirmed cases of COVID-19 had been reported, which increased to 500 by 24 March 2020 when the nation-wide lockdown was announced in the country. Since then, the total number of reported confirmed cases crossed the 10000 mark by 14 April 2020 and the 50000 mark by 7 May 2020. An analysis of the trend in the daily reported confirmed cases of COVID-19 may provide an idea about how the COVID-19 epidemic has progressed in the country. The trend analysis also permits forecasting the likely trend in the reported confirmed cases of COVID-19 in the immediate future.
A trend analysis of daily reported confirmed cases of COVID-19 is also needed because it is widely claimed that the imposition of the nation-wide lockdown on 25 March 2020 has significantly decelerated the progress of the COVID-19 epidemic in the country. It has also been claimed that loosening the restrictions under the nation-wide lockdown during its third and fourth phases has primarily been responsible for the recently witnessed spike in the number of daily reported confirmed cases of COVID-19 in the country. It has even been argued that reimposing harsh restrictions as part of the nation-wide lockdown is the only way of stopping or decelerating the progress of the COVID-19 epidemic, despite the fact that the social and economic cost of the nation-wide lockdown has been found to be quite complex and exorbitant. It has repeatedly been stressed that, because of its serious social and economic implications, the nation-wide lockdown cannot be prolonged.
One way of empirically examining these and many other claims regarding the progress of the epidemic is to test whether the trend in the daily reported confirmed cases of COVID-19 has changed after the imposition of the nation-wide lockdown or after loosening the restrictions under the nation-wide lockdown. If the trend in the daily reported confirmed cases of COVID-19 has changed after the imposition of the nation-wide lockdown, then it can be concluded that the nation-wide lockdown has indeed had an impact on the progress of the epidemic. Similarly, if it is found that the trend in the daily reported confirmed cases of COVID-19 has changed after loosening the restrictions under the nation-wide lockdown, then it can be concluded that loosening of the restrictions has been responsible for the spike in reported confirmed COVID-19 cases in the country. However, if there is no change in the trend, then there is little empirical evidence to suggest that either the nation-wide lockdown or the loosening of restrictions under it has had any telling impact on the progress of the COVID-19 epidemic.
A review of the daily reporting of the confirmed COVID-19 cases in India reveals that during the 28 days from 4 February 2020 through 1 March 2020, no confirmed case of COVID-19 was reported in the country (Table 1). Moreover, during the period 2 March 2020 through 31 March 2020, daily reporting of confirmed COVID-19 cases has been highly inconsistent. For example, no confirmed case of COVID-19 was reported on 3 March, 20 March and 28 March 2020, whereas on 29 March 2020 alone, 255 confirmed cases of COVID-19 were reported. These inconsistencies in the reporting of daily confirmed cases of COVID-19 may bias any analysis of the trend in the daily reported confirmed cases of COVID-19. It is therefore necessary that these irregular fluctuations in the daily reporting of confirmed cases of COVID-19 are ironed out before any analysis of the trend is carried out.
One approach to minimise the impact of reporting inconsistencies on the analysis of the trend in daily reporting of confirmed COVID-19 cases is to use a moving average instead of the actual daily reported confirmed cases of COVID-19. The same approach has been followed in the present analysis. To minimise irregular fluctuations in the reporting of COVID-19 cases, a five-day moving average has been used for the trend analysis. In other words, the reported confirmed cases of COVID-19 in a day used in the present analysis are actually the average of the reported confirmed cases of COVID-19 on the two days prior to the day in question, on the two days after the day in question, and on the day itself. For example, the reported confirmed cases of COVID-19 on 3 March 2020 used in the present analysis are actually the simple average of reported confirmed cases of COVID-19 on 1 March through 5 March 2020.
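The centred five-day moving average described above can be sketched as follows. This is a minimal illustration rather than the code used for the analysis: the function name `centred_moving_average` and the sample counts are hypothetical, and days without a complete five-day window are simply left as `None`.

```python
from typing import List, Optional


def centred_moving_average(values: List[float], window: int = 5) -> List[Optional[float]]:
    """Centred moving average; None where a full window is not available."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed: List[Optional[float]] = [None] * len(values)
    for i in range(half, len(values) - half):
        # Average of the day itself, the `half` days before and the `half` days after.
        smoothed[i] = sum(values[i - half:i + half + 1]) / window
    return smoothed


# Hypothetical daily counts for seven consecutive days (not the WHO data).
daily_counts = [0, 2, 1, 0, 22, 2, 1]
smoothed_counts = centred_moving_average(daily_counts)
print(smoothed_counts)
```

With a five-day window the first two and the last two days of any series have no smoothed value, which is why a series starting on 1 March yields a smoothed series starting on 3 March.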
This paper analyses the trend in daily reported confirmed cases of COVID-19 in the country using joinpoint regression (Kim et al, 2000). Joinpoint regression is used to study the trend that varies over time. It first identifies the time point(s) at which the trend in the reported confirmed cases of COVID-19 has changed or the joinpoint(s). Once the joinpoint(s) are identified, then the average per cent change between two joinpoints is calculated to reflect how the trend in the reported confirmed cases of COVID-19 has varied over time. The goal of the joinpoint regression analysis is not to provide the statistical model that best fits the time series data. Rather, the purpose of the joinpoint regression analysis is to provide the model that best summarises the trend in the data (Marrot, 2010). The underlying assumption of joinpoint regression is that trend in the data is not the same throughout the period under reference.

Joinpoint Regression Model
The joinpoint regression model is essentially different from the conventional piecewise or segmented regression model in the sense that the joinpoint(s) and their location(s) are estimated within the model and are not set arbitrarily, as is the case with piecewise or segmented regression analysis. The minimum and the maximum number of joinpoint(s) are, however, set in advance, but the final number of joinpoint(s), or the time point(s) when the trend changes, is determined statistically. The model first identifies the time point(s) when there is a change in the trend and then calculates the average percentage change (APC), which reflects the rate of change between two joinpoints. When the number of joinpoints is zero, the model reduces to the simple linear regression model.
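Let $y_i$ denote the daily reported confirmed cases of COVID-19 (five-day moving average) on day $t_i$, such that $t_1 < t_2 < \dots < t_n$. Then the joinpoint regression model is defined as

$$\ln y_i = \alpha + \beta_1 t_i + \delta_1 u_1 + \delta_2 u_2 + \dots + \delta_j u_j + \varepsilon_i \qquad (1)$$

where $u_m = (t_i - k_m)$ if $t_i > k_m$ and $u_m = 0$ otherwise, for $m = 1, \dots, j$, and $k_1 < k_2 < \dots < k_j$ are joinpoints. The details of joinpoint regression analysis are given elsewhere (Kim et al, 2000; Kim et al, 2004).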
Joinpoint regression analysis has commonly been used when the temporal trend of a given quantity, such as incidence, prevalence or mortality, is of interest (Tyczynski and Berkel, 2005; Doucet, Rochette and Hamel, 2016; John and Hanke, 2015; Chaurasia, 2020). However, this method has generally been applied with the calendar year as the time scale (Akinyede and Soyemi, 2016; Mogos et al, 2016; Chatenoud et al, 2015; Missikpode et al, 2015). Joinpoint regression analysis can also be applied in epidemiological studies in which the starting date can be easily established, such as the day when the disease is detected for the first time, as is the case in the present analysis. Joinpoint regression analysis can, therefore, also be applied in the public health framework to explore whether the number of reported cases of a specific disease has decreased after the introduction of an intervention to check the disease. In the present context, the application of joinpoint regression analysis can answer the question of whether or not the imposition of the nation-wide lockdown has resulted in a decrease in the reported confirmed cases of COVID-19 in the country. If the day of introducing the intervention turns out to be a joinpoint, then a comparison of the daily per cent change in the reported confirmed cases of COVID-19 before and after the joinpoint can tell whether or not the intervention has been able to bring down the reported confirmed cases of COVID-19.
Actual calculations in the present analysis were carried out using the Joinpoint Regression Program developed by the Statistical Research and Application Branch of the National Cancer Institute of the United States of America (NIC, 2013). The software requires specification of the minimum (0) and the maximum (>0) number of joinpoints in advance. In the present analysis, the minimum number of joinpoints is specified as 0 while the maximum number of joinpoints has been specified as 5. The programme starts with the minimum number of joinpoints (0, which is a straight line, so that the model is the simple linear regression model) and tests whether more joinpoints are statistically significant and must be added to the model (up to the pre-specified maximum number of joinpoints). The test of significance is based on a Monte Carlo permutation method (Kim et al, 2000).
The Bayesian Information Criterion (BIC) was used to identify the number of joinpoints in the model. There are other methods for this purpose also. These include the permutation test and the data-driven BIC methods. The relative merits and demerits of the different methods are discussed elsewhere (NIC, 2013). The permutation method is regarded as the best method, but it is computationally very intensive. The BIC method, on the other hand, is less computationally intensive. This method selects the model for which the objective function, which is the sum of the model fit error and the penalty term, is minimised.
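The idea of choosing the number and location of joinpoints by minimising a BIC-type objective can be sketched as below. This is an illustrative grid search on synthetic data, not the Joinpoint Regression Program's implementation. It assumes a log-linear model and a BIC of the form ln(SSE/n) + (2(k+1)/n) ln(n) for k joinpoints; the function names, the floor on the SSE and the synthetic series are all hypothetical.

```python
import itertools
import math
from typing import List, Sequence, Tuple


def solve_least_squares(X: List[List[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations and Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[a] * row[b] for row in X) for b in range(p)]
        + [sum(row[a] * yi for row, yi in zip(X, y))]
        for a in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * z for x, z in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def fit_joinpoints(t: Sequence[int], y: Sequence[float],
                   max_joinpoints: int = 2) -> Tuple[Tuple[int, ...], List[float]]:
    """Grid search over joinpoint locations; returns (joinpoints, coefficients)."""
    log_y = [math.log(v) for v in y]
    n = len(t)
    candidates = t[2:-2]  # keep joinpoints away from the ends of the series
    best = None
    for k in range(max_joinpoints + 1):
        for knots in itertools.combinations(candidates, k):
            # Columns: intercept, time, and one hinge term (t - knot)+ per joinpoint.
            X = [[1.0, float(ti)] + [max(ti - kn, 0.0) for kn in knots] for ti in t]
            beta = solve_least_squares(X, log_y)
            sse = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
                      for row, yi in zip(X, log_y))
            sse = max(sse, 1e-12)  # assumed floor so that ln(SSE) is always defined
            bic = math.log(sse / n) + (2 * (k + 1) / n) * math.log(n)
            if best is None or bic < best[0]:
                best = (bic, knots, beta)
    return best[1], best[2]


# Synthetic series: log-slope 0.30 up to day 10 and 0.05 afterwards.
days = list(range(1, 21))
cases = [math.exp(1.0 + 0.3 * d - 0.25 * max(d - 10, 0)) for d in days]
knots, beta = fit_joinpoints(days, cases)
print(knots, beta)
```

The penalty term grows with the number of joinpoints, so an extra joinpoint is retained only when it reduces the fit error by more than the penalty it adds.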

Trend in Daily Reported Confirmed Cases of COVID-19
Results of the joinpoint regression analysis of the five-day moving average of the daily reported confirmed cases of COVID-19 in India for the period 1 March 2020 through 23 May 2020 are summarised in Table 2 and Figure 1. The five-day moving average is centred at the mid-point of the five-day interval. For example, the five-day moving average of the period 1 March through 5 March 2020 is centred on 3 March 2020. In other words, the joinpoint regression analysis is carried out for the period 3 March 2020 through 21 May 2020, although it covers the data on daily reported confirmed cases of COVID-19 from 1 March 2020 through 23 May 2020. The period prior to 1 March 2020 has not been included in the analysis as the daily reported confirmed cases of COVID-19 during the period 30 January 2020 through 1 March 2020 have mostly been found to be zero.
The application of the joinpoint regression analysis divides the duration 1 March 2020 through 23 May 2020, a period of 84 days, into six time segments, and the trend in the daily reported confirmed cases of COVID-19 is found to be different in different time segments. During the first five days of the period under reference, from 3 March 2020 (day 1) through 7 March 2020 (day 5), the trend in the daily reported confirmed cases of COVID-19 in the country has been found to be negative, which means that daily reported confirmed cases of COVID-19 in the country actually decreased, instead of increasing, during this period, on average, at a rate of around 8 per cent per day. This decrease in the reported confirmed cases of COVID-19 may be attributed to reporting inconsistencies. The daily per cent change during this period has, however, been found to be statistically significant. On the other hand, during the next 10 days, from 7 March 2020 (day 5) through 16 March 2020 (day 14), the daily reported confirmed cases of COVID-19 increased, on average, at a rate of almost 15 per cent per day. The increase in the daily reported confirmed cases of COVID-19 accelerated further during the next eight days, from 16 March 2020 (day 14) through 23 March 2020 (day 21), when the daily reported confirmed cases of COVID-19 in the country increased at a rate of more than 28 per cent per day. However, the increase in the daily reported confirmed cases of COVID-19 decelerated during the period 23 March 2020 (day 21) through 26 March 2020 (day 24) at a rate of almost 9 per cent per day, although the daily per cent change was not found to be statistically significant. The daily reported confirmed cases of COVID-19 increased again at a rate of 27 per cent per day, on average, during the next 10 days, from 26 March 2020 (day 24) through 4 April 2020 (day 33).
After 4 April 2020, however, there has been no change in the trend in the daily reported number of confirmed cases of COVID-19 till 21 May 2020. During 4 April 2020 through 21 May 2020, the daily reported confirmed cases of COVID-19 increased, on average, at a rate of almost 5.2 per cent per day.
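The relation between the fitted slope of a segment and the daily per cent change quoted above can be written out explicitly. The following is a sketch that assumes a log-linear model within each segment; the symbol $\beta$ for the within-segment slope and the doubling time $T_2$ are notation introduced here for illustration only.

```latex
\ln y_t = \alpha + \beta t
\;\Rightarrow\;
\frac{y_{t+1}}{y_t} = e^{\beta},
\qquad
\text{daily per cent change} = 100\,(e^{\beta} - 1).

% For the segment after 4 April 2020, a daily change of about 5.2 per cent gives
\beta = \ln(1.052) \approx 0.0507,
\qquad
T_2 = \frac{\ln 2}{\beta} \approx 13.7 \text{ days}.
```

Under these assumptions, a constant increase of about 5.2 per cent per day corresponds to daily reported cases doubling roughly every two weeks.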
The joinpoint regression analysis suggests that the trend in the daily reported confirmed cases of COVID-19 changed statistically significantly on the 5th day (7 March 2020), the 14th day (16 March 2020), the 24th day (26 March 2020) and the 33rd day (4 April 2020) of the 80-day period beginning 3 March 2020. The nation-wide lockdown in the country was imposed on 25 March 2020, initially for a period of 21 days, and was then extended to 3 May 2020 on 15 April 2020. On 4 May 2020, the lockdown was again extended up to 17 May 2020 but with a relaxed set of restrictions and, on 18 May 2020, it was further extended to 31 May 2020 with an even more relaxed set of restrictions. The analysis suggests that the day of the fifth change in the trend in daily reported confirmed cases of COVID-19 only matched with the date of the third extension of the nation-wide lockdown. The day of the change in the trend in daily reported confirmed cases of COVID-19 has not been found to be linked with the imposition of the nation-wide lockdown on 25 March 2020 or with its extension on 15 April 2020. There has also been no change in the trend in the daily reported confirmed cases of COVID-19 after the fourth extension of the nation-wide lockdown on 18 May 2020, when restrictions under the nation-wide lockdown were significantly loosened.
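The correspondence between day numbers and calendar dates used in this section can be checked with a short script. This is only a convenience sketch, assuming that day 1 is 3 March 2020 as stated in the text; the helper names `day_index` and `date_of_day` are hypothetical.

```python
from datetime import date, timedelta

# Day 1 of the analysis period, as stated in the text.
START = date(2020, 3, 3)


def day_index(d: date) -> int:
    """Day number of a calendar date, counting 3 March 2020 as day 1."""
    return (d - START).days + 1


def date_of_day(n: int) -> date:
    """Calendar date corresponding to day number n."""
    return START + timedelta(days=n - 1)


for n in (5, 14, 21, 24, 33, 80):
    print(n, date_of_day(n).isoformat())
```

On this numbering, day 33 falls on 4 April 2020 and day 80 on 21 May 2020, the last day for which a centred five-day average is available.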
After 4 April 2020, the daily reported confirmed cases of COVID-19 in the country have been found to have increased, on average, at almost 5.2 per cent per day. In other words, the joinpoint regression analysis of the trend in daily reported confirmed cases of COVID-19 does not support the claim that the imposition of the nation-wide lockdown on 25 March 2020 resulted in a deceleration in the increase in daily reported confirmed cases of COVID-19. At the same time, the trend analysis also does not support the claim that the relaxations in the restrictions under the nation-wide lockdown have resulted in a spike in daily reported confirmed cases of COVID-19.

Forecasting Number of COVID-19 Cases
The average daily per cent change in the reported confirmed cases of COVID-19 during the period 4 April 2020 through 21 May 2020 permits forecasting the daily reported confirmed cases of COVID-19 under the assumption that there is no change in the trend. This exercise suggests that by 15 June 2020, the daily reported confirmed cases of COVID-19 will increase to 21243 with a 95 per cent confidence interval of 18246 to 24725 (Table 3 and Figure 2). This increase in the reported confirmed cases of COVID-19 may change only when there is a significant change in the trend in the reported confirmed cases of COVID-19. A significant change in the trend is possible only when a new set of interventions is introduced to combat the COVID-19 epidemic in the country. It is already being emphasised that the nation-wide lockdown imposed on 25 March 2020 is now becoming increasingly irrelevant in checking the progress of the epidemic because of a host of factors, the most important of which is that the nation-wide lockdown could not prevent large-scale movement, especially of migrant workers, from urban areas to the rural hinterland. It is, therefore, being stressed that population-wide testing for COVID-19, followed by active contact tracing and isolation of the positive cases and their contacts, is necessary to stop the increase in, and even to decrease, the reported confirmed cases of COVID-19 in the country. The need for such a strategy stems from the fact that almost 40 per cent of the individuals who tested positive for COVID-19 are found to be asymptomatic. Chaurasia (2020a) has suggested a cluster-based approach to population-wide testing for COVID-19 which significantly reduces the number of tests to be done.
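The projection logic described above, a constant daily per cent change compounded forward from the last fitted day, can be sketched as follows. This is an illustration under the stated no-change-in-trend assumption, not the computation behind Table 3: the function name is hypothetical, and the baseline of 6000 daily cases is a placeholder value rather than a figure taken from the analysis.

```python
from datetime import date


def project_daily_cases(baseline: float, daily_pct_change: float,
                        start: date, target: date) -> float:
    """Compound a constant daily per cent change from `start` to `target`."""
    days_ahead = (target - start).days
    if days_ahead < 0:
        raise ValueError("target date must not precede start date")
    return baseline * (1 + daily_pct_change / 100) ** days_ahead


# Placeholder baseline for 21 May 2020 (hypothetical, not from the analysis).
projected = project_daily_cases(6000.0, 5.2, date(2020, 5, 21), date(2020, 6, 15))
print(round(projected))
```

Because the growth is compounded over 25 days, small differences in the assumed daily per cent change translate into large differences in the projected count, which is one reason the confidence interval of the forecast is wide.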
The forecast of the daily reported confirmed cases of COVID-19 on the basis of the joinpoint regression analysis presented here also suggests that the total number of confirmed COVID-19 cases in the country is likely to increase to almost 422 thousand by 15 June 2020, with a 95 per cent confidence interval of around 376 thousand to around 473 thousand. According to the latest information available from the database maintained by the World Health Organization, the total number of confirmed COVID-19 cases in the country had crossed the 138 thousand mark by 25 May 2020. This implies that during the next 20 days, there will most probably be around 284 thousand additional COVID-19 cases in the country. This scenario can be changed by introducing appropriate interventions to halt or even reverse the progress of the epidemic. The good sign, however, is that the recovery rate of the disease in the country is quite encouraging, while the risk of death from the disease is quite low by international standards.

medRxiv preprint doi: https://doi.org/10.1101/2020.05.26.20113399; this version posted June 2, 2020. The copyright holder has placed this preprint (which was not certified by peer review) in the Public Domain. It is no longer restricted by copyright. Anyone can legally share, reuse, remix, or adapt this material for any purpose without crediting the original authors.

Conclusions
The present analysis, based on the daily reported confirmed cases of COVID-19, suggests that the nation-wide lockdown and its subsequent extensions and relaxations in restrictions have had little impact on the progress of the COVID-19 epidemic in India. There is also little empirical evidence to suggest that the relaxation in the restrictions under the third and the fourth phases of the nation-wide lockdown has resulted in a spike in the reported confirmed cases of COVID-19 in the country. The analysis also suggests that, if the trend in the reported confirmed cases of COVID-19 during the period 4 April through 21 May 2020 continues in the immediate future, the daily reported confirmed cases of COVID-19 are likely to increase to around 21 thousand by 15 June 2020, whereas the total number of confirmed cases of COVID-19 will increase to around 422 thousand. This trend can be changed or reversed by introducing appropriate interventions that may help in containing the spread of the disease. In this context, population-wide testing for COVID-19, along with isolation of positive cases and of their contacts, appears to be the need of the time.