Prediction of 2019-nCov in Italy based on PSO and inversion analysis

Although China achieved early success in controlling the novel coronavirus (2019-nCov), the situation overseas is overwhelmingly negative, especially in Italy. By March 11, 2020, 2019-nCov had broken out across Italy with over 10,000 confirmed cases, notwithstanding the gradual lockdown of the country since March 9, 2020. Estimating the possible infected population and offering forward-looking suggestions for handling the spread based on existing data are of crucial importance. Considering the biological parameters obtained from Chinese clinical data in Wuhan, other scholars' work and the real spread features of 2019-nCov in Italy, we built a more applicable model called SEIJR with log-normal distributed time delay to forecast the trend of spreading. Adopting Particle Swarm Optimization (PSO), we estimated the early-period average spreading velocity (β0) and conducted inversion analysis of the time point (T0) when the virus first hit Italy. With β0 and T0 fixed, we then obtained the average spreading velocity β1 after the lockdown by PSO. To offer timely advice, we generated prediction trends with different β, which we consider helpful in addressing the infection. Our research not only solves the complex, nondifferentiable equation of the epidemic model but also performs well in PSO-based inversion analysis, which conveys informative outcomes for further discussion of precautionary action. To conclude, the first day of spread is around February 1, 2020, with an early-period average spreading velocity β0 = 0.330, which is higher than in most cities in China except Wuhan. After the country was locked down and great attention was paid to public precaution, β1 dropped sharply to 0.278, indicating the effectiveness of these measures. Furthermore, in order to bring the disease under control before mid-April, it is necessary to take actions that keep β under 0.25. Code can be freely downloaded from https://github.com/Summerwork/2019-nCov-Prediction.

Republic of China, 2020). As the first country faced with the outbreak of this severe disease, China took strict but effective action to contain the spread of 2019-nCov and has attained apparent success so far [21]. Many related works on prediction and precaution have been done by constructing proper models and analyzing parameters [21] [20] [19].

Elsewhere in the world, the menacing disease has just begun to spread [7], especially in countries with no preparation and no proven measures for suppressing a possible large-scale infection. In this article, we take Italy, which is now experiencing a severe 2019-nCov situation, as an example and conduct an analysis with the aim of offering usable suggestions. In the first period (before March 9, 2020), we attempt to invert the timeline of the virus spread in Italy and the early-period average spreading velocity by adopting PSO to optimize the parameters based on our SEIJR model and existing data (European Centre for Disease Prevention and Control). In the second period (using only data from March 9-16, 2020), we optimize the average spreading velocity in order to show the effect of the country blockade. In the last period of our research, we demonstrate possible trends of confirmed cases with various average spreading velocities, which are of vital importance in controlling the disease. The whole flow chart is illustrated in Figure 1.

Figure 1. Flow chart

We collected the daily reported confirmed cases from the website of the European Centre for Disease Prevention and Control (European CDC). All these data are public.

Based on the needs of our analysis, we preprocessed the data by summing the daily reported confirmed cases to obtain the cumulative count used for the subsequent inversion of parameters and prediction. Details of the processed data used in this article are given in Table 1.
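The preprocessing step described above amounts to a running sum. A minimal Python sketch, assuming the daily counts are available as a plain list; the function name `to_cumulative` and the sample counts are hypothetical and are not taken from the European CDC data:

```python
from itertools import accumulate

def to_cumulative(daily_cases):
    """Turn daily reported confirmed cases into the cumulative series."""
    return list(accumulate(daily_cases))

# Hypothetical daily counts, for illustration only (not real data).
daily = [2, 0, 5, 3]
print(to_cumulative(daily))
```

The cumulative series is the quantity that a model curve for confirmed cases would be compared against.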
The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license. This version posted May 11, 2020. medRxiv preprint doi: https://doi.org/10.1101/2020.05.08.20095869

Disease transmission is a complex process with multiple variables and uncertainties, which makes it impossible to solve and predict exactly [16], [11] [6]. However, models are feasible for forecasting infectious diseases when characteristic parameters [18] [17] such as transmission mode, immunization mode, mortality and average spreading velocity are available. Classical models for infectious diseases include the SIR model [2] and the SEIR model. Considering the actual situation in Italy and the transmission characteristics of 2019-nCov obtained in China, this paper builds the SEIJR model with log-normal distributed time-delay terms [15] [3] based on the SEIR model [14] [12]. Figure 2 illustrates the SEIJR model. The model assumes that the population consists of six types (cumulative values): susceptible population S, exposed population E, infectious population I, confirmed population J, recovered population R and dead population D. The rate parameters are the average spreading velocity β, the diagnosis rate, two death rates (one for I and one for J) and the cure rate. There are three time delays: delay 1 is the incubation period [5] needed for E to become I, delay 2 is the waiting period needed for I to become J, and delay 3 is the duration of hospitalization [5] needed for J to become R.

In the model, only E and I are able to infect others, through contact infection that takes negligible time. E is indeed infected but shows no symptoms of 2019-nCov, and transfers to I after the incubation period (delay 1). I has symptoms such as fever, cough and shortness of breath. Because the early symptoms are not obvious [5] and medical facilities in Italy are unevenly distributed, I is confirmed as J only after a waiting period (delay 2). Given the seriousness and infectivity of the virus, we assume that once I becomes J, the patient is immediately isolated and can no longer infect others. Treatment starts immediately after confirmation. J recovers and becomes R after the duration of hospitalization (delay 3). Because 2019-nCov has a certain lethality, some further assumptions are needed:

The number of deaths during the period of E is too small to be considered; only the mortality during the periods of I and J needs to be considered.

Data for the period of I are unavailable. We assume that the transitions from I to J and from I to D have the same delay (delay 2), so the only difference between them is the proportion.

Official organizations and medical institutions give no information on how long patients in the treatment stage take to die, but clinical information exists on how long a confirmed patient needs to be cured [5]. In the same way as for I, the transitions from J to R and from J to D have the same delay (delay 3) and differ only in proportion.

People in I are diagnosed and receive treatment at the diagnosis rate; otherwise they die, so the death rate for I is one minus the diagnosis rate. The recovery rate for J is the cure rate, and the death rate for J is one minus the cure rate. The total mortality proportion is the sum of the two death rates.

After recovering, people in R do not go out because they are in a frail state, and are treated as isolated.

Combining the basic principles of epidemiology and etiology, the time from infection to diagnosis approximately follows the log-normal distribution [8]. The same assumption is applied to delay 1 (the incubation period), giving the corresponding log-normal density function.

For delay 2 (the waiting period), data in Italy are still unavailable, but a relevant distribution function is given in an earlier article. We regard the time reported there as ending at the point when a patient enters J, which means the time given in that article corresponds to delay 2 in this article. In the same way, a log-normal density function is obtained for delay 2.

For delay 3 (the duration of hospitalization), Italian official organizations and medical institutions still lack clinical information and official statistics, so we also assume that it follows the log-normal distribution. Zhong NS et al. gave some relevant data: the median duration of hospitalization is 12 days, and the quartiles are 10 and 14 days [5]. Adopting the same method used to estimate delay 1, we obtain the distribution density function for delay 3.

Each time-delay term of the model combines the delayed quantity with the distribution density mentioned before.
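To make the quartile-based estimate concrete, the following Python sketch fits a log-normal distribution to the hospitalization figures quoted above (median 12 days, quartiles 10 and 14 days) and turns it into a daily delay kernel. The fitting rule (mu from the median, sigma from the interquartile range on the log scale), the discrete convolution, and all function names are assumptions made for illustration; the paper's own estimation method and parameter values may differ.

```python
import math
from statistics import NormalDist

def fit_lognormal(median, q1, q3):
    """Return (mu, sigma) of a log-normal from its median and quartiles.

    Assumed rule: mu = ln(median); sigma comes from the interquartile
    range measured on the log scale.
    """
    z75 = NormalDist().inv_cdf(0.75)  # about 0.6745
    mu = math.log(median)
    sigma = (math.log(q3) - math.log(q1)) / (2.0 * z75)
    return mu, sigma

def lognormal_pdf(x, mu, sigma):
    """Log-normal probability density at x (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def lognormal_cdf(x, mu, sigma):
    """Log-normal cumulative distribution at x (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    return NormalDist(mu, sigma).cdf(math.log(x))

def daily_kernel(mu, sigma, n_days):
    """Probability that the delay falls in [k, k + 1) days, k = 0..n_days-1."""
    return [lognormal_cdf(k + 1, mu, sigma) - lognormal_cdf(k, mu, sigma)
            for k in range(n_days)]

def delayed_outflow(inflow, kernel):
    """Discrete analogue of a time-delay term: people who entered on
    day t - k leave on day t with weight kernel[k]."""
    return [sum(inflow[t - k] * kernel[k] for k in range(min(t + 1, len(kernel))))
            for t in range(len(inflow))]

# Hospitalization delay from the quoted median and quartiles.
mu3, sigma3 = fit_lognormal(median=12, q1=10, q3=14)
kernel3 = daily_kernel(mu3, sigma3, n_days=60)
print(round(mu3, 3), round(sigma3, 3))
```

Because 10 and 14 are not symmetric around 12 on the log scale, this rule reproduces the median exactly and the quartiles only approximately.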

Here we finally get the precise differential equation of the SEIJR model, Eq. (6).

3 Methods

Eq. (6), which corresponds to the SEIJR model, contains integrals whose limits depend on the independent variable because of the time-delay terms, so it has no analytical solution. In this case numerical methods are useful. We combine iteration with the fourth-order Runge-Kutta method to generate the numerical solution from the initial condition. The main principle of the fourth-order Runge-Kutta method [8] is as follows: for a differential equation of the form y' = f(t, y) with initial value y(t0) = y0, each step of size h advances the solution by a weighted average of four slope estimates, evaluated at the start of the step, twice at the midpoint and at the end of the step, with weights 1/6, 1/3, 1/3 and 1/6.
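The fourth-order Runge-Kutta step can be sketched in a few lines of Python. This is a generic scalar version applied to a toy equation, not the SEIJR system; handling the time-delay integrals would need extra bookkeeping of past values that is omitted here. The names `rk4_step` and `rk4_solve` are illustrative.

```python
import math

def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one step of size h with classical RK4."""
    k1 = f(t, y)
    k2 = f(t + h / 2.0, y + h / 2.0 * k1)
    k3 = f(t + h / 2.0, y + h / 2.0 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def rk4_solve(f, t0, y0, h, n_steps):
    """Return the y values at t0, t0 + h, ..., t0 + n_steps * h."""
    t, y = t0, y0
    ys = [y]
    for _ in range(n_steps):
        y = rk4_step(f, t, y, h)
        t += h
        ys.append(y)
    return ys

# Toy problem (not the SEIJR system): y' = y, y(0) = 1, exact solution e**t.
ys = rk4_solve(lambda t, y: y, 0.0, 1.0, 0.01, 100)
print(ys[-1], math.e)
```

The printed pair lets the numerical value at t = 1 be compared with the exact value e.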

Usually the descriptive function of an epidemic model is derivative-based and either ignores time delay or assumes a fixed linear time delay, as in the SIR model. In this article, the SEIJR model is solved by combining iteration with the fourth-order Runge-Kutta method mentioned before. Accordingly, a least-squares (LSE) function is defined for each period: one LSE for the first period and one for the second period. These functions are nondifferentiable [13], so traditional derivative-based methods are infeasible for minimization, while approaches such as the annealing algorithm [10] are more computationally expensive in searching for parameters than particle swarm optimization (PSO).

[...] information from the environment. In the PSO algorithm, the velocity of each individual is dynamically adjusted according to its previous flying experience. The algorithm consists of three main parts: individual best, global best, and individual optimization based on the best particle of the whole population [9]. In this article, the first-period and second-period LSE serve as the fitness functions for the first and second periods. The main process of the algorithm and its basic parameters are shown in Figure 3 and Table 2, respectively.

In the first period, we adopt PSO with the first-period LSE as the fitness function to find the optimal T0 and β0. As shown in Figure 4, the best result is T0 = 21 (after rounding) and β0 = 0.33. After obtaining the optimal T0 and β0, we draw the prediction curves for further comparison in the following section.

In the second period, with T0 and β0 fixed, PSO with the second-period LSE gives β1 = 0.278, down from β0. Figure 6 shows the population predicted with β1 from March 9 to 25, 2020.

Figure 7 gives prediction curves with different β for further discussion of how to control the disease before mid-April.
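The PSO search used in the Methods can be illustrated with a basic global-best variant in Python. This is a sketch under stated assumptions: the swarm size, iteration count, inertia and acceleration coefficients are placeholder values (the settings actually used are those of Table 2), and the fitness function is a smooth toy least-squares problem on synthetic exponential data rather than the SEIJR-based LSE. All names are hypothetical.

```python
import math
import random

def pso_minimize(fitness, bounds, n_particles=30, n_iter=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize fitness(x) over box bounds [(lo, hi), ...] with basic PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.1 * rng.uniform(-(hi - lo), hi - lo) for lo, hi in bounds]
           for _ in range(n_particles)]
    pbest_pos = [p[:] for p in pos]               # individual best positions
    pbest_val = [fitness(p) for p in pos]         # individual best values
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest_pos, gbest_val = pbest_pos[g][:], pbest_val[g]  # global best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity mixes momentum, pull to own best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (pbest_pos[i][d] - pos[i][d])
                             + c_global * r2 * (gbest_pos[d] - pos[i][d]))
                # Move and keep the particle inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest_pos[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val

# Toy inversion problem on synthetic data (not the Italian series):
# cumulative cases grow as exp(beta * (day + t0)).
TRUE_T0, TRUE_BETA = 10.0, 0.3
toy_data = [math.exp(TRUE_BETA * (day + TRUE_T0)) for day in range(15)]

def toy_lse(params):
    """Least-squares error between log-model and log-data."""
    t0, beta = params
    return sum((beta * (day + t0) - math.log(obs)) ** 2
               for day, obs in enumerate(toy_data))

best, cost = pso_minimize(toy_lse, bounds=[(0.0, 40.0), (0.05, 1.0)])
print(best, cost)
```

Because PSO only evaluates the fitness function and never differentiates it, the same loop applies unchanged when the fitness is computed by a numerical solver, which is the situation described above.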

It is clearly seen that the fitted curve of J in the SEIJR model agrees well with the confirmed-case data of the Italian official statistics. T0 = 21 represents the initial value of the model, which indicates that the first I appeared 21 days before February 22, that is, on February 1. According to information reported by the Italian government, the first case in Italy, that is, when J first appeared, was on January 31. Two patients from Wuhan, China arrived in Italy by air on January 23 and visited other cities in Italy. They finally arrived in Rome, felt physical discomfort on January 30, and were diagnosed and then isolated on January 31. It is reasonable to speculate that they had contracted 2019-nCov in Wuhan before arriving in Italy on January 23, and continued to spread it in Italy for eight days from January 23 until January 31. The first case of I in the official data thus appeared on January 23, while the model places it on February 1.

Applying PSO to our SEIJR model with log-normal distributed time delay, we obtained a convincing start time (around February 1, 2020) of 2019-nCov and the average spreading velocity (β0 = 0.330) at the early stage. Comparing the average spreading velocity in the early period and the following period, we found a conspicuous decrease attributable to the effective measures. Based on the prediction interval of the possible infected population for different β, we strongly recommend that Italy keep β under 0.25 if the situation is to take a turn for the better, or even end, before mid-April.