## ABSTRACT

COVID-19 pandemic has underlined the impact of emergent pathogens as a major threat for human health. The development of quantitative approaches to advance comprehension of the current outbreak is urgently needed to tackle this severe disease. In this work, several mathematical models are proposed to represent COVID-19 dynamics in infected patients. Considering different starting times of infection, parameters sets that represent infectivity of COVID-19 are computed and compared with other viral infections that can also cause pandemics.

Based on the target cell model, COVID-19 infecting time between susceptible cells (mean of 30 days approximately) is much slower than those reported for Ebola (about 3 times slower) and influenza (60 times slower). The within-host reproductive number for COVID-19 is consistent to the values of influenza infection (1.7-5.35). The best model to fit the data was including immune responses, which suggest a slow cell response peaking between 5 to 10 days post onset of symptoms. The model with eclipse phase, time in a latent phase before becoming productively infected cells, was not supported. Interestingly, both, the target cell model and the model with immune responses, predict that virus may replicate very slowly in the first days after infection, and it could be below detection levels during the first 4 days post infection.

A quantitative comprehension of COVID-19 dynamics and the estimation of standard parameters of viral infections is the key contribution of this pioneering work.

## INTRODUCTION

Epidemics by infectious pathogens are a major threat to humankind. The year 2020 has uncovered one of the biggest pandemics in history, the novel coronavirus (COVID-19) that was first reported in Wuhan, Hubei Province, China in December 2019. Thus far, about 267013 confirmed cases and about 11201 deaths were reported worldwide [1]. While China has made a large effort to shrink the outbreak, COVID-19 has developed into a pandemic in 185 countries. Case numbers are alarming as the virus spreads in Europe, Iran, South Korea, and Japan. In fact, the pandemic epicentre changed to Europe on 13 of March 2020.

Coronaviruses can be found in different species of animals (*e*.*g*. bats and camels) and can evolve and infect humans by droplets from coughing or sneezing. Previous outbreaks to COVID-2019 were the Severe Acute Respiratory Syndrome (SARS-CoV), reported in Asia in February 2003 resulting in 8422 cases with a case-fatality rate of 11% [1]. Later, in 2012, the Middle East respiratory syndrome (MERS-CoV) was identified in Saudi Arabia and infected 2506 people, killing 862 between 2012 and 2020 [1]. Metagenomics studies previous to the COVID-19 outbreak envisaged the possibility of future threats due to the identification of several sequences closely related SARS-like viruses circulating in the Chinese bat populations [2, 3].

Unfortunately, no vaccine or antiviral drug is likely to be available soon. In fact, either monoclonal antibody or vaccine approaches have failed to neutralize and protect from coronavirus infections [3]. Therefore, individual behaviour (*e*.*g* early self-isolation and social distancing) as well as preventive measures such as hand washing, covering when coughing are critical to control the spread of COVID-19 [4]. Additionally to these measures, several travel restrictions and quarantines have taken place in many countries around the globe.

Epidemiological mathematical models have been developed to help policy makers to take the right decisions [4]. These have highlighted that social distancing interventions to mitigate the epidemic is a key aspect. There are many epidemiological unknowns with 2019-nCoV [4]. The case fatality rate for COVID-19 is about 0 3–1% [1]. However, adjusted estimation by [5] indicates that COVID-19 mortality rate could be as high as 20% in Wuhan. In its early stages, the epidemic have doubled in size every 7.4 days [6]. Moreover, the basic reproductive number was estimated to be 2.2 (95% CI, 1.4 to 3.9) [6]. Based on the relative long incubation period for COVID-19, about 5–6 days [1], Anderson *et al*. [4] suggested that might be considerable pre-symptomatic infectiousness.

While there are many mathematical models developed at epidemiological level for COVID-19, there is none so far at within-host level to understand COVID-19 replication cycle (Fig.1) and its interactions with the immune system. Among several approaches, the target cell model has served to represent several diseases such as HIV [7–10], Hepatitis virus [11, 12], Ebola [13, 14], influenza [15–18], among many others. A detailed reference for viral modelling can be found in [19]. Very recent data from infected patients with COVID-19 has enlighten the within-host viral dynamics. Zou *et al*. [20] presented the viral load in nasal and throat swabs of 17 symptomatic patients. Interestingly, COVID-19 replication cycles may last longer than flu, about 10 days or more after the incubation period [4, 20]. Here, we contribute to the mathematical study of COVID-19 dynamics at within-host level based on data presented by Wolfel *et al*. [21].

## RESULTS

Using ordinary differential equations (ODEs), different mathematical models are presented to adjust the viral kinetics reported by Woelfel *et al*. [21] in infected patients with COVID-19. Viral load [21] was sampled from throat swab cultures and measured in *Copies/mL, g Swab*, at Log10 scale. To dissect the COVID-19 dynamics observed in infected patients, mathematical models are employed as both a quantitative recapitulation of experimental data and as a tool to prioritize mechanisms on the basis of mathematical models and the Corrected Akaike Information Criterion (AICc) for model selection. The cost function (14) is minimized to adjust the model parameters based on the Differential Evolution (DE) algorithm [22].

### Exponential Growth and Logarithmic Decay Model

Based on the experimental data [21], the viral dynamic is divided into two parts, exponential growth (*V*_{g}) and decay (*V*_{d}) modelled by equations (1) and (2), respectively.

Viral growth is assumed to start at the onset of symptoms, with initial viral concentration *V*_{g}(0). The parameter *ρ* is the growth rate of the virus. The parameter *η* quantifies the decay rate of the virus, while *V*_{d}(0) the initial value of the virus in decay phase. Note that the growth phase of the virus was measured only in two patients (A, and B) [21].

Simulation are shown in Fig.2 and numerical results are presented in Table 1. The mean growth rate (*ρ*) is estimated as 3.98 (1/day) while the initial condition estimate is approximately 0.31 (Copies/mL). The mean decay rate of the virus (*η*) is around 0.95 (1/day), with the slowest rate estimate of 0.63 (1/day) presented for patients B, E, and F. The fastest decay rate was presented in the patient I with an estimate of 2.51 (1/day). This slow decay rate may explain the long duration of the virus (11-22 days) observed in the patients after the onset of symptoms [21].

### Target Cell Model

The mathematical model used here to represent coronavirus dynamics is based on the target cell-limited model [19, 23, 24]. Coronavirus can replicate in a variety of cell types, including epithelial cells. The coronavirus infection model is as follows:

Host cells can be in one of following states: susceptible (*U*) and infected (*I*). Viral particles (*V*) infect susceptible cells with a rate *β* ((Copies/mL)^{−1} day^{−1}). Once cells are productively infected, they release virus at a rate *p* (Copies/mL day^{−1} cell^{−1}) and virus particles are cleared with rate *c* (day^{−1}). Infected cells are cleared at rate *δ* (day^{−1}) as consequence of cytopathic viral effects and immune responses.

Coronaviruses infect mainly in differentiated respiratory epithelial cells [25]. Previous mathematical model for influenza [17] have considered about 10^{7} initial target cells (*U* (0)). Initial values for infected cells (*I*(0)) are taken as zero. *V* (0) is determined from estimations in Table 1. Note that *V* (0) cannot be measured as it is below detectable levels (about 100 *Copies/m*) [21].

Viral kinetics are measured after the on-set of symptoms [21], however, it is unknown when the initial infection took place. Patients infected with MERS-CoV in [26] showed that the virus peaked during the second week of illness, which indicated that the median incubation period was 7 days (range, 2 to 14) [26]. For parameter fitting purposes, we explore three different scenarios of initial infection day (*t*_{i}), that is, −14, −7, −3 days before the onset of symptoms for patients A and B, see Fig. 3.

Infectivity can be defined as the ability of a pathogen to establish an infection [27]. To quantify infectivity, the within-host reproductive number (*R*_{0}) was computed. *R*_{0} is defined as the expected number of secondary infections produced by an infected cell [28]. When *R*_{0} *<* 1, one infected individual can infect less than one individual. Thus, the infection would be cleared from the population.

Otherwise, if *R*_{0} *>* 1, the pathogen is able to invade the target cell population. This epidemiological concept has been applied to the target cell model (3)-(5), with

Previous studies [13, 29, 30] provided estimates of the *infecting time* (*t*_{inf}), that represents the time required for a single infectious cell to infect one more cell. Viruses with a shorter infecting time have a higher infectivity [29, 30]. From equations (3)-(5), *t*_{inf} can be explicitly computed as:

Assuming day of infection at day 0 post symptom onset (pso) would result in very high reproductive numbers (*R*_{0}) and a high infection rate (*β*) for patients A and B as presented in Table 2. Alternatively, assuming the initial day of infection is either day −14 or −7 pso, then the rate of infection of susceptible cells (*β*) would be slow but associated with a high replication rate (*p*).

Strikingly, Fig. 3(b) reveals a long period (about 4 days post infection) of viral replication below detectable levels. Independently of the starting infection time (*t*_{i}), numerical results at the Table 2 reveal very consistent reproductive numbers for patients A and B (approximately 11),implying that COVID-19 would invade most of the susceptible target cells. Remarkably, the infecting time *t*_{inf} is slow, about 30 hours. This may explain why COVID-19 can last several days (12-22 days pso) in infected patients [21].

### Target Cell Model with Eclipse Phase

To represent the time frame of the infection more adequately, an additional state is added where newly infected cells spend time in a latent phase (*E*) before becoming productively infected cells (*I*) [29, 31]. This can be written as follows:

Cells in the eclipse phase (*E*) can become productively infected at rate *k*. Holder *et al*. [29] considered different time distributions for the eclipse phase and viral release by infected cells for influenza. Their results showed that the time distribution of the eclipse phase and viral release directly affect the parameter estimation. For COVID-19, Fig.4 the eclipse phase model (AIC*≈*34) does not improve the fitting respect to the target cell model (Table 2) even when very long eclipse phase periods are assumed (*e*.*g* 100 days), implying that this mechanism could be negligible on COVID-19 infection.

### Mathematical Model with Immune Response

Previous studies have acknowledged the relevance of the immune T-cell response to clear influenza [17, 32–36]. Due to identifiability limitations for the estimation of the parameters of the target cell model using viral load data, a minimalistic model was derived in [37, 38] to represent the interaction between the viral and immune response dynamics. The model assumes that the virus (*V*) level induces the proliferation of T cells (*T*) as follows:

Viral replication is modelled with a logistic function with maximum carrying capacity *K* and replication rate *p*. The virus is cleared at a rate *c*. The term *c*_{T} *V T* represents the rate of killing of infected cells by the immune response. T cell homoeostasis is represented by *s*_{T} = *δ*_{T} *T* (0), where *T* (0) is the initial number of T cells and *δ*_{T} is the half life of T cells. The steady state condition must be satisfied to guarantee the T cell homeostatic value *T* (0) = *s*_{T} */δ*_{T} in the absence of viral infection.

*K* is the maximum viral load for each of the patients in [21]. The half life of T cells is approximately 4-34 days [39], therefore we take *δ*_{T} = 2.9 *×* 10^{−2}. T cells can proliferate at a rate *r*, and we assumed that the activation of T cell proliferation by *V* follows a log-sigmoidal form with half saturation constant *k*_{T}. The coefficient *m* relates to the width of the sigmoidal function. While different values of *m* were tested, *m* = 2 rendered a better fit.

Fig.5 show results of parameter fitting for three different scenarios assuming the initiation of the infection (*t*_{i}) was at −14, −7, and 0 dpso. Panels (a) and (c) of Fig.5 shows that the model (12)-(13) gives a better fitting than previous models (Fig.2-4). Furthermore, AICs values for patient A and B highlight that *t*_{i} = −15 dpso give the best fitting. For presentation purposes, numerical results for patient A and B are the only portrayed in Fig.4. The summary of fitting procedures at *t*_{i} = −15 dpso is presented in Table 3. Independently of the starting infection day, the immune response by T cells peaks between 5 to 10 dpso. Interestingly, the longer the period between infection time to the onset of symptoms, the higher the immune response.

## DISCUSSION

The novel coronavirus (COVID-19) first reported in Wuhan in December 2019 has paralysed our societies, leading to self isolation and quarantine for several days. Indeed, COVID-19 is a major threat to humans, with alarming levels of spread and death tolls, in particular on the eldery. The WHO situation report published on 21 March 2020 reported 267013 confirmed cases and 11201 deaths [1]. COVID-19 is the first pandemic after the H1N1 “swine flu” in 2009 [1]. While many mathematical models have concentrated on the epidemiological level predicting how COVID-19 would spread, this paper aims to model COVID-19 dynamics at the within-host level to quantitative COVID-19 infection kinetics in humans.

Data from [26] showed that MERS-CoV levels peak during the second week with a median value of 7.21 (log10 copies/mL)in the severe patient group, and about 5.54 (log10 copies/mL) in the mild group. For SARS, the virus peaked at 5.7 (log10 copies/mL) between 7 to 10 days after onset [40]. For COVID-19, the viral peak was approximately 8.85 (log10 copies/mL) before 5 dpso [21]. Liu *et al*. [41] found that patients with severe disease reported a mean viral load on admission 60 times higher than that of the mean of mild disease cases, implying that higher viral loads relate clinical outcomes. Additionally, higher viral load persisted for 12 days after onset [41].

Using the target cell model, Nguyen *et al*. [13] computed for Ebola infection an average infecting time of 9.49 hours, while Holder *et al*. [29] reported that infecting time for the wild-type (WT) pandemic H1N1 influenza virus was approximately 0.5 hours [29]. Here, based on the results of the target cell model in Table 2, we found that COVID-19 infecting time between cells (mean of 30 days approximately) would be slower than those reported for Ebola (about 3 times slower) and influenza (60 times slower). The reproductive number for influenza in mice ranges from 1.7 to 5.35 [42], which is consistent with the values reported for COVID-19.

Interestingly, both of our models (the target cell model (3)-(5) and the model with immune response (12)-(13)) when fitted to the patient A data, predict that the virus can replicate below detection levels for the first 4 dpi. This could be an explanation of why infected patients with COVID-19 would take from 2-14 dpi to exhibit symptoms.

The model with immune system (Fig.4(b and d)) highlights that the T cell response is slowly mounted against COVID-19 [4]. Thus, the slow T cell response may promote a limit inflammation levels [42], which might be a reason to the observations during COVID-19 pandemic of the detrimental outcome on French patients that used non-steroidal anti-inflammatory drugs (NADs) such as ibuprofen. However, so far, there is not any conclusive clinical evidence on the adverse effects by NADs on COVID-19 infected patients.

The humoral response against COVID-19 is urgently needed to evaluate the protection to reinfections. A longitudinal study in rhesus monkeys by Bao *et al*. [43] uncovered that infected monkeys presented viral replication at 7 days post-infection (dpi). Significant increase of specific IgG were detected at 14, 21 or 28 dpi. Infected monkeys were re-challenged after specific antibody tested positively and symptoms vanished. Monkeys with re-exposure presented no recurrence of COVID-19, highlighting that protection can be presented to subsequent exposures. Regarding antiviral drugs, Remdesivir treatment has shown a good prophylactic effect during the first 24 hours post MERS-CoV infection in a non-human primate model [44]. Furthermore, benefits has been reported for therapeutic treatment if provided during 12 hours MERS-CoV infection [44]. Our study here mainly addressed T cell responses, therefore, future modelling attempts should be directed to establish a more detailed model of antibody production and cross-reaction [45] as well as *in silico* testing of different antivirals [46].

There are technical limitations in this study that need to be highlighted. The data for COVID-19 kinetics in [21] is at the onset of symptoms. This is a key aspect that can render biased parameter estimation as the target cell regularly is assumed to initiate at the day of the infection. In fact, we could miss viral dynamics at the onset of symptoms. For example, from throat samples in Rhesus macaques infected with COVID-19, two peaks were reported on most animals at 1 and 5 dpi [47].

In a more technical aspect using only viral load on the target cell model to estimate parameters may lead to identifiability problems [48–51]. Thus, our parameter values should be taken with caution when parameters quantifications are interpreted to address within-host mechanisms. For the model with immune system, there is not data confrontation with immune response predictions, thus, new measurements on cytokines and T cell responses would uncover new information.

The race to develop the first vaccine to tackle COVID-19 has started with the first clinical trial just 60 days after the genetic sequence of the virus. Modelling work developed in this paper paves the way for future mathematical models of COVID-19 to reveal prophylactic and therapeutic interventions at multi-scale levels [52–57]. Further insights into immunology and pathogenesis of COVID-19 will help to improve the outcome of this and future pandemics.

## MATERIAL AND METHODS

### Mathematical models

Mathematical models based on Ordinary Differential Equations (ODEs) are solved using the MATLAB library *ode*45, which is considered for solving non-stiff differential equations [58].

### Viral Kinetic Data of Patients Infected with COVID-19

The clinical data of 9 individuals is from [21]. Due to close contact with index cases and initial diagnostic test before admission, patients were hospitalized in Munich [21]. Viral load kinetics were reported in copies/ml per whole swab for 9 individual cases. All samples were taken about 2 to 4 days post symptoms. Further details can be found in [21].

### Parameter Estimation

Due to the viral load is measured in Log10 scales, parameter fitting is performed minimizing the root mean square (RMS) difference on Log10 scales between the model predictive output , and the experimental measurement (*y*_{i}):
where *n* is the number of measurements. The minimization of RMS is performed using the Differential Evolution (DE) algorithm [22]. Note that several optimization solvers were considered, including both deterministic (*fmincon* Matlab routine) and stochastic (*e*.*g* Genetic and Annealing algorithm) methods. Simulation results revealed that the DE global optimization algorithm is robust to initial guesses of parameters than other mentioned methods.

### Model Selection by AIC

The Akaike information criterion (AIC) is used here to compare the goodness-of-fit for models that evaluate different hypotheses [59]. A lower AIC value means that a given model describes the data better than other models with higher AIC values. Small differences in AIC scores (*e*.*g. <*2) are not significant [59]. When a small number of data points, the corrected (AICc) writes as follows:
where *N* is the number of data points, *M* is the number of unknown parameters and *RSS* is the residual sum of squares obtained from the fitting routine.

## Data Availability

The origin of the data is clearly stated in the manuscript.

## Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

## Author Contributions

EAHV envisaged the project and performed the simulations. All the authors discussed and wrote the paper.

## Acknowledgements

This research was funded by the Universidad Nacional Autonoma de Mexico (UNAM), CONACYT, and the Alfons und Gertrud Kassel-Stiftung.