Fractional SIR Epidemiological Models
=====================================

* Amirhossein Taghvaei
* Tryphon T. Georgiou
* Larry Norton
* Allen Tannenbaum

## Abstract

The purpose of this work is to make a case for epidemiological models with fractional exponents in the contribution of sub-populations to the transmission rate. More specifically, we question the standard assumption in the literature on epidemiological models, where the transmission rate dictating propagation of infections is taken to be proportional to the product of the infected and susceptible sub-populations; a model that relies on strong mixing between the two groups and widespread contact between members of the groups. We contend that contact between infected and susceptible individuals, especially during the early phases of an epidemic, takes place over a (possibly diffused) boundary between the respective sub-populations. As a result, the rate of transmission depends on the product of fractional powers instead. The intuition relies on the fact that infection grows in geographically concentrated cells, in contrast to the standard product model that relies on complete mixing of the susceptible and infected sub-populations. We validate the hypothesis of fractional exponents i) by numerical simulation of disease propagation in graphs imposing a local structure on allowed disease transmissions and ii) by fitting the model to a COVID-19 data set provided by Johns Hopkins University (JHU CSSE) for the period Jan-31-20 to Mar-24-20, for the countries of Italy, Germany, Iran, and France.

Keywords

* SIR epidemiological model

## I. Introduction

The classical SIR (Susceptible, Infectious, Recovered) model of infectious disease dynamics, and all subsequent multi-compartmental derivative models, are based on a model for the transmission rate that is taken universally in the form

$$r(t) = \beta\, I(t)\, S(t) \tag{1}$$

where *I*(*t*), *S*(*t*) represent the sizes of the infected and susceptible sub-populations; the proportionality factor *β* is typically determined on a case-by-case basis. Thus, if *R*(*t*) represents the size of the recovered population, assuming that all individuals undergo full recovery and thereby the total population *S*(*t*) + *I*(*t*) + *R*(*t*) remains constant, the most basic model for transmissions is in the form of the following system of equations, known as the ***SIR model***,

$$\begin{aligned}
\frac{dS(t)}{dt} &= -\beta\, I(t)\, S(t) + \eta\, R(t),\\
\frac{dI(t)}{dt} &= \beta\, I(t)\, S(t) - \alpha\, I(t),\\
\frac{dR(t)}{dt} &= \alpha\, I(t) - \eta\, R(t),
\end{aligned}$$

where *α* is the recovery rate (with time constant of recovery *τ* := 1/*α*) and *η* is a parameter regulating the rate at which immunity is lost over time. See [6] for all the details about the SIR model together with an extensive list of references. Multi-compartmental models that include infected but asymptomatic individuals, deceased, etc., as well as a flux from the recovered to the susceptible sub-population as immunity wears out, have also been considered. However, throughout, the basic feedback that drives the infection, *r*(*t*), is invariably as in (1).

In departure from this well-studied SIR paradigm, we propose a *fractional SIR (**fSIR**) model* with rate

$$r(t) = \beta\, I(t)^{\gamma}\, S(t)^{\kappa} \tag{2}$$

where one or, possibly, both sub-populations are scaled by exponents that are typically less than 1. The justification for such a model stems from the fact that, at least during the initial phase of an epidemic, infection propagates outwards from infected cells to the general population. In such a scenario, where for instance *S*(*t*) ≫ *I*(*t*), the boundary of infected cells, which would roughly account for most new infections, scales as a fractional power *γ* < 1 of the area of the cells, hence $I(t)^{\gamma}$.
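As a rough illustration of this scaling argument, suppose (hypothetically) that an infected cell is an idealized disk of radius *ρ* whose area *A* is proportional to *I*(*t*). The symbols *ρ*, *A* and *P* (boundary length) are introduced here only for this sketch. The boundary length then scales as the square root of the area:

```latex
A = \pi \rho^{2}, \qquad
P = 2\pi\rho = 2\sqrt{\pi}\, A^{1/2}
\quad\Longrightarrow\quad
P \propto I(t)^{1/2}.
```

Under this idealization the exponent of the infected sub-population would be *γ* = 1/2.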
In actuality, due to the diffusive nature of infection propagation amongst the general population, the exponent is expected to be larger than 1/2, as it would be in the continuous limit when the boundary is a smooth curve. Moreover, at least in the early phases of an epidemic, the exponent of *S*(*t*), a sub-population significantly larger than *I*(*t*), may turn out to be negligible.

The idea of using fractional exponents in growth models is motivated by the Norton-Simons-Massagué (NSM) model, a growth model of the form

$$\frac{dV(t)}{dt} = a\, V(t)^{\lambda} - b\, V(t) \tag{3}$$

with origins in the 1950s [2]. This type of model was designed to describe the growth of biological organisms employing certain energy principles. In the model, the parameters *a* and *b* quantify anabolism (growth) and catabolism (death), respectively. Equation (3) may be interpreted as asserting that the net growth rate of an organism results from the balance of synthetic and degradative mechanisms. While the rate of the former process follows a law of allometry (i.e., the rate is proportional to the volume *V*(*t*) via a power function), the rate of the latter process scales linearly with *V*(*t*). It is important to note that the two special cases of (3), (i) power law *b* = 0, and (ii) second-type growth *λ* = 2/3, have already been successfully applied to describe tumor growth [9], [4]. The general case, 0 < *λ* < 1, was introduced in [8] to explain the self-seeding hypothesis. Moreover, an important geometrical interpretation was provided in [7], [1]. In these works, the authors relate the exponent *λ* = *d*/3 to the fractional Hausdorff dimension of the proliferative tissue, where *d* denotes the fractal dimension of the tissue. Moreover, the model (3) has been derived mechanistically by linking tumor growth to metabolic rate and vascularization [5].
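The balance expressed by (3) can be explored numerically. The following minimal sketch assumes the form d*V*/d*t* = *aV*^*λ* − *bV* and integrates it with a forward-Euler step. The function name `simulate_nsm`, the step size, and the parameter values are illustrative choices made for this example and are not taken from the cited works.

```python
import numpy as np

def simulate_nsm(a, b, lam, v0, dt=0.05, steps=40_000):
    """Forward-Euler integration of dV/dt = a*V**lam - b*V (assumed form of (3))."""
    v = np.empty(steps + 1)
    v[0] = v0
    for k in range(steps):
        growth = a * v[k] ** lam   # anabolic term, allometric in V
        decay = b * v[k]           # catabolic term, linear in V
        v[k + 1] = v[k] + dt * (growth - decay)
    return v

# Illustrative parameters (hypothetical, chosen only for this sketch).
a, b, lam = 1.0, 0.1, 2.0 / 3.0
trajectory = simulate_nsm(a, b, lam, v0=1.0)

# For 0 < lam < 1 and b > 0, setting dV/dt = 0 gives the plateau
# V* = (a / b) ** (1 / (1 - lam)).
v_star = (a / b) ** (1.0 / (1.0 - lam))
print(trajectory[-1], v_star)
```

With a sufficiently long horizon the simulated volume settles at the plateau where the anabolic and catabolic terms cancel, which is the balance that the text describes.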
In a similar spirit to the Norton-Simons-Massagué (NSM) model, herein we recognize the geometric constraints imposed on disease propagation by the locality of transmissions around infectious cells. To this end, we seek to explain the origin of the fractional powers in (2) by i) numerical experiments, and ii) fitting such models to data sets. Specifically, with respect to i), we postulate a discrete model where infection propagates over the nodes of a network. The network, whose nodes represent individuals, is not planar (in general), yet it is immersed in ℝ². While (physically) neighboring nodes may be densely connected, precluding the graph from being planar, the likelihood of being connected as well as of infecting each other is regulated by a parameter that is a function of their physical distance. It is observed, universally, that such models lead to fractional exponents in the transmission rate, in agreement with (2). With regard to ii), we have also numerically studied data on the recent COVID-19 epidemic that is available at [3]. For this dataset, as we note in the results, the exponent *γ* in (2) ranges from about 0.6 to 0.8, which is similar to the empirically determined exponent of the NSM model.

## II. Models of discrete transmissions

To provide insight and justification for our hypothesis on the validity of equation (2), we develop a discrete model for direct transmissions between individuals, consisting of nodes (individuals) on a graph that captures contacts between them. In the present work, the graph is fixed, while in future work we plan to explore the possibility of time-varying links between nodes as well as the effect of control actions, such as social distancing, so as to study the effects of such mediation protocols.

### A. Model: probabilistic SIR on a graph

Consider a simple graph of size *n* with adjacency matrix *A*. The graph is used to model the spread of infection over a network of nodes representing individual people.
Every node can be in one of the three states {*S*, *I*, *R*}. We use $x_i(t)$ ∈ {*S*, *I*, *R*} to represent the state of node *i* at time *t*. Here, $x_i(t)$ evolves, as a Markov chain on $3^n$ states, according to the following transition probabilities at time *t* dictating transition at the node level,

![Formula][5]

Here, *β* is the infection rate, *α* is the recovery rate, and *η* is the susceptible rate (quantifying loss of immunity over time). The notation ![Graphic][6] stands for the number of neighbors of node *i* that are infected at time *t*, i.e.

![Formula][7]

where 𝟙{·} = 1 when {·} holds and is 0 otherwise. We consider two types of initialization:

1. a localized one, where a specified initial collection of neighboring nodes is infected while the rest of the nodes are susceptible, and
2. a randomized one, where a randomly distributed collection of nodes is initially infected.

Thus, in the latter, a large number of small initial cells of infected individuals are sprinkled randomly inside the general population, whereas in the former initialization, infection propagates outwards from a few initial cells.

#### 1) Two-dimensional grid-graph

We work out two rudimentary models where individuals (nodes) are placed on a 2-dimensional grid (vertex set)

![Formula][8]

We carry out experiments for two cases, where the edge set is defined by

![Formula][9]

with *d* ∈ {1, 2}. In either case, the infection model is simulated and the results are discussed in the section on experiments.

#### 2) Two-dimensional random graph

We postulate a distribution of nodes on ℝ² according to a Gaussian-mixture model (Figure 1). Each node is connected to its 4 nearest neighbors. Analogous conclusions are drawn and discussed in the section on experiments as well.

![Fig. 1](https://www.medrxiv.org/content/medrxiv/early/2020/04/30/2020.04.28.20083865/F1.medium.gif)

[Fig. 1:](http://medrxiv.org/content/early/2020/04/30/2020.04.28.20083865/F1) Random nodes on ℝ² following a Gaussian Mixture Model (GMM) distribution. Nodes are connected to their 4 nearest neighbors.

## III. Experiments

### A. Two-dimensional grid-graph

Simulation results of the infection spread model on the two-dimensional grid-graph of size 100 × 100, for *d* = 1, 2, are presented in Figure 2 and Figure 3, respectively. The populations of susceptible, infected, and recovered people are calculated as follows:

![Formula][10]

for each *ξ* ∈ {*S*, *I*, *R*}. The top panel in each figure depicts the number of susceptible, infected, and recovered individuals as a function of time.

![Fig. 2](https://www.medrxiv.org/content/medrxiv/early/2020/04/30/2020.04.28.20083865/F2.medium.gif)

[Fig. 2:](http://medrxiv.org/content/early/2020/04/30/2020.04.28.20083865/F2) Simulation result for spread of infection on a 2d-grid, with *d* = 1.

![Fig. 3](https://www.medrxiv.org/content/medrxiv/early/2020/04/30/2020.04.28.20083865/F3.medium.gif)

[Fig. 3:](http://medrxiv.org/content/early/2020/04/30/2020.04.28.20083865/F3) Simulation result for spread of infection on a 2d-grid, with *d* = 2.

The number of newly infected individuals at time *t*, denoted by Δ*I*(*t*), is depicted as a function of the number of infected people *I*(*t*) in the lower panel. To be precise, Δ*I*(*t*) is defined according to

![Formula][11]

Note that Δ*I*(*t*) only includes the newly infected people. In order to capture the relationship between Δ*I*(*t*) and *I*(*t*) and *S*(*t*), a parametric curve of the form $\Delta I(t) = c\, I(t)^{\gamma} S(t)^{\kappa}$ is fitted to the data points obtained from the simulation.
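A minimal sketch of such a fit, assuming ordinary least squares on the logarithms of the three series and using NumPy, is given below. It is an illustration rather than the authors' code: the function name `fit_power_law` and the synthetic test series are assumptions made for this example, and samples with a zero entry are dropped before taking logarithms.

```python
import numpy as np

def fit_power_law(delta_i, i, s):
    """Fit delta_i ~ c * i**gamma * s**kappa by least squares in log space.

    Returns (c, gamma, kappa). Samples where any series is non-positive are
    skipped, since their logarithm is undefined.
    """
    delta_i, i, s = (np.asarray(v, dtype=float) for v in (delta_i, i, s))
    keep = (delta_i > 0) & (i > 0) & (s > 0)
    # Design matrix for log(delta_i) = log(c) + gamma*log(i) + kappa*log(s).
    design = np.column_stack(
        [np.ones(keep.sum()), np.log(i[keep]), np.log(s[keep])]
    )
    coef, *_ = np.linalg.lstsq(design, np.log(delta_i[keep]), rcond=None)
    return float(np.exp(coef[0])), float(coef[1]), float(coef[2])

# Synthetic, noise-free series generated from known (hypothetical) parameters,
# used only to check that the fit recovers them.
i_series = np.arange(0.0, 200.0)          # includes a zero entry on purpose
s_series = 10_000.0 - 40.0 * i_series
true_c, true_gamma, true_kappa = 0.5, 0.7, 0.3
delta_series = true_c * i_series ** true_gamma * s_series ** true_kappa

c, gamma, kappa = fit_power_law(delta_series, i_series, s_series)
print(c, gamma, kappa)
```

On simulated or recorded data the three inputs would be the series Δ*I*(*t*), *I*(*t*) and *S*(*t*). Fitting in log space turns the power law into a linear regression, at the cost of weighting relative rather than absolute errors.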
The constant *c* and the exponents *γ* and *κ* are obtained from a least-squares method applied to the linear relation between the respective logarithms¹,

$$\log \Delta I(t) = \log c + \gamma \log I(t) + \kappa \log S(t).$$

The experiments are carried out for the following combination of settings:

* initialization: local or random
* *α* = 0.1 or *α* = 0.05
* *η* = 0.1 or *η* = 0.01
* *β* = 0.3 or *β* = 0.2.

### B. Two-dimensional random graph

The simulation result on the two-dimensional random graph model is presented in Figure 4. The nodes are $10^4$ samples from a mixture of three Gaussians

![Formula][13]

![Fig. 4](https://www.medrxiv.org/content/medrxiv/early/2020/04/30/2020.04.28.20083865/F4.medium.gif)

[Fig. 4:](http://medrxiv.org/content/early/2020/04/30/2020.04.28.20083865/F4) Simulation result for spread of infection on a 2d random graph. The nodes are connected to the 4 nearest neighbors. The infection spread model is similar to the 2d-grid graph example.

### C. COVID-19 data-set

We utilized data provided by the Johns Hopkins University Center for Systems Science and Engineering [3]. We study the relationship between the number of newly infected Δ*I*(*t*) and the total recorded number of infected *I*(*t*) individuals for the duration 01/31 to 03/24, in Italy, Germany, Iran, and France. The number of newly infected individuals is approximated by subtracting the number of infected people at time *t* − 1 from the number of infected people at time *t*, as information on those recovering is not available. We fitted the curve $\Delta I(t) = c\, I(t)^{\gamma}$ to the data; because *S*(*t*) ≫ *I*(*t*) during these initial stages of infection spread, *S*(*t*) is treated as constant. The result for four different countries is depicted in Figure 5.

![Fig. 5](https://www.medrxiv.org/content/medrxiv/early/2020/04/30/2020.04.28.20083865/F5.medium.gif)

[Fig. 5:](http://medrxiv.org/content/early/2020/04/30/2020.04.28.20083865/F5) Study of the relationship between infection growth Δ*I*(*t*) and the number of infected people *I*(*t*) for four different countries.

## IV. Discussion

The main thesis of this work is that models of epidemics, especially at early phases, incorrectly assume that the contagion depends on the product of infected and susceptible populations. Contagion takes place at the boundary of infected cells and, as a result, it is the topology of the distribution of infected cells that dictates the spread. Thus, we propose a fractional-power alternative to the standard SIR model.

Our thesis appears to be supported by simulation results as well as by fitting this model to recent COVID-19 datasets. Specifically, the two-dimensional discrete probabilistic SIR models in Figure 2 (with a nearest-neighbor connection) and in Figure 3 (with a two-step nearest-neighbor connection) suggest exponents of 0.50 to 0.77 for the contribution of *I*(*t*) to the infection rate. It should be noted that these plots depict Δ*I*(*t*) vs. *I*(*t*), whereas the third axis for *S*(*t*) is not shown. Similar results are observed in Figure 4 for a two-dimensional random distribution of nodes (vertex set) with four nearest-neighbor contacts (edge set). Here, the exponent of the *I*(*t*) contribution to the infection rate lies in a similar range ({0.58, 0.64} for the conditions displayed).

The fit of the COVID-19 data-set gives exponents for the contribution of *I*(*t*) to the infection rate in the range of 0.6 to 0.8. In this data-set, the value of *S*(*t*) (which includes the remainder of a rather large total population) varies insignificantly over time, and is hence treated as a constant. Several limitations of our experiment are noted. Firstly, the value of *I*(*t*) is only an estimated value since recording of all infected individuals is not guaranteed. Secondly, the value of Δ*I*(*t*) is estimated as being the difference *I*(*t*) − *I*(*t* − 1).
Thus, we do not take into account individuals who may have recovered. However, it is deemed that the uncertainty in the actual value of *R*(*t*) is not significant as, from available information in the rapidly developing COVID-19 epidemic, it appears that the recovery rate is of the order of weeks.

It is imperative that a deeper and more extensive study is carried out, whereupon the values of *I*(*t*), Δ*I*(*t*), *R*(*t*) are estimated from more extensive datasets. The effect of mediation efforts, such as social distancing, should be recorded as well and taken into account by differentiating data for the periods before and after such mediation protocols take effect. It is our hope that the question raised in this work, as to the validity of the basic assumption in SIR models, leads to more reliable and robust ways to estimate the progression of epidemics.

## Data Availability

Data used are publicly available. [https://github.com/CSSEGISandData/COVID-19](https://github.com/CSSEGISandData/COVID-19)

## Acknowledgments

This research was supported by AFOSR Grant FA9550-20-1-0029, National Institute of Aging Grant R01-AG048769, MSK Cancer Center Support Grant/Core Grant (P30 CA008748), and a grant from Breast Cancer Research Foundation BCRF-17-193.

## Footnotes

* † Joint senior authors.
* {ataghvae{at}uci.edu}
* nortonl{at}mskcc.org
* allen.tannenbaum{at}stonybrook.edu
* 1 The zero entries for Δ*I*(*t*), *I*(*t*), *S*(*t*) are ignored.
* Received April 28, 2020.
* Revision received April 28, 2020.
* Accepted April 30, 2020.
* © 2020, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/)

## References

1. S. Benzekry, C. Lamont, A. Beheshti, A. Tracz, J. M. L. Ebos, L. Hlatky, and P. Hahnfeldt. Classical mathematical models for description and prediction of experimental tumor growth. PLoS Computational Biology, 10(8):e1003800, 2014.
2. L. Von Bertalanffy. Quantitative laws in metabolism and growth. The Quarterly Review of Biology, 32(3):217–231, 1957.
3. Johns Hopkins University Center for Systems Science and Engineering. 2019 Novel Coronavirus COVID-19 (2019-nCoV) Data Repository. [https://github.com/CSSEGISandData/COVID-19](https://github.com/CSSEGISandData/COVID-19), 2020. [Online; accessed 25-April-2020].
4. P. Gerlee. The model of muddle: in search of tumor growth laws. Cancer Research, 73:2407–2411, 2013.
5. A. B. Herman, V. M. Savage, and G. B. West. A quantitative theory of solid tumor growth, metabolic rate and vascularization. PLoS ONE, 6(9):e22973, 2011.
6. M. Martcheva. An Introduction to Mathematical Epidemiology, volume 61. Springer, 2015.
7. L. Norton. Conceptual and practical implications of breast tissue geometry: Toward a more effective, less toxic therapy. The Oncologist, 10:370–381, 2005.
8. L. Norton and J. Massagué. Is cancer a disease of self-seeding? Nature Medicine, 12(8):875–878, 2006.
9. V. G. Vaidya and F. J. Alexandro. Evaluation of some mathematical models for tumor growth. International Journal of Bio-Medical Computing, 13:19–35, 1982.

[1]: /embed/graphic-1.gif
[2]: /embed/graphic-2.gif
[3]: /embed/graphic-3.gif
[4]: /embed/graphic-4.gif
[5]: /embed/graphic-5.gif
[6]: /embed/inline-graphic-1.gif
[7]: /embed/graphic-6.gif
[8]: /embed/graphic-7.gif
[9]: /embed/graphic-8.gif
[10]: /embed/graphic-10.gif
[11]: /embed/graphic-13.gif
[12]: /embed/graphic-14.gif
[13]: /embed/graphic-15.gif