Estimating infection-related human mobility networks based on time series data of COVID-19 infection in Japan

Comprehensive and evidence-based countermeasures against emerging infectious diseases have become increasingly important in recent years. COVID-19 and many other infectious diseases are spread by human movement and contact, but the complex transportation networks of the 21st century make it difficult to predict disease spread in rapidly changing situations. It is especially challenging to estimate the network of infection transmission in countries where the infrastructure for traffic and human movement data is not yet developed. In this study, we devised a method to estimate the network of transmission of COVID-19 from the time series data of its infection and applied it to determine its spread across areas in Japan. We incorporated the effects of soft lockdowns, such as the declaration of a state of emergency, and changes in the infection network due to government-sponsored travel promotion, and predicted the spread of infection using the Tokyo Olympics as a model. The models used in this study are available online, and our data-driven infection network models are scalable, whether at the level of a city, town, country, or continent, and applicable anywhere in the world, as long as time-series data of infections per region are available. These estimations of effective distance and the depiction of infectious disease networks based on actual infection data are expected to be useful in devising data-driven countermeasures against emerging infectious diseases worldwide.


Introduction
imply that the deviations from the predicted values were attributable to local or regional factors other than travel from Tokyo, such as within-prefecture spread or containment of coronavirus cases, or travel from nearby prefectures. Therefore, our findings revealed that the effective distance based on the changes in the evolution of infection over time in each prefecture putatively reflected both human mobility and other local effects from nearby prefectures during the pandemic.

Effective distance changed dynamically depending on the stage of the pandemic.

Another advantage of the effective distance derived from time series data of infections over that derived from passenger volume data is that human mobility can be estimated dynamically without the need for extensive surveys. Such dynamic estimations based on different stages of the pandemic are particularly important to evaluate the effects of specific policies (e.g., "soft lockdown" and travel campaigns) and to define a better strategy. [...] P1a, P2a, and P3a (Fig. 4a, Supplementary Figure 6a). Regarding the distribution of the effective distance, P2, during which the number of infected people declined without the need for any strict movement restrictions, showed the largest mean, and the effective distance of each prefecture was distributed widely around the mean (Fig. 4a). Conversely, P1 exhibited the smallest mean and deviation from the mean, whereas P3 exhibited an intermediate mean, but its distribution was not unimodal, indicating the existence of distinct trends in human mobility between prefectures (Fig. 4a). Considering that the Japanese government declared a state of emergency during P1 and P3 to contain the pandemic, the distribution of the effective distance of P2 would be considered desirable to limit the diffusion of the coronavirus pandemic while maintaining passenger mobility. Furthermore, during P3, the effective distance became smaller in some prefectures but not in others, presumably reflecting the increase in travelers from Tokyo to other prefectures due to the second stage of the travel campaign, in which travel to and from Tokyo became eligible for subsidies. These results demonstrated that infection-related human mobility could be captured during a pandemic. Subsequently, we investigated the changes in the effective distance for each prefecture by region. The general trend of the effective distance across the entire country was conserved over time (Fig. 4b and Supplementary Figure 6b), which indicated that, other than diffusion from Tokyo, there was local connectivity between nearby prefectures that contributed to the spread of the virus and altered their distance from Tokyo. Furthermore, considering the combination with other plots (Fig. 4a, Supplementary Figure 5a,b), it can be inferred that shorter effective distances, particularly those found in major cities in each region, chiefly contribute to the rapid increase in coronavirus cases against which the government needed to impose restrictions. Consequently, the effective distance estimated from the time series of newly confirmed cases evolved temporally, which might provide useful information for determining the appropriate mobility level during the pandemic.

A distorted map of Japan, based on effective distances and local interactions, revealed the non-uniform spread of the pandemic across Japan.

To visualize how the pandemic spread across Japan across different periods, we combined information on effective distances and local interactions between prefectures, which we defined as the geographical distance between adjacent prefectures that had large passenger traffic in the 2019 survey, to create a distorted map of Japan.

The position (i.e., longitude and latitude) of the capital city of each prefecture was determined as the solution of a mathematical optimization problem using non-linear least squares [38]. In general, regions to the west of Tokyo, largely the Kinki and Kyushu regions, were reduced in size, whereas prefectures around Tokyo were expanded (for example, the Chubu region was stretched in the north-west direction) in the distorted maps compared to the original one (Fig. 5a-d). These deformations indicated that the coronavirus tended to spread rapidly to the western regions of Japan, while it required additional time to spread in the north and north-western directions, particularly towards the Tohoku and Chubu regions. Thus, distorting maps based on effective distance and local interactions would be useful to characterize the spread of the coronavirus geographically and to develop specific policies aimed at preventing a large spike in the infectious population.

Although effective distance allows us to estimate how the virus spreads across Japan, it is still unclear how much the effective distance contributes quantitatively to determining the scale of the pandemic, which would be a useful parameter when developing policies that balance both economic activity and medical resources in each prefecture. To evaluate the effects of the effective distance, we conducted simulations in which only the effective distance was changed from P3. If the effective distance were the same as in P1, the total number of infected people would more than double in peripheral or rural areas, but would not increase to a similar degree in populated prefectures (Fig. 6a). Conversely, if the effective distance were the same as in P2, there would be no clear difference in the total number of infected individuals between the central and peripheral prefectures in Japan (Fig. 6a). These results indicated that the effective distance could largely affect the scale of the pandemic, and these effects were heterogeneous between prefectures. Finally, we considered the increase in human mobility that could result from a very large public event such as the 2020 Olympic and Paralympic Games, which were to be held in Tokyo and several other prefectures, and estimated the impact on the pandemic (Supplementary Figure 9). Our findings showed that prefectures that held several matches, particularly prefectures that usually did not share strong connectivity with Tokyo, would experience a large increase in the number of patients (Fig. 6b). For example, the number of infected cases in Fukushima prefecture would increase by 80%, while those in other host prefectures would increase by 40 to 60%. This indicated that the impact of an increase in human mobility due to huge public events was more severe in prefectures that usually had a larger effective distance from Tokyo. Based on these quantitative estimations, we recommend that policymakers appropriately control the effective distance or distribute medical resources across the whole country. Therefore, effective distance enables us to estimate the scale of the pandemic in a single prefecture based on changes in human mobility and would be helpful in devising countermeasures against COVID-19 and other infectious diseases over changing situations.

medRxiv preprint doi: https://doi.org/10.1101/2021.08.02.21261486; this version posted August 4, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

In this study, we calculated the effective distance of the spread of COVID-19 in Japan by applying the SEIR model to Tokyo, the capital city of Japan. Despite the presence of social factors that altered human mobility, such as two declarations of the state of emergency and travel campaigns, the spread of COVID-19 could be described mainly by assuming spread from Tokyo during all phases of the pandemic (Supplementary Figure 1). More specifically, our SEIR model-based predictions revealed that the effective distance from Tokyo to other prefectures changed over each phase; for example, the effective distance between Tokyo and Hokkaido was short in P1 and P3, but long in P2 (Supplementary Figure 1). Comparing the fitting parameters and effective distance in P1a, P2a and P3a, we can see the effect of government-sponsored travel promotion. In particular, the effective distance between Tokyo and prefectures with many tourist attractions has increased. These [...] venues. Since the mobility of people between Tokyo and each prefecture was predicted to be a critical factor for pandemic control [24,25,26], the first priority for policy makers should be to prevent the infectious situation in Tokyo from being transferred to local areas, and in this sense, the decision of the Japanese government to hold the Olympic games without spectators is commendable. However, although spectators have been excluded from the events, athletes and media personnel still move around the event venues, so it is important to adequately consider the control strategy.

Estimating the effective distance from actual infection data is also a useful method for constructing an

By combining the effective distance estimated from infection data with the effective distance based on a transportation network and movement data, we believe that it is possible to establish more effective preventive policies to avoid the spread of infection. It is difficult to quantitatively evaluate the contribution of various modes of transportation, such as airplanes, cars, and trains, to infectious diseases. By comparing the two effective distances, it is possible to determine the means of transportation that contribute to the transmission of the disease and to take selective and effective preventive measures. However, the proposed method has some limitations. Importantly, the assumption of the diffusion process from Tokyo is qualitatively different between periods before and after peaks of infection, which does not allow effective distances to be compared quantitatively between these periods. Thus, improvements in methodology to estimate the effective distance based on a uniform assumption remain to be addressed in future studies.

The survey data on passenger traffic between prefectures in 2019 were from the Ministry of Land, Infrastructure, Transport, and Tourism. The data included the passenger volumes on railways, cars, ships, and airlines.

The population data for each prefecture, the distance between prefectures (defined as the distance between prefectural capitals), and the GeoPackage for Japan were downloaded from the Statistics Bureau of Japan, the Geospatial Information Authority of Japan, and the Database of Global Administrative Areas (GADM), respectively.

To describe the course of the COVID-19 pandemic in Tokyo, which is the capital, the most populous city, and likely the epicenter of the pandemic in Japan, we constructed a basic SEIR model, which included the following compartments: Susceptible (S), Exposed (E), Infectious (I), and Recovered (R). The choice of our model was motivated by several previously published studies on the COVID-19 pandemic [30,35,37].

The local dynamics of transmission in the model were as follows. The susceptible population (S) becomes exposed to the viral agent upon contact with the infectious population (I) and thus becomes the exposed population (E). The transmission rate of the virus is expressed as β, and βI represents the force of infection. The exposed population becomes infectious at a rate calculated as the inverse of the latent period, and the infectious population becomes the recovered population (R) at the rate ρ, or the recovery rate.
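The compartment flows described above can be sketched numerically. The block below is a minimal illustration, not the authors' implementation: it assumes the standard SEIR form dS/dt = -βSI, dE/dt = βSI - εE, dI/dt = εE - ρI, dR/dt = ρI, where the symbol `epsilon` for the exposed-to-infectious rate is our own choice, and it integrates the system with a simple forward Euler scheme. All numerical values are invented for illustration only.

```python
def simulate_seir(beta, epsilon, rho, s0, e0, i0, r0=0.0, days=60, dt=0.1):
    """Integrate an assumed SEIR system with forward Euler.

    beta    : transmission rate (beta * I is the force of infection)
    epsilon : exposed -> infectious rate, the inverse of the latent period
              (the symbol name is an assumption of this sketch)
    rho     : recovery rate
    Returns one (S, E, I, R) tuple per day, starting with day 0.
    """
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    steps_per_day = round(1.0 / dt)
    trajectory = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i * dt      # S -> E
            new_infectious = epsilon * e * dt    # E -> I
            new_recovered = rho * i * dt         # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        trajectory.append((s, e, i, r))
    return trajectory


# Illustrative (made-up) values: a population of one million,
# a 5-day latent period and a 10-day infectious period.
N = 1_000_000
traj = simulate_seir(beta=0.4 / N, epsilon=0.2, rho=0.1,
                     s0=N, e0=5, i0=5, days=30)
print("infectious on day 30:", traj[-1][2])
```

Because every flow leaves one compartment and enters the next, the total S + E + I + R stays constant, which is a quick sanity check on any implementation of this model.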

Parameter estimation using the MCMC algorithm.

Model parameters were estimated using the Bayesian framework by sampling the posterior parameter distribution via an affine-invariant MCMC algorithm using the emcee v3 toolkit [42,43]. For data reliability, the data used to calibrate the model consisted of the 7-day backward moving average of newly confirmed cases in Tokyo. Considering the rapid increase in the spread of the coronavirus pandemic in Tokyo, we assumed that the count data of newly confirmed cases followed a Gaussian distribution with a mean given by ξI and variance given by 1/τ. E_0 and I_0, the numbers of exposed and infectious individuals on the first day of the period, respectively, were also estimated. For simplicity, the number of recovered individuals, or R_0, was set at zero; thus, the number of susceptible individuals, or S_0, was calculated as the total population of Tokyo. Consequently, we estimated seven parameters for each period: β, the rate of progression from the exposed to the infectious state, ρ, ξ, E_0, I_0, and τ. We used the same uniform distributions for all periods as the prior distributions for each parameter, as [...]

The spread of the COVID-19 pandemic in prefectures other than Tokyo was modeled as the diffusion process of infected individuals from Tokyo. We denoted the number of newly confirmed cases as u(x, t), which followed a diffusion equation in which D was the diffusion coefficient. The parameters used in the simulation are summarized in [TABLE]. For simplicity, the initial conditions for the PDE were expressed as:

u(0, 0) = ξI_0,
u(x, 0) = 0, x ∈ (0, L],

which meant that the infectious population was located only in Tokyo in the initial condition. For the boundary [...] prefectures was adjusted by subtracting x_T for subsequent analyses.
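To make the diffusion picture concrete, the following sketch solves a one-dimensional diffusion problem with an explicit finite-difference scheme. Several points are assumptions of this sketch rather than statements from the text: the governing equation is taken to be the standard form ∂u/∂t = D ∂²u/∂x², the value at x = 0 is held at a user-supplied source (standing in for the Tokyo series), and the far end x = L is treated as zero-flux. The function name and all numbers are hypothetical.

```python
import math


def diffuse_from_source(source, D, length, nx=101, dt=0.001, t_end=1.0):
    """Explicit (FTCS) solver for an assumed du/dt = D * d2u/dx2 on [0, length].

    source(t) gives the value imposed at x = 0 (assumed boundary condition);
    the boundary at x = length is zero-flux (also an assumption).
    Returns the profile u(x, t_end) and the grid spacing dx.
    """
    dx = length / (nx - 1)
    r = D * dt / dx ** 2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable: need D*dt/dx**2 <= 0.5")
    u = [0.0] * nx            # u(x, 0) = 0 for x in (0, L]
    u[0] = source(0.0)        # all cases start at the origin
    n_steps = round(t_end / dt)
    for step in range(1, n_steps + 1):
        new = u[:]
        for k in range(1, nx - 1):
            new[k] = u[k] + r * (u[k + 1] - 2.0 * u[k] + u[k - 1])
        new[-1] = new[-2]                 # zero-flux far boundary
        new[0] = source(step * dt)        # source boundary at the origin
        u = new
    return u, dx


# Constant source of 100 cases at the origin (illustrative numbers only).
profile, dx = diffuse_from_source(lambda t: 100.0, D=1.0, length=10.0)
k = round(1.0 / dx)  # grid index of x = 1
analytic = 100.0 * math.erfc(1.0 / (2.0 * math.sqrt(1.0 * 1.0)))
print("numerical u(1, 1):", profile[k], "half-line analytic value:", analytic)
```

In this picture, a location far from the origin sees cases arrive later and at lower levels, which is the intuition behind reading a fitted position x as a distance from Tokyo.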

Effective distance based on the inter-prefecture network of traffic and mobility data of individuals.

Since the contagion process is predominantly caused by the movement of people, we can also estimate the effective distance between prefectures based on passenger traffic data. The effective distance derived by this method was used to test the validity of the effective distance derived from the diffusion equation and for its characterization (Fig. 3a,b, Supplementary Figure 3a). We defined the connectivity matrix P from F_nm, the inter-prefecture passenger traffic on railways, cars, ships and airlines, so that P quantified the fraction of the passenger flux from prefecture n to m. Given this flux-fraction matrix P, we defined the effective distance d_nm from prefecture n to m. We then constructed a weighted directed graph network consisting of 47 nodes, representing the prefectures in Japan, and edges from prefecture n to m with weights of d_nm (i.e., 47C2 = 1081 edges in total). Consequently, the effective distance from Tokyo to prefecture n, or D_n, was estimated as the length of the shortest path from Tokyo to prefecture n in the graph network.
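The traffic-based construction can be sketched as a shortest-path computation. This is a hedged illustration only: it assumes that P_nm is the row-normalized flux F_nm / Σ_k F_nk and that the edge weight follows the commonly used form d_nm = 1 - ln P_nm; neither formula is quoted from the text. The function name and the traffic numbers are made up.

```python
import heapq
import math


def effective_distances(flux, origin):
    """Shortest-path effective distance from `origin` to every reachable node.

    flux[n][m] is the passenger traffic from n to m.
    Assumed definitions (not quoted from the paper):
      P[n][m] = flux[n][m] / sum_k flux[n][k]
      d[n][m] = 1 - ln P[n][m]
    """
    graph = {}
    for n, row in flux.items():
        total = sum(row.values())
        graph[n] = {m: 1.0 - math.log(f / total)
                    for m, f in row.items() if f > 0}

    # Dijkstra's algorithm; valid because every edge weight is >= 1.
    dist = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        d, n = heapq.heappop(heap)
        if d > dist.get(n, math.inf):
            continue  # stale heap entry
        for m, w in graph.get(n, {}).items():
            candidate = d + w
            if candidate < dist.get(m, math.inf):
                dist[m] = candidate
                heapq.heappush(heap, (candidate, m))
    return dist


# Toy traffic volumes (invented): Tokyo sends few travellers directly to
# Osaka, so the shortest effective path runs through Kanagawa.
flux = {
    "Tokyo": {"Kanagawa": 990, "Osaka": 10},
    "Kanagawa": {"Tokyo": 800, "Osaka": 200},
    "Osaka": {"Tokyo": 500, "Kanagawa": 500},
}
dist = effective_distances(flux, "Tokyo")
print(dist)
```

Under these assumptions a weak direct connection can be bypassed by a chain of strong ones, which is why the distance is read off the shortest path rather than the direct edge.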

Linear regression of the effective distance based on the diffusion process by other distance metrics.

The effective distance derived from the diffusion equations was predicted by a linear model without intercept, y = ax, because Tokyo is at the origin. Models were fit to minimize the residual sum of squares. The coefficient of determination (R^2) was determined as R^2 = 1 − u/v, where u is the residual sum of squares and v is the total sum of squares.
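A minimal sketch of this fit follows. For a model y = ax, minimizing the residual sum of squares gives the closed form a = Σxy / Σx². The total sum of squares is taken here about the mean of y, which is an assumption of this sketch since the text does not specify it; the function name and sample values are hypothetical.

```python
def fit_through_origin(x, y):
    """Least-squares slope for y = a * x (no intercept) and R^2 = 1 - u / v."""
    a = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    u = sum((yi - a * xi) ** 2 for xi, yi in zip(x, y))   # residual sum of squares
    mean_y = sum(y) / len(y)
    v = sum((yi - mean_y) ** 2 for yi in y)               # total sum of squares (about the mean; assumed)
    return a, 1.0 - u / v


# Invented sample: traffic-based distances (x) vs. diffusion-based distances (y).
slope, r2 = fit_through_origin([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
print(slope, r2)
```

Note that with no intercept and a mean-centred total sum of squares, R^2 can become negative when the line through the origin fits worse than a constant.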

Transformation of a map of Japan based on the effective distance from Tokyo and local connectivity.

We created a transformed map of Japan according to the Mercator projection (i.e., the longitude and latitude corresponding to the x and y coordinates in the Euclidean plane, respectively) depending on the effective distance from Tokyo, which was estimated from the diffusion equation, and local interactions between prefectures (Supplementary Figure 7). Local interactions between prefectures were defined as the geographical distance between adjacent prefectures, including those connected by roads. Furthermore, for simplicity, the local connectivity was only considered for adjacent prefectures that had substantial passenger traffic, i.e., 100,000 people per year in at least one direction. The unit of effective distance was converted to that of geographical distance assuming a linear relationship between them locally, as shown in Supplementary Figure 3c,d. In total, 111 links between prefectures (i.e., 46 links that expressed effective distance from Tokyo to other prefectures and 65 links of local interactions) were used to transform a map of Japan to represent the spread of the pandemic.

Distorting a map based on a given proximity can be formulated as a nonlinear least squares problem, in which ij denoted the link between points i and j, and x and y represented the longitude and latitude of a prefecture, respectively. L denoted the set of 111 links between prefectures, and t_ij denoted the proximity (i.e., effective distance or local geographical distance). To solve this problem, we adopted the algorithm suggested by [...] method, which uses bearings between points as the initial condition, and is a natural premise when transforming a map. Next, the nonlinear least squares problem was rewritten in terms of θ_ij, the approximate bearing of link ij in the transformed map.

When α = 1, the solution for the initial nonlinear least squares problem could be approximated by iteratively solving two independent linear least squares problems. When solving these linear least squares problems, we assigned different weights to equations derived from the effective distance and those derived from local interactions (10:1).
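One way to picture the alternating scheme is sketched below. It is an illustration under stated assumptions, not the authors' algorithm: bearings θ_ij are measured clockwise from north on the current layout, each link then asks for the offsets x_j - x_i = t_ij sin θ_ij and y_j - y_i = t_ij cos θ_ij, and the two resulting weighted linear least squares problems (one for x, one for y) are solved approximately by Gauss-Seidel sweeps with the anchor point held fixed. The function name, point names, and numbers are hypothetical.

```python
import math


def distort_map(coords, links, anchor, outer_iters=20, inner_sweeps=200):
    """Move points so that link lengths approach the target proximities.

    coords : {name: (x, y)} initial positions
    links  : list of (i, j, t_ij, weight) with target length t_ij
    anchor : point kept fixed (e.g. the origin of the effective distance)
    """
    pos = {name: list(xy) for name, xy in coords.items()}
    for _ in range(outer_iters):
        # Step 1: freeze bearings from the current layout.
        targets = []
        for i, j, t, w in links:
            theta = math.atan2(pos[j][0] - pos[i][0], pos[j][1] - pos[i][1])
            targets.append((i, j, t * math.sin(theta), t * math.cos(theta), w))
        # Step 2: x and y are now two independent linear least squares
        # problems; each sweep sets a point to its weighted optimum.
        for _ in range(inner_sweeps):
            for node in pos:
                if node == anchor:
                    continue
                sx = sy = sw = 0.0
                for i, j, ax, ay, w in targets:
                    if i == node:
                        sx += w * (pos[j][0] - ax)
                        sy += w * (pos[j][1] - ay)
                        sw += w
                    elif j == node:
                        sx += w * (pos[i][0] + ax)
                        sy += w * (pos[i][1] + ay)
                        sw += w
                if sw > 0.0:
                    pos[node] = [sx / sw, sy / sw]
    return pos


# Two "effective distance" links from the anchor (weight 10) and one local
# link (weight 1), mirroring the 10:1 weighting; all values are invented.
start = {"Tokyo": (0.0, 0.0), "A": (1.0, 0.0), "B": (0.0, 1.0)}
links = [("Tokyo", "A", 2.0, 10.0), ("Tokyo", "B", 1.0, 10.0),
         ("A", "B", math.sqrt(2.0), 1.0)]
print(distort_map(start, links, anchor="Tokyo"))
```

Giving the anchor links a larger weight makes the layout honour distances from the anchor first and the local geometry second, which is the trade-off the weight ratio controls.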

The weight was determined so that the distorted maps represented effective distance from Tokyo as well as geographical relationships between nearby prefectures (Supplementary Figure 8a,b). These weights enabled the [...]

Prefectures whose effective distance is larger than 100 were excluded. b, Effective distance of each prefecture in periods P1a, P2a, and [...]
Supplementary Figure 2 | Diffusion process from Tokyo recapitulates time series data of infections in other prefectures during period P1a.
Cumulative number of newly confirmed cases that were observed (blue) and estimated by the diffusion process from Tokyo (orange) in each prefecture during P1a.

Supplementary Figure 4 | Residuals of the effective distance based on the diffusion process.
Residuals of the effective distance derived from the diffusion process compared to the predicted values based on linear regressions on the effective distance derived from the passenger traffic data (Fig. 3a, Supplementary Figure 5a).

Prefectures whose effective distance was larger than 100 were excluded. b, Effective distance of each prefecture in P1b, P2b, and P3b, ordered by the "Prefecture Code" defined by the International Organization for Standardization. Regions to which prefectures belong are colored differently.

Supplementary Figure 8 | Distorted maps of Japan based on effective distance with different weight ratios between effective distance and local interactions.
A map of Japan was distorted to represent the effective distance in each period. Distorted maps were determined as solutions for weighted non-linear least squares problems. Various weight ratios between the effective distance and the geographical distances between nearby prefectures that had high passenger volumes were tested (a, b, Fig. 5).

Supplementary Figure 9 | Putative map of Japan during the Tokyo 2020 Olympics used for the simulation in Fig. 6b.
Distorted map based on effective distance during period P3a (left) and modified period P3a (P3'a: Olympic) (right). In period P3'a, the prefectures in which some games were to be held, highlighted in the left panel, were brought closer to Tokyo than the others.

Seven parameters were estimated using the MCMC algorithm so that the SEIR model represented the time series data of infections in each period. The median of the samples of each parameter was chosen as a representative parameter.

Supplementary Table 2 | Parameters in the diffusion equations.
The parameters for the diffusion equations were chosen so that the solutions represented the time course of the number of newly confirmed cases in each prefecture.

Supplementary Table 3 | Effective distance in each prefecture in the respective periods.
The effective distance of each prefecture from Tokyo was estimated in each period based on the diffusion process.
