Abstract
In this article, a class of mean-field-type games with discrete-continuous state spaces is considered. We establish Bellman systems which provide sufficiency conditions for mean-field-type equilibria in state-and-mean-field-type feedback form. We then derive unnormalized master adjoint systems (MASS). The methodology is shown to be flexible enough to capture multi-class interaction in epidemic propagation in which multiple authorities are risk-aware atomic decision-makers and individuals are risk-aware non-atomic decision-makers. Based on MASS, we present a data-driven modelling and analytics for mitigating Coronavirus Disease 2019 (COVID-19). The model integrates untested cases, age-structure, decision-making, gender, pre-existing health conditions, location, testing capacity, hospital capacity, mobility map on local areas, in-city, inter-cities, and international. It shown that the data-driven model can capture most of the reported data on COVID-19 on confirmed cases, deaths, recovered, number of testing and number of active cases in 66+ countries. The model also reports non-Gaussianity and non-exponential properties in 15+ countries.
1 Introduction
COronaVIrus Disease 2019 (COVID-19) is one of the severe acute respiratory syndrome-related coronavirus. COVID-19 is a disease caused by the virus SARS-CoV-2. SARS-CoV-2 is infecting people across several countries within few months and causing deaths, economic losses and food insecurity. As COVID-19 spreads around the globe, authorities are making decisions about local migration, confinement, isolation, quarantine and heath care system capabilities. Individuals, depending on the country, culture, social-economic context, age and season, are also making decisions concerning COVID-19. In this paper we propose an integrated mathematical model and analysis strategies for combatting the propagation of COVID-19 and for examining the economic consequences. For each infected person, one often defines the number of secondary infections that are generated by that patient. The probability distribution of that latter random variables is referred to as offspring distribution. The offspring distribution depends on its infectiousness which can be biological, environmental, behavioral or combination of these. The offspring distribution has been estimated from empirical data for outbreaks of several infectious diseases such as tuberculosis, cholera, plaque, smallpox, severe acute respiratory syndromes (SARS), MERS and Ebola. There are two important sectors that authorities are considering to mitigate COVID-19. The public health sector which tries to understand the dynamics of COVID-19 (via epidemiological models and efforts) and take measure to flatten the curve of active COVID-19 patients (mitigation and suppression). The economic policy sector has been used to alleviate negative consequences implied by COVID-19 and by the restrictive policy measures. The economic policies are also being used to prevent the amplification and the multiplication of the crisis. Unfortunately, these two sectors and approaches have not been coordinated enough. The effectiveness of the public health measures may depend on the perception of people in practice and their expectations because each individual also make its own decisions. Therefore both decisions, from economic policies by authorities and from country-specific individual preferences, should be integrated all together.
Literature review on SARS-CoV-2
There is an important ongoing research on COVID-19. Below we review some closely related works. For SARS-CoV-2 (and hence COVID-19) the offspring distribution is not known and the dynamic COVID-19 data set which is an information that is updated on a daily basis or so, meaning it changes over time as new information becomes available.
Effect of mobility on the propagation COVID-19: The movement and migration of people between locations add an additional difficulty in estimating the infectivity as each infected person may be moved, flown to another city, state, country or continent. Some estimations have been proposed by using contact-tracing or mobile tracking of the path of the each infected person and all its possible contacts using mobile networks and recognition. However, these estimations on COVID-19 remain limited at a global level [1–4]. The work [5] focuses on a Global Epidemic and Mobility model that integrates sociodemographic and population mobility data set in a spatially structured stochastic disease approach to simulate the spread of epidemics at the worldwide scale. In contrast to [5], we integrate a data-driven optimization setup for authorities and for individuals as well.
Age and gender structure on COVID-19 death rate: Because the COVID-19 reported deaths depend on ages and gender, the role of demography on COVID-19 spread and fatality is examined in [6]. The work in [7] explains the use of SARS-CoV-2 RT-PCR testing outcomes in Italy. The work in [8] provides some early estimates of clinical severity of COVID-19 by age.
Epidemiological data set is needed during epidemics to best monitor and anticipate spread of infection. The work in [9] provides geo-positioned records of case of of COVID-19 outbreak. COVID-19-related deaths are not clearly defined in the international reports available so far, and differences in definitions of what is or is not a COVID-19-related death might explain some of the variations in the reported data.
Several basic compartmental models have been suggested in the literature to simplify the complexity of mathematical modelling of infectious diseases. These include
SIR (susceptible-infected-removed),
SEIR, that is extended SIR with exposed (E) agents
SEIR with quarantine (Q)
SEIR with age-structure
SEIR with spatial location
SEIRD that is R is divided into recovered (R) or deceased (D)
SEIRD with data fitting via parameters.
These models need to be adapted to include important and key features of COVID-19 such as context-awareness, decision-making, co-morbidities per area, mobility, age, gender, healthcare status, testing capabilities to capture the evolving COVID-19 data set [10–14].
Table 1 displays some of the key differences and key features of the data-driven MFTG model.
Data-driven mean-field-type game theory
The term ‘mean-field’ has been referred to a physics concept that attempts to describe the effect of an infinite number of particles on the motion of a single particle. Researchers began to apply the concept to social sciences in the early 1960s to study how an infinite number of factors affect individual decisions. However, the key ingredient in a game-theoretic context is the influence of the distribution of states and or control actions into the payoffs of the decision-makers. There is no need to have a large population of decision-makers. A mean-field-type game (MFTG) is a game in which the (instantaneous) payoffs and/or the state dynamics coefficient functions involve not only the state and actions profiles but also the distributions of state-action process (or its marginal distributions).
In a mean-field-type game, the quantities-of-interest (such as instantaneous payoff, cost, performance functionals, state kernel/coefficients) depend not only on the type-state-actions of the decision-makers but also on the distribution of them. This allows us to consider higher order (non-linear) dependence in the distribution of state and or actions on the expected utility/cost functions. It also allows us to capture risk-awareness in the performance metric whenever one or more decision-makers are facing uncertainties. We refer the reader to [15–25] for more details on MFTG.
We build upon the MFTG model a data-driven methodology. The data-driven mean-field-type game theory (MFTG) model of the pandemic COVID-19 presented here is built upon Markov decision process model of epidemics. Despite the simplicity of this model, important features are extracted that can assist potential decisions on the strategy to combat the outbreak, essentially configuring a scenario ranging from short-term suppression to long-term mitigation depending on the achieved reduction in the contact number, exposed number, testing number, pre-existing health conditions, age, gender, location, mobility pattern and spatial contact rate. A coopetition and dedicated effort from individuals, governments, decision-makers and stakeholders will help in reducing the propagation but with a global high cost despite the increased rate of recovered people. Basically, data-driven MFTG is given as follows.
Time is continuous
Data is reported at discrete-time instants.
Decision-makers: individuals, firms and authorities
Control Actions: Individuals’ actions include contact/meeting rates and a movement path to go outside for food, pharmacies or other basic needs. Firms’ action include the production essential, moderate-essential and less-essential goods and total working hours. An authority at specific location controls the testing actions and migration rates within that area, between different areas of a city or between cities or the country level.
Objective: an individual aims to reduce its risk of being infected and to limit its economic consequences. Each firm aims to maximize its revenue while keeping its employees from infections. Each authority aims to reduce the number of deaths under cost-benefit considerations (see Figure 1).
Contributions
The proposed data-driven MFTG model has the following key features
context-awareness: captures some areas/city/country-specific features
integrated decision-making and risk-aware cost-benefit analysis for authorities and individuals.
the mobility map/pattern per areas/cities/countries is endogenous to the model
the country-specific age-structure is endogenous to the model
the pre-existing health conditions per city/country is endogenous to the model
an estimation of the probability of being exposed (coughing, sneezing, surface contact)
the hospital capacity is endogenous to the model. It is an aggregated data from number of hospital beds, intensive care facilities, ventilators, nurses, doctors, point-of-care etc.
the testing capacity per country is endogenous to the model (the number of testing per week, number of testing facilities per areas/city/country)
the infection status of the untested individuals is endogenous to the model
The goal of the article is to examine the data-driven MFTG modelling aspect of COVID-19. From the mean-field-type game theory perspective, one examines carefully each individual process at the microscopic level. The individual characteristics are age, gender, location, and infection status. As each infected individual may affect its evolving neighbors and path, its create an evolving mean-field term at the local, regional, national and international level. Thus, a macroscopic description of the proportion of infected per location will emerge as a global mean-field term. In addition, risk-awareness at the individual level as well as the global level is proposed. As several countries may not have the same season, culture or socio-economic context, we introduce temperature field, seasonal variation and local specificities across the countries and couple them with the decisions on the duration of the soft/medium/hard confinement. The age-structured dynamics is introduced to capture the dependence on age and on the prior medical conditions of the COVID-19 patient. For the pandemic COVID-19, many local hospitals cannot sustain a massive influx of infection and this will create a congestion and saturation. Saturation and the local capacities of the healthcare system are introduced into the model and de-congestion and curve flattening strategies are analyzed. We also examine the impact of interventions, control and risk-aware mitigation policies from the authorities. Both early control, late control and no control scenarios are examined from a network perspective using a risk-aware mean-field-type game theory. At the individual level we formulate a risk-aware decision-making problem that the probability of being infected and its possible consequences. We build upon prior optimization-based multi-agent models developed in [26–30]. The number of deaths in the infected population is estimated in two ways. The number of deaths at hospitals and the number of deaths who are not tested. While there is a reported data set for the number of deaths at hospitals that directly and indirectly related to COVID-19. Currently, there is no clear data set for the number of untested deaths. We use a model to estimate it.
The COVID-19 pandemic has already an enormous impact on economic, finance, agriculture, energy, and geopolitics. The decisions made by the authorities such as confinement, lockdown, isolation, curfew measures, may have severe side effects and consequences on other sectors such as employment, supply chain, agriculture and food security. Therefore a cost-benefit analytics need to be conducted for each policy per area/city/country.
The risk-aware mean-field-type game techniques presented here can be used to better understand the propagation of virus across geographical areas. As an integrated model, the impact of control strategies and switching between different locations in the propagation of the virus include
international connectivity via flight, boat or ground transportation,
inter-cities connectivity via flights, trains, buses, cars, motorcycles,
within city connectivity with pedestrian mobility to pharmacies, local markets, open street air markets, supermarkets, workplaces, offices, and isolation/confinement strategies and testing strategy based on the spreading characteristics. For finite networks, the mean-field term is stochastic and characterizes the evolution of the total number of infected/recovered/tested/untested agents per location. By introducing control strategies at both atomic and non-atomic levels, the number of deaths can be reduced. However, the control is costly and need to be carefully planned (Figure 11). Table 2 summarizes the heterogenous data sources.
View this table:
Structure
The rest of the article is organized as follows. In Section 2 we present some basic models. The data-driven MFTG setup is presented in Section 3. Section 4 presents the COVID-19 data set used. Section 5 concludes the article.
2 Basic models
SIR We start with a basic system of differential equations to estimate the propagation in an homogenous population with size n = n(t0) ≫ 1 at time t0 and three compartments (Susceptible “S”, Infectious “I”, Removed “R”) There are three different variables (S, I and R) representing the status of each compartment. β represents the disease transmission rate and γ describes the “removed” rate.
An important parameter, called basic reproduction number, is given by:
2.1 Data-Driven SIR
We would like to have an idea of I(t0), β, γ, R0. We aim to estimate them from the dynamic data set that reported (publicly) by authorities.
Given the data sets
− infected cases Î (t0), Î (t0 + 1), …, Î (t0 + k − 1),
− Removed cases , propose an estimate of and
Compute an estimate for R0
We estimate the rate parameters by solving a dynamic optimization problem.
Given the data sets, solve the basic optimization problem
This provides
where Spec means the specification parameters and k ≥ 1.
This leads to a Data-Driven SIR:
From the latter system one can obtain the trajectories of (S, I, R) over time. Then, several questions arise
What if the data set changes regularly?
Weekly estimates?
Monthly estimates?
Long-term estimates?
It is important to notice that for some countries, the data set provided need to be structured per week, two weeks, month time windows or a proper moving average to learn a meaningful trend of the data.
Data-Driven SIR
Pros
− we have a data-driven model of SIR
− it is a simple model
− it is compartment-based
− it is based on the data set during the time window [t0, t0 + k − 1]
− it is deterministic
Contras:
− N is fixed and global
− not localized
− it is homogenous
− Does the shape of I, R capture the observed data
Next, we relax the restrictive assumptions made in the previous model and include context-awareness into the data-driven model.
2.2 Basic SIR with location, age and economic wealth
We introduce the SIRD (susceptible-infected-recovered-deceased) dynamics with spatial location x over a map 𝒟l at locality l, socio-economic status e and age-structure z. where u is the meeting rate, γ is the recovery rate, δ is the death rate. These are nonnegative real-valued.
From (1) we observe that the function is identically zero. We set ∫ (S + I + R + D)(dx, dz, de, t0) = n0, at time t = t0. The condition for It to be negative yields uS(∫ βI) < (γ + δ)I. This means that basic reproduction number R0 depends not only on but also the kernel β which captures physical distancing and economic wealth interaction in the society. We choose I(x, z, e, t0) to be higher for very high e to capture the initial infection from (international) travelers and people who are in touch with international businesses. Then the contact rate kernel β is inversely proportional with e. The spread is transferred to areas with low income, the dynamics exhibits a higher infection of COVID-19 in some places with low income per family size. However the spread transfer from higher income population to a population with lower income may be slower in some countries due less interaction β with infectious people.
2.3 Control of SIRD with spatial and age structure
We introduce the control of the SIRD (susceptible-infected-recovered-deceased) dynamics with spatial location x, socio-economic status e and age-structure z. Let t0 < t1 and (x, z, e, t) ∈ ℝ 3 × ℝ × ℝ × (t0, t1). where the number of infectious people inside the ball ℬϵ (x), within two meters radius ϵ is given by
u is a control functional, ϵ (.) is a diffusion coefficient, Δx denotes the Laplacian operator, divx denotes the divergence operator, β((x, z), (x′, z′)) is the interaction kernel by age and spatial position of the agents, is the fertility rate of the population of Susceptible (S), Infected (I), recovered (R), is the death rate by age and infection status, v is the velocity (three-dimensional), and e is the economic wealth.
Then the optimal control of SIRD becomes a Mean-Field-Type Control Problem, and is given by
2.4 A basic SIRD using Mean-Field Game
We formulate a Mean-Field Game Problem as follows
Note that Problem (3) is linear w.t.r the measure µ (but non-linear in m).
2.5 A basic SIRD using Mean-Field-Type Game
We now introduce a basic Mean-Field-Type Game (MFTG) as follows
Note that Problem (4) is nonlinear w.t.r the measure µ (and non-linear in m).
3 Data-Driven MFTG
There are K ≥ 1 countries distributed over a three-dimensional geographical area. A projected map is given in Figure 2. The set of countries is represented by a set of locations {1, …, K} with geographical coordinates (latitude, longitude) on the map.
3.1 Authorities, Firms and People
Authorities: In each country/location, there is an authority who can decide on the testing policy (when, how many), migration, confinement policies of the country and international connections via flights, boats and ground transportations. The authority in l, has a testing capacity, and the migration rate ηll′, l′ ∈ 𝒩l, where 𝒩l is the set of connecting areas/cities/countries from l. The intra-city mobility in city l3 is represented by the mobility rate η3, the inter-city mobility from/to cities l2 is represented by the mobility rate η2, and the international connectivity from/to country l1 is represented by the mobility rate η1.
The country specific seasons and temperature field T (t, l) are given. The temperature field will be taken into consideration when making a decision on the duration of a lockdown. In each country in {1, …, K} there is a subpopulation of people, called agents. The distribution of age of all agents in country l is given by ρ(l, dz, t) where the (chrono-logical) age z ranges from 0 to 124 years. The function ρ is fitted from real data set making it heterogenous from one country to another. The age distribution data per country is available in [32]. An illustrative location-specific age distribution (unnormalized) function is displayed in Figure 4. Figure 3 illustrates a sample (unnormalized) death rate per age-structure. A spectrum of gender and height are also given.
The authorities play the role of atomic decision-makers as their decisions can influence significantly the internal and external migration from that country, and the subpopulation at that location. The authority of country l aims to reduce infections, deaths and economic losses.
The authorities strategic objectives are the following:
− reduce the number of deaths
− Interrupt agent-to-agent transmission including reducing secondary infections among close contacts and preventing further spread at a larger-scale;
− Identify, isolate and care for patients early (testing-tracking-tracing);
− Realizing a testing strategy at a given time window. The testing capacity can be improved by the authority in l over time t.
− Minimize social and economic impact through multisectoral partnerships between countries, geographical locations and airports hubs.
Firms: Production goods are classified as essential, moderate-essential and less-essential. Each firm decides its total working hours and aims to reduce the number of infected employees.
Agents per country: Each agent has an age z and a position located in one of the countries. An agent of any age can be infected by a virus (SARS-CoV-2 for the Coronavirus disease 2019). An agent with located in area l3 of city l2 of province state/country l1, age z,preexisting health condition (c), gender(g), height (h), family size (fs), average revenue per family. We rename (c, g, h, fs) of the agent to be c = (c1, c2, c3, c4). and has an infection status described as follows.
− For untested individuals, there are six states. Non-infected agents are susceptible (lac-S). Non-infected agents including recovered agents can be re-infected by another mutated version (if any) of the virus. An agent can be exposed (lac-E), for example, by an infected agent who is coughing, sneezing, from a surface contact or from a directional air conditioning (if any). Some of the exposed agents become passive (lac-P) which means dormant (asymptomatic infectious or mild symptom) and some others become active (lac-A). One absorbing state: deceased (lac-D). Some of the passive or active agents become recovered (lac-R).
− Some of exposed people will be tested. There are ten additional states plus one specific state actioned by healthcare authorities to identify, isolate the potential exposed people for example by means of phone calls or contact-tracing and tracking. These states are Testing (lac-T), lac-testing negative, lac-testing positive, lac-testing unclassified/unknown, lac-contact tracing of the last two-weeks for each COVID-19 tested positive, lac-isolation at home, at hotel, at a point-of-care, lac-hospitalized (severely active case), lac-recovered, lac-deceased
The 17 infection status are given by
− lac-susceptible, lac-exposed
− lac untested: lac-passive-untested, lac-active-untested, lac-recovered-untested, lac-deceased-untested,
− lac-testing, lac-testing negative, lac-testing positive, lac-testing unclassified/unknown,
− lac-contact-tracing-tracking: for each COVID-19 tested positive agent, a contact-tracing-tracking action is taken by the authorities to reach out and identify people who may have been potentially exposed by getting in the close contact of the COVID-19 positive agent in the last two weeks. This is not an individual but a collective action taken by the authority. A person who has been identified from the list of contract-tracing person may be considered as potentially exposed and a testing decision will be made after evaluation.
− lac-isolation at home, at hotel, at a point-of-care
− lac-hospitalized (severely active case), lac-recovered, lac-deceased
Figure 6 illustrates the evolution of the 17 states over time.
3.1.1 Individual state
The internal state of individual p is (lp(t), xp(t), sp(t), zp(t), cp(t)) where
lp(t) = (l1,p(t), l2,p(t), l3,p(t)) is an index representing latitude-longitude of the location of person p at time t, l1,p for the country, l2,p for the city/province, l3,p for the area of the city. lp(t) ∈ ℒ describes a discrete set of cities map of the countries in {1, …, K}.
xp(t) = (x1,p(t), x2,p(t), hp(t)) represents the position of agent p at time t
sp(t) ∈ {S, E, P, A, R, D, s7, …, s17}is the infection status of person p at time t,
zp(t) ∈ [0, 124y] is the age of person p at time t, up to 124 years. The number y is a conversion from one year into a proper time units (miliseconds)
cp(t) = (c1,p(t), c2,p(t), c3,p(t), c4,p(t)) with c1,p(t) ∈ C1, which is the set of all subset of C1i = {phc1, phc2, phc3, …} a finite list of pre-existing health conditions (phc) that are considered as relevant by health authorities. Notice that C1 contains the empty set ∅. If person p has no prior health issue then c1,p(t) is set the empty set c1,p(t) = ∅ The set C1 includes also some combinations of diabetes (type 2), severe asthma, pulmonary hypertension, cardiovascular disease, high blood pressure, cerebrovascular disease, chronic obstructive pulmonary disease, chronic kidney disease, among the others. Older people and people with pre-existing medical conditions are assumed to be more vulnerable to becoming severely ill with the virus.
c2,p ∈ C2 represents the gender of person p. C2 is a finite set representing the spectrum of gender, and c3,p(t) = hp(t) ∈ C3 is the height of person p at time t. C3 represents the spectrum of height and C4 represents the set of family size or number of roommates at residential areas.
The set of states of an agent in location l ∈ ℒ is 𝒳 × C × Status where the infection status of the agent can be {S, E, P, A, R, D, s7, …, s17} and C = C1 × C2 × C3 × C4.
3.1.2 Global state
The population size is a large but finite number n. The global state of the system is the measure (M (l, dx, s, dz, c, t), l ∈ ℒ, s ∈ {s1, …, s17}, z ∈ [0, 124y], c ∈ 𝒞 where
𝒱x is a Lebesgue measurable set of positions at l and 𝒱z is a measurable set of ages at l. The intra-state status of the system in location l, age z and characteristics (c1, c2, c3, c4) at time t is given by representing a measure of the total number of people located at l for each of the 17 status. Let n(l, z, c, t) be the population size at location l with age z and characteristics (c1, c2, c3, c4) at time t. Let nl(t) = ∑c ∫n(l, dz, c, t) be the population size at location l at time t. The occupancy measure in area l3 of city l2 of country l1 is where V of the volume of the neighborhood 𝒱.
3.1.3 Individual state transitions
We describe below the transitions between infection status.
Susceptible agents can be exposed to the virus in different ways. (i) become exposed from a surface contact that has been infected recently. (ii) exposed from a coughing or sneezing of an infectious person (with or without symptoms) with rate λse
Exposed agents: Recent observations suggest that for the COVID-19, the risk of exposure can be affected by surface contact, coughing and sneezing (with a face mask vs without a face mask). We propose to examine the mathematics of coughing and sneezing in order to estimate the range and the risk of exposure of a person. A simple dynamics of coughing and sneezing is proposed in [36]. The trajectory of the puff is governed by the evolution of its momentum I and buoyancy B. The buoyancy force is vertical, causing the cloud to follow a curvilinear trajectory denoted by (w, θ, h) or equivalently (x, y, h) where V volume of the cloud, η = α = 0.1 constant, s distance from the source, vp speed, N is the number of particles within the cloud, with particles concentration at time t, ρc (density of the cloud), ρp ρa (density of the ambient air), B cloud buoyancy, g is the acceleration due to gravity, I = (Ix, Iy), momentum, I0 initial momentum, θ0 initial angle. Uw = wt is the speed of the suspended particles with a time scale of
The quantities-of-interest include the fallout time, position, the horizontal range of deposition. These quantities affect the exposure to the range and possibly the transmission of the disease by emitted pathogens. We numerically compute the parameters β, λep,l by looking at the distance over which the cough/sneeze travelled around 0.7 m, An exposed person
− may become active with a probability λea(l)
− may become passive via two ways. (i) λSm is the probability of getting infected by a surface of contact or in close contact with an infected agent. The exposed agent can decide to receive or not an item or a visit to a workplace, grocery store, pharmacy, {o, ō}. The contact probability is ul(o | e, z, c). In practice, the meeting or contact probability are estimated through the path trajectory of exposed people and the use of physical distancing to an exposed person or an exposed surface. (ii) models the probability of encounter a passive agent where . An example of interaction between an exposed agent and an active or passive infectious agent is given by where β is the spatial exposure and cross-age kernel between positions x and x′ as in (7), and s ∈ {lac-S, lac-exposed}. For infection at homes and residential areas, we take the average family size per area/city/country. This is made endogenous to the model by setting c4 to the ‘family’ size (or number of roommates) of the individual.
Untested Passive agents: In practice, the confirmed cases (asymptomatic and symptomatic) are done through a two-step process via testing for COVID-19 which can be positive, negative or underdetermined with some probabilities (λpo, λne). Since testing is still limited in many countries, we estimate the untested passive agents from a tracking of exposed agents obtained by phone calls, surveys, mobile tracking, contact-tracing, among the others. A large number of the population has not been tested yet. A person in country l is untested with probability 1 − δl. where δl is the testing strategy adopted by the authority in l.
− may become recovered with probability λpr.
− may become susceptible with probability λps.
− may become active with a rate λpa,1.
− may opportunistically encounter with another infectious agent or has comorbidities, and both become active. The rate is . The passive agent can decide to be in physical contact or not with the other infected agent, so there are two possible actions: . The meeting probability is ul(m | p, z, c).
− may cause death with age.
− newborns with passive status with age 0. This will be considered as a very small number as in the (publicly) reported data set.
Untested Active agents:
− may become susceptible with probability λas(l, z, c, ρ) and depends on age.
− may become passive with probability . This will be considered as a very small number as in the (publicly) reported data set.
− may cause death with age. The death rate , due to the virus, increases with the age z of the agent. The distribution of age in location l solves where is the aging speed and is the death rate. The initial distribution of age in location l is ρ(l, dz, 0). The birth rate is ρ(l, 0, t) = ∫ fert(l, z, ρ)ρ(l, dz, t) where fert(l, z, ρ) is the fertility rate at age z in country l. Older people and people with pre-existing medical conditions such as severe asthma and pulmonary hypertension, high blood pressure, diabetes, are assumed to be more vulnerable to becoming severely ill with the virus.
− newborns with active status with age 0. The newborn part of the population contributes as ρ(l, 0, t).
Untested Recovered agents: Depending on the type of virus mutation and immunization of the agent, a recovered agent can be re-infected by a mutated version of the virus and hence,
− may become susceptible with λrs(z, c) but it is assumed to be small as suggested by the publicly reported data set.
− may become passive with λrp(z, c) but it is assumed to be small as suggested by the publicly reported data set.
Untested Death: this is an absorbing state. No transition to any other state once in. The death rate depends not only on the age of the patient but also on the temperature field or season at location l. It also depends the capacity of the Healthcare system at location l. The constraint is , where is the capacity of the healthcare system at location l at time t. When the capacity is exceeded, the death rate increases drastically. Authorities anticipate the saturation and find alternatives such as installations of equipments at stadiums and new hospitals. An aggregated hospital capacity data set is available from World Health Organization and World Bank Group [35]. Depending on the country, there is also a certain rate λco of undetected COVID-19 patients (not reported at hospitals or from their homes) that lead to death. Figure 3 illustrates a data-driven death rate by age.
lac-testing: During a period of asymptomatic infection, the testing can reveal infection depending on the outcome. The testing result can lead to different outcomes and the isolation can be at home, hotel or point-of-care as displayed in Figure 5. Testing occurs in country l with probability δl.The result of the testing is often classified as positive, negative or underdetermined (unknown). Quarantine at home, hotel or point-of-care policy is case-dependent. It can be dependent on whether a case is unknown, known positive, known negative, or recovered. Testing therefore makes possible the identification and quarantine of infected individuals and release of non-infected individuals. Each country has its own testing capacity and testing rule δl (deciding by the authority at l) and the constraint is captured by
The testing state moves to one of the three states: lac-testing negative, lac-testing positive, lac-testing unclassified/unknown.
lac-contact tracing of the last two-weeks for each COVID-19 tested positive agent.
lac-isolation at home, at hotel, at a point-of-care:
− may become hospitalized with probabilities respectively.
− may become recovered. The rates are ,
− There may be also a transition from home to one of the point-of-care with probability .
lac-hospitalized (severely active case),
− may become recovered with rate .
− may become dead with rate .
lac-recovered
− may become susceptible with rate λtr,s(z, c) but it is assumed to be very small.
− may become exposed with rate λtr,e(z, c) but it is assumed to be very small.
lac-deceased: this is an absorbing state. There is no transition to any other state once in.
When δl = 0, the model (without testing at locality l) reduces to six basic individual states. Figure 5 represents some of the intra-dynamics for both tested and untested individual at a given location wit pre-existing health condition c, gender, height, family size, age z, and untested options. Connection with SIR, SEIR, SIRD, SEIRD can be found in Figures 7, 8, 9 and 10.
3.1.4 Mobility per area/city/country
The mobility pattern of the agents through ground transportation within the same street/area of the city, between areas of the cities, between cities via ground and air transportation and international connectivity are given by a mobility map or graph with rate functions ηllI (T, C, c, λ, m, ρ, u, v, δ, n) between locations l and li. We decompose η into three layers: η1 connecting countries, η2 for connecting cities, and η3 for connecting areas of the same city.
The mobility pattern by area per city can be formulated as follows. Let l = (l1, l2, l3) be a location of a specific area l3 of the city l2 of region/country l1 and x = (x1, x2, h) ∈ ℝ3 a geographical location of an individual agent inside the area l3. Let m(l, x, s, z, c, t) be a measure of the agents satisfying the Kolmogorov equation (18 in Table ??). The limiting measure m can be written as
The evolution of fraction of agents per state is given by
The number of agents for which there is a transition in one time slot is always less than 2.
Payoff contribution from switching : .
The intensity of interaction is in the order of 1/nl.
The drift (the expected change of Mn in one time-step given the current state) of the system is: given in (??) and where j ∈ {7, …, 16}, and m is the actual population mean-field profile as a function of (l, x, s, z, c, t) given in (18) where yq denotes a quarantined agent with status y. Agents with non-empty pre-existing conditions c = ∅ are handled separately from the above dynamics with a certain quarantine rate at the early stage. Here for z > 0 and The new borns are counted in the calculations of m from the functional with age z = 0. The transition rate is obtained from the functions λ following the diagram in (??).
As nl → + ∞ the occupancy measure Mn(t) converges in probability to the solution of the (controlled) differential equation given by
starting from m(., t0) = m0(.).
The convergence is in distribution. It is, in general, weaker than the convergence in probability. However, here, the limiting object m is a deterministic (parametrized) object, and hence, the convergence is also in probability. The proof follows from [27, 28] by using a martingale approach with an infinitesimal generator driven by the drift where the limit is taken in (??).
3.1.5 Data-driven mean-field trajectories
We build upon the works on mean-field-type filters developed in [37–39] to incorporate the economic data and the pandemic data together in the model. We tune the parameters of the model such that the solution of the dynamical system becomes as close as possible to the dynamically changing data set. We rewrite the model with the parameters λ to be tuned from the reported data set using the mean-field dynamics in Proposition 1.
The geographical coordinates are taken from the world map data set.
The sequences of reported data set ŷl(tk), k ∈ {0, 1, …,}, tk ≤ t is given. The reported data set is observed at discrete time instants t0 < t1 < t2 < …. tk captures the time instant at which the k-th data point is officially reported by the corresponding state and healthcare authorities at location l. The reported data set is a sequence of aggregated data set: ŷl(s, tk) where s ∈ {tr,td,testing, t+}. The conditions of untested agents are not in the reported data set. The reported data set is assumed to be noisy and inaccurate.
The reported data set is used to construct a filtered trajectory minimizing the global error within [t0, t1] by choosing the tensor λ and hl. Then the resulting dynamics is used to forecast the mean-field within an uncertainty region. The sequence of optimization problems up to time tk, is given by
where for t ∈ (ti−1, ti). As a new data point is observed, we need to re-update the estimation over the integral [0, tk+1] by solving the new minimization problem and so on. The continuation of the mean-field starts from m(tk) at time tk.
The above optimization provides λdata, hdata that are specific to COVID-19 country-specific spread. Moreover, λdata and hdata are context-aware as they depend on country-specific, age, gender, testing and pre-existing health conditions. The data-driven dynamics yields with m(., t0) = m0(.). Instead of above, the estimation is replaced by the distribution of leading to a mean-field-type filtering problem [37–40]. The diagram of Figure 1 represents the main steps of the algorithm of the data-driven MFTG model.
3.1.6 Beyond pairwise interaction
We have detailed above the pairwise interaction case with a kernel β in (9). A 3-wise, 4-wise,… j-wise interaction case, depending on the local markets, supermarkets, pharmacies, family members at home, and the number of infected people at the neighborhood 𝒱 (x) can be formulated as where λj(l, x) denotes the probability of having j agents around x.
3.2 Risk-aware MFTG
Introduce and . The benefit-cost functional is a risk-aware function as it depends explicitly on the variance. Each individual agent chooses u and solves the following best response problem given (m, η) with ∂𝒟 being the border of the map 𝒟, and b1 is the benefit of the agent, h1, h2 are positive and non-decreasing functions (in the second variable for h2.)
The l-th authority chooses a finite dimensional vector η and solves the following best-response problem given m and given the migration restrictions imposed by the other authorities. where and and gl(η, δ, m) is the economic cost-benefit at l.
3.2.1 Mean-field-type equilibrium
The MFTG problem is then obtained by combining the two interdependent problems (13) and (12). A solution (u∗, v∗, η∗, δ∗) to this MFTG problem is a mean-field-type Nash equilibrium.
If there is a solution to the following Hamilton-Jacobi-Bellman system in the space of measure 𝒫2(ℝ2× S × [0, 124y] × C) given by
where 𝒫2 is the set of square (in space and age) integrable (positive) measures and
where the Hamiltonians are given by
Then, (u∗, v∗, η∗, δ∗) is a mean-field-type Nash equilibrium.
The proof of Proposition 2 follows similar steps as in [20]. From the formulation of the minimization problem, the number of infected people and the number deaths are significantly reduced at the mean-field equilibrium. However, the cost-benefit analysis suggests that it depends on the functionals g0, gl, h1, h2.
The master adjoint system (MASS) is given by
The MASS system provides the equilibrium strategies but not necessarily the equilibrium cost.
Other relevant payoffs
Authorities The authority in locality l decides on
− Migration rules (lockdown, confinement, isolation, curfew) (η1,l, η2,l, η3,l), testing strategy δl
− Budget allocation and incentives lsl, trl
Multiple objectives for authority in locality l:
− reduce the number of deaths, infected,
− reduce economic losses
− maximize the number of recovered-deaths
Consumption goods firms Firms (producing essential, moderate-essential, less-essential goods) in locality l decide on
− Production, Total hours worked A1 = ∫a1m
− Budget constraint
Multiple objectives for j:
Reduce the number of infected employees,
Maximize profit
Each individual’s risk-awareness An individual in locality l decides on
− Meeting rate a2, local movement x, v, a3, consumption a1
Multiple objectives for an individual
reduce the risk of being infected,
reduce the risk of exposing the others (if co-opetitive),
reduce economic losses for susceptible: economic: pandemic:
4 Model outputs and Data visualization
4.1 Data used and constraints
We present some illustrative examples with K = 195 countries and more detailed analysis for 66+ countries. The COVID-19 data set is taken from [10–13, 33]. The age-structure data per country is provided in [32]. For the international connectivity graph, η1 we have used Flightradar24 [31], which is a global flight tracking service that provides information about flights from 1200 airlines and 4000 airports around the world. The data set does not capture all international connections but it provides a sample. As the initial propagation of COVID-19 started from flights we have modulated the flight data set to capture it. After a certain waiting time, the soft/partial/hard flight restriction decision has been made between some countries. Those countries we made η1,l′l to be zero or a very small number (0.001) if travels from l′ to l are restricted. For the geographical mobility and co-location map of people in country l1, we have used OpenStreetMap before and after confinement decisions (if any) per country. The Google Community Mobility Reports can be found in [34].
The country-specific hospital capacity data set is obtained from World Bank Group [35] and is used to capture
However, the data set is limited to the country level and it is not per area/city/county. The country-specific testing capacity data set is not available but the number of tests per country is provided in [12]. The set 𝒟 = ∪l∈ℒ 𝒟l where 𝒟l is a constrained non-empty set. Given two countries l1 and , the dynamics m(l, x, s, z, c, t) and m(li, x, s, z, c, t) will be related to each other via the switching rates η1. Table 2 summarizes the heterogenous data sources. Figure 12 represents the proportion of infected people
at the 195 countries as of March 30, 2020. Figure 13 illustrates the status of the fifteen most (reported) infected states as of April 1, 2020. Figure 14 samples G7 countries. Detailed testing results are displayed in Figure 15 for the cities of New York and Los Angeles.
We make the following observations form the model.
the model is flexible enough to capture some local context-awareness including local mobility and age structure per country. The provided model can be filtered to track the data set provided in [12] even for some of the countries with few number of cases.
The connectivity of the local graph η3 plays a key role in the propagation of the virus across different areas of the city
The connectivity of the local graph η2 plays a key role in the propagation of the virus across different cities
The connectivity of the local graph η1 plays a key role in the propagation of the virus across states/countries.
The current case fatality rate (data-CFR) per country data set is the ratio between the number of (reported) confirmed cases and the total number of (reported) deaths from COVID-19 outbreak data set [33] for that country. We observe that there are some significant differences in terms of data-CFR per country even for whose who have made similar decisions at similar infection time. Some of the differences comes from the implementation in practice and the response of the people to the measures. The proposed MFTG model captures the data-CFR per country as illustrated in 66+ countries.
4.2 Visualization
The optimization algorithm is implemented in C++ using [41]. The plots on the geographical-map (see Figure 12) are done using MATLAB. We combine several layers: the transportation layer (air and ground), economics layer, population mobility map after the first infection of COVID-19 in that city/country, and the epidemic data-driven model. We display below the outputs (tested positive, tested-recovered, tested-dead, tested-active) of the epidemic model in 66+ countries. We choose the constraints to be high enough so that the interior solution can be found. This means extending the capacity by means of new resources. The COVID-19 initial death rate is chosen as in Figure 3 for the countries where the death rate due to COVID-19 is unknown. We estimate the death rate from the model using the reported data [12]. We choose k = 0.7m(.) for the numerics at the exits on the grid over OpenStreet map. The rate λap is set 10−7 or zero depending on the country/locality. The other λ functions are obtained from the minimization with the smallest norm allowed to derive λdata. As new data comes, λdata can be recalculated. The newborns are taken from the last year growth scale. From λdata we plot the behavior of the dynamics (18) and integrates it out. The initial t0 is set to January 22, 2020 for most countries.
We observe that the proposed model captures the behavior in most countries. The data-driven MFTG model reveals that the number active COVID-19 positive patients in countries such as Venezuela, Germany, Italy, Malaysia, Senegal, Zambia, Togo have several local peaks and oscillating behaviors. Some of them cannot be fitted with a Gaussian or bi-Gaussian. It cannot be fitted with a single exponential. By changing hl from 0.1 to 0.9 we observe a limit cycle of the dynamics. Figures 89 and 90 represents a COVID-19 sample of Non-Gaussianity and non-exponential. Also, Non-Gausianity of the reported active cases and hospitalized cases are observed in 15+ countries. It includes Albania, Azerbaijan, Bahamas, Bosnia, Costa Rica, Djibouti, Georgia, Irak, Iran, Jordan, Lebanon, Lituana, Malta, San Marino, Uzbekistan, etc. Tables 4, 5, 6 summarize lists of figures and illustrations.
Limitations of the proposed model
The proposed data-driven modelling has several interesting features including epidemiological aspects (hospitalization, tested negative, untested active, untested deceased, untested active) as well as economics aspects (average revenue per family size, income status, on-site worked hours during COVID-19) and mobility aspects (pedestrian within a sector, within a city, between cities, provinces, countries by ground/air/sea transportation, and movement to hotspots, local markets, grocery stores, pharmacies, etc). However, the model still need extension and integration of food security and supply chain which are relevant aspects in many countries.
Data Availability
https://github.com/CSSEGISandData/COVID-19
5 Conclusion
Understanding the transmission characteristics of infectious diseases in communities, counties, provinces, regions, states and countries can lead to better approaches to decreasing the transmission of these diseases. In this paper, a context-aware data-driven MFTG model. The proposed epidemiological and economic model can track and captures the country-specific COVID-19 data in 66+ countries. The context-aware data-driven MFTG model can be used in visualizing, comparing, planning, implementing, evaluating, and optimizing various detection, testing, prevention and control programs.
Data-Driven MFTG modeling can contribute to the design and analysis of the pandemic COVID-19 surveys, suggest crucial data that should be collected, identify trends, make short-term forecasts, and quantify the uncertainties in short-term forecasting. In addition to predictive safety measures, COVID-19 opens also other challenges such as
Supply chain: Identifying ways to optimize supply chain among many different industries, so the impact of the COVID-19 can be minimized.
Testing, Tracing and Tracking: Finding ways to better test, trace and track the infection propagation including those asymptomatic infected, best practices in how to test, trace and track in both rural and urban areas as early as possible.
Community: Finding solutions to prevent social and economic disruption in all communities. Proposing innovative de-confinement strategies in how to create a smooth transition from physical distancing to guaranteeing minimum services and quality-preserving activities and back to regular activities.
Number of technical questions remain unanswered. This include (i) impact of the propagation in local economy (including informal economy), (ii) the duration of the limit cycle of active cases, (iii) the stabilization of the system by means of controls and co-opetitive decision-making, (iv) impact of COVID-19 on food insecurity. We leave these questions for future research.
Acknowledgment
The author would like to thank the six webinars’ audience at UCLA IPAM, NYU, Next Einstein Forum, CNRIA 2020, CDC meeting on COVID-19, and IEEE St-Maurice section for their valuable inputs. The author would like to thank Boualem Djehiche, Nader Masmoudi, Julian Barreiro-Gomez, and Salah Eddine Choutri for interesting discussions on the topic of data-driven modelling. We gratefully acknowledge support from U.S. Air Force Office of Scientific Research under grant number FA9550-17-1-0259 on Mean-Field-Type Game Theory and the Center on Stability, Instability and Turbulence.