Efficient network immunization under limited knowledge

Yangyang Liu; Hillel Sanhedrai; GaoGao Dong; Louis M. Shekhtman; Fan Wang; Sergey V. Buldyrev; Shlomo Havlin

doi:10.1101/2020.04.07.20056606

Abstract

Targeted immunization or attacks of large-scale networks has attracted significant attention by the scientific community. However, in real-world scenarios, knowledge and observations of the network may be limited thereby precluding a full assessment of the optimal nodes to immunize (or remove) in order to avoid epidemic spreading such as that of current COVID-19 epidemic. Here, we study a novel immunization strategy where only n nodes are observed at a time and the most central between these n nodes is immunized (or attacked). This process is continued repeatedly until 1 − p fraction of nodes are immunized (or attacked). We develop an analytical framework for this approach and determine the critical percolation threshold p_c and the size of the giant component P_∞; for networks with arbitrary degree distributions P(k). In the limit of n → ∞ we recover prior work on targeted attack, whereas for n = 1 we recover the known case of random failure. Between these two extremes, we observe that as n increases, p_c increases quickly towards its optimal value under targeted immunization (attack) with complete information. In particular, we find a new scaling relationship between |p_c(∞) − p_c(n) | and n as |p_c(∞) − p_c(n)| ~ n⁻¹ exp(−αn). For Scale-free (SF) networks, where P(k) ~ k^−γ, 2 < γ < 3, we find that p_c has a transition from zero to non-zero when n increases from n = 1 to order of logN (N is the size of network). Thus, for SF networks, knowledge of order of logN nodes and immunizing them can reduce dramatically an epidemics.

Networks play a crucial role in many diverse systems [1–8]. Connectivity of components is critical for maintaining the functioning of infrastructures like the internet [9] and transportation networks [10], as well as for understanding immunization against epidemics [11] and the spread of information in social systems [12]. Due to this importance, researchers have long focused on how a network can be optimally immunized or fragmented to prevent epidemics or to maintain infrastructure resilience [3, 13–17]. Many approaches have used percolation theory from statistical physics to prevent the spread of virus or assess network resilience under the infection or failure of some fraction of nodes or links [18–28].

Early studies in networks found that immunizing real networks against an epidemic is highly challenging due to the existence of hubs which prevent eradication of the virus even if many nodes are immunized [29–31]. At the same time, if these hub nodes, largest degree nodes, are targeted, the network can reach immunity [29, 31, 32]. However, these previous models of targeted immunization have assumed full knowledge of the network structure which in many cases is not available. Recent research has shown that even those in control of a network often have knowledge only of a small part of the whole network structure [33–35]. This was particularly demonstrated with the current COVID-19 epidemic where the detailed social network of individuals is unknown.

In this Letter, we study targeted immunization or removal in networks with limited knowledge. We assume that at each stage, a number n nodes are observed and the node with highest degree is immunized and thus unable to continue spreading infection. This procedure is repeated until 1 − p fraction of nodes are immunized. In particular, our model could apply to a situation where several cooperative teams are sent to immunize a network and each team has access to information on a small subset (n nodes) of the network. We develop a theoretical framework for this model of immunization with limited information using percolation theory for networks with arbitrary degree distribution. In the limit of n = N we recover prior work on targeted attack [32], whereas for n = 1 we recover the case of random failure [30, 31]. We observe excellent agreement between our theoretical and simulation results for the n dependence of the critical threshold p_c and the size of the giant component P_∞; for p > p_c. The giant component and p_c characterize the efficiency of the immunization. The smaller is the giant component, the immunization strategy is better. The larger is p_c the immunization is more efficient since less immunization doses are needed to stop the epidemics. We find an analytical relationship between n and p_c for both Erdős-Rényi (ER) and Scale-Free (SF) networks. Surprisingly, we also find that p_c quickly reaches a plateau even for relatively small n, after which increasing n has negligible effect on p_c. This means that immunization with small n (not knowing the whole structure) can dramatically improve the immunization.

Let G(V, E) be a network where V and E are the set of nodes and edges, respectively. N = |V| is the number of nodes in the network. We assume that a preventer or attacker has limited knowledge of the overall network structure and instead possesses only limited information on several nodes. Specifically, we randomly select n nodes for which the preventer or attacker is assumed to have information on the node degree. The preventer or attacker then targets the node with the highest degree among these n. This procedure is then repeated until a 1 − p fraction of nodes are immunized or removed from the network.

In Fig.1, the limited information immunization (attack) is illustrated together with global targeted immunization on a network. Here a total of n = 3 nodes are observed. In panel (a), an individual with global information about the network structure chooses the highest degree node u to immunize (or remove). However, in panel (b), the individual only knows at a time the degree of 3 nodes in the network, i.e. v₁, v₂, v₃. Consequently, node v₃ with the highest degree k = 4 (marked in red) is immunized (or removed).

Suppose the degree distribution of a network is given by P (k) and being the cumulative probability that the degree of a randomly chosen node is less than or equal to k. Furthermore, at an arbitrary time t during the iterative percolation process, assume the distribution of the original degree (including the immunized neighbors) of the remaining nodes is P (k, t). Then, the degree distribution of the node which is immunized at time t is given by where F (k, t) is the cumulative distribution of P (k, t). This formula can be recognized as being derived from order statistics giving the maximum of several independent random variables [36]. For k = 0, Eq. (1) becomes P_r(0, t) = F (0, t)ⁿ. Hence we define F (k − 1, t) = 0, and then Eq. (1) is valid for k ≤ 0.

In a limited knowledge immunization or attack, each node’s immunization changes the degree distribution of the remaining nodes in the following way where N (k, t) is the number of nodes with degree k at time t, and P_r(k, t) is the likelihood that a node immunized at time t has a degree k.

Then, plugging Eq. (1) into Eq. (2) gives, which becomes in the continuous limit,

Substituting N (k, t) = (N − t)P (k, t), we get and using P (k, t) = ΔF (k, t), we obtain,

Note that F (k = − 1, t) = 0, and thus the entire term inside the Δ is 0 for k = −1. Similarly, this implies that for k = 0 and likewise for any k ≤ 0 this term is also 0. Thus, we get the following simple ordinary differential equation, with the initial condition F (k, t = 0) = F (k). It can be shown (see Sec I in SM) that the solution of this ODE, Eq. (3) is or equivalently, where F_p(k) is the cumulative distribution of the degree after immunizing (removing) 1 − p fraction of nodes. For n = 1, the solution of Eq. (3) is F_p(k) = F (k) as expected. Also Eq. (5) converges to F (k) in the limit n → 1.

We can now obtain the degree distribution of the occupied nodes after fraction 1 p nodes are immunized, which is given by

The equation for v, the probability of a randomly chosen link to lead to a node not in the giant component, is where P (Θ|k) is the probability of a node to be occupied given its degree is k. Using Bayes Theorem, we note that P (k)P (Θ|k) = P (Θ)P (k|Θ) = pP_p(k). Hence Eq. (7) becomes

The giant component S can be found by where v is found from Eq. (8).

At criticality, we take the derivative of both sides of Eq. (8) and substitute v = 1 representing the location where the first solution with v < 1 exists, as opposed to only the v = 1 solution. Thus, the critical condition is

We now study our limited knowledge immunization (attack) strategy, i.e., the general result, Eqs. (8) and (9), on ER networks. First, we analyze the giant component P_∞;. For the case n = 1, limited knowledge immunization (or attack) reduces to the classical random attack, while for n → ∞ (meaning the global network is observed) corresponds to targeted attack [14, 29, 31, 32]. Using Eq. (9), the giant component P_∞; can be solved numerically for any given p. In Fig. 2(a), simulations and analytic results are shown for the giant component P_∞; as a function of 1 − p under limited information immunization with different n. As the knowledge index n increases from 1 to N, limited knowledge immunization moves from being like random immunization (attack) to being like targeted immunization (attack). The simulations are in excellent agreement with the theoretical results (lines).

FIG. 1.

Schematic illustration of our limited knowledge immunization or attack strategy. The preventer or attacker is able to observe the degree of nodes that are colored green, while the gray nodes are unknown. (a) For the classical targeted immunization (attack), one has complete information on the global structure of the network and chooses the highest degree node (u) to immunize (remove). (b) Here the case of an individual with limited knowledge of the network is demonstrated. In this figure, we set n = 3 and only degrees of nodes v₁, v₂ and v₃ are known. Given this limited information, the preventer would choose to immunize v₃, being unaware that an unobserved higher degree node exists. At the next immunization or attack, only nodes which have not been immunized or removed yet will be observed.

FIG. 2.

Results on ER networks. (a) The giant component P_∞; of an ER network with k = 4 varies with the fraction of immunized (or removed) nodes 1 − p under limited knowledge. As the n is increased, limited knowledge immunization tends to have the same immunization effect as targeted immunization. (b) The critical threshold p_c of limited knowledge immunization as a function of n on ER networks. Note that already for small n ~ 10, p_c is close to targeted immunization (global knowledge, n ~ N). (c) Critical threshold p_c as a function the mean degree ⟨k⟩ of ER networks for limited knowledge immunization. (d) The scaling of |p_c(∞) − p_c| with n on ER networks. Symbols are average results of simulations over 100 independent realizations on ER networks with 10⁶ nodes. All simulation results (symbols) agree well with theoretical results from Eq. (10) (dashed lines).

Next, we focus on the critical threshold, p_c, of limited knowledge immunization (attack). Overall, we find that one does not need a very large knowledge n ~ 10 to achieve nearly the very close effect as targeted immunization (attack) with complete information. This can be seen by observing the critical threshold p_c as a function of n in Fig. 2(b). In Fig. 2(c) we show the variation of p_c with ⟨k⟩ for several fixed n.

Next, the behavior of p_c in the limit of large n is derived analytically. By examining Eq. (5) we notice that when n → ∞ there are two distinct behaviors depending on whether k is small, F (k) < p; or k is large, F (k) > p. It can be shown (see Sec II in SM) that the leading term behaves as, where α_k = |log [p/F (k)]|. In the limit n → ∞, we can get the expected result for targeted immunization (attack), F_p(k) = min {F (k)/p, 1} [31, 32].

Plugging Eq. (11) into Eq. (10), and noting that from a sum of exponentials decaying with n only the lowest decay rate contributes to the leading term, we obtain (see Sec II in SM) where , and the decay rate α is now

The pre-factor , where k_< is the largest degree such that and k_slow is the degree which gives the lowest rate α. (See illustration in SM).

It is clear that k_slow must be k_< or k_> because F (k) is monotonic. If F (k_slow) = F (k_>) = 1 then k_< should be taken as k_slow, and the corresponding α should be taken. It should also be noted that if k_slow is not unique, it would simply change the pre-factor A in Eq. (12). Another special case is where , then (see Sec IV in SM).

Fig. 2(d) shows as a function of n. As expected from the theory, one can see that Δp_c ~ 1/n for small n and exponential decay for large n. When p_c → 1 which occurs for ER network when ⟨k⟩ → 1, the power law regime becomes much broader as explained in the Sec II of SM.

Next, we study SF networks with P (k) = Ak^−γ, k = m, …, K, where A = (γ − 1)m^γ−1 is the normalization factor, and m and K are the minimum and maximum degree respectively [30]. Similar to ER networks, the size of the giant component, P_∞; can be obtained from Eq. (9). Fig. 3(a), shows P_∞; as a function of 1 − p for different n values. The results demonstrate that SF networks become more immunized/vulnerable compared to ER networks under the immunization/attack as n increases. Compared with ER networks, one can observe that slightly higher values of n (more knowledge) are needed to reach the near-steady-state region of fully targeted strategy.

FIG. 3.

Results for SF networks. Comparison of theory (lines) and simulation (symbols) for limited knowledge immunization or attack, n, for SF networks. (a) The size of the giant component versus n for SF network with γ = 2.5. (b) Critical threshold p_c versus n on SF networks. Simulations are obtained for a system with 10⁶ nodes and averaged over 100 independent realizations. The minimum and maximum degree of network are m = 2 and K = 1000 respectively.

For SF networks with 2 < γ < 3, under random immunization/attack (n = 1), it has been shown that p_c = 0 for an infinite system [30], while for high-degree immunization/attack (n → ∞;), p_c > 0 [31, 32]. Next we wish to find out for which n, p_c becomes non-zero, and how it depends on the system size N. To this end, we analyze Eqs. (5) and (10) for large k (high degrees govern the behavior in SF), small n and p as follows (elaborated in SM).

It can be shown that for large degrees,

Substituting this into Eq. (5) and assuming (k/m)^γ−1 ≫ n for large degrees, it can be concluded that

In addition, we notice that P_p(k) has a new natural cut-off, K_p, which depends on p and N as follows (see Sec III in SM),

This helps us to evaluate the second moment of P_p(k) where β = (3 − γ)/(γ − 1).

Considering this, and plugging Eq. (13) into Eq. (10), and keeping the leading terms in the limit of large N, we obtain (see Sec III in SM for more details) where

From Eq. (14), it is easy to see that if n ≪ logN then p_c → 0, while if n ~ logN then p_c is non-zero. The pre-factor C(n) depends on n but not in N.

Fig. 4(a) shows p_c versus γ. It is known that for 2 < γ < 3 and n = 1, if N → ∞ then p_c → 0 [30]. Also for n = 5, we can see that system size matters and p_c decreases as N increases. Fig. 4(b) shows that the scaling with n/ logN of Eq. (14) is valid. Furthermore, it is seen in Fig. 4(b) that when n is small or N is large, such that n/ logN ≪ 1 (in Fig. 4 it is 0.07), p_c approaches 0. Fig. 4(c) supports the exponential scaling of p_c versus n⁻¹ logN obtained analytically in Eq. (14).

FIG. 4.

How p_c for SF networks depends on γ and the system size N. (a) p_c as a function of γ for different values of n and N. p_c decreases with increasing N. For n = 1 and 2 < γ < 3, it is well-known that p_c approaches zero for infinite system. (b) p_c as a function of n/ logN for SF network with γ = 2.1. The enlarged figure for small n = 1, …, 5 are shown in the inset, and p_c approaches zero for n/ logN ≪1. (c) The scaling of p_c with N and n for large N and small n. Here C(n) is the pre-factor. The minimum and maximum degree of nodes are m = 2 and K = N ^1/(γ−1) respectively. This confirms Eq. (14) for γ = 2.1, and δ = (3 − γ)/2 = 0.45.

In summary, our results provide a framework for understanding and carrying out efficient immunization with limited knowledge. Especially in cases of global pandemics such as e.g., the current COVID-19, it is impossible to know the full interactions of all individuals. Thus an effective way to limit spreading is obtaining information on a few (n) individuals and targeting the most central of these. For example, testers could stand at a supermarket and select a group of people entering the store simultaneously. Information on the connections of these people e.g., the number of people they live with, where and how often they meet with other people etc. could be quickly obtained (such as through cell phone tracking) and then the individual with the most connections in the group could be quarantined or immunized. Our results demonstrate that even when this is done in small groups of people (low n), it is possible to obtain a significant improvement in global immunization compared to randomly selecting individuals. In our model, this was seen by the reduced size of the giant component and the large critical threshold p_c. Overall, these findings could help to develop better ways for immunizing large networks and designing resilient infrastructure.

Data Availability

All the available data and algorithms can be got by contacting the author.

References

[1].↵
Mark EJ Newman. The structure and function of complex networks. SIAM review, 45(2):167–256, 2003.
OpenUrl CrossRef
[2].
Réka Albert and Albert-László Barabási. Statistical mechanics of complex networks. Reviews of Modern Physics, 74(1):47, 2002.
OpenUrl CrossRef Web of Science
[3].↵
Reuven Cohen and Shlomo Havlin. Complex networks: structure, robustness and function. Cambridge University Press, 2010.
[4].
Reuven Cohen, Shlomo Havlin, and Daniel BenAvraham. Efficient immunization strategies for computer networks and populations. Physical Review Letters, 91(24):247901, 2003.
OpenUrl CrossRef PubMed
[5].
Dirk Brockmann and Dirk Helbing. The hidden geometry of complex, network-driven contagion phenomena. Science, 342(6164):1337–1342, 2013.
OpenUrl Abstract/FREE Full Text
[6].
Yang-Yu Liu, Endre Csóka, Haijun Zhou, and Márton Pósfai. Core percolation on complex networks. Physical Review Letters, 109(20):205703, 2012.
OpenUrl CrossRef
[7].
Yang-Yu Liu, Jean-Jacques Slotine, and Albert-Laszló Barabási. Controllability of complex networks. Nature, 473(7346):167–173, 2011.
OpenUrl CrossRef PubMed Web of Science
[8].↵
Duncan J Watts. A simple model of global cascades on random networks. Proceedings of the National Academy of Sciences, 99(9):5766–5771, 2002.
OpenUrl Abstract/FREE Full Text
[9].↵
Réka Albert, Hawoong Jeong, and Albert-Laszló Barabási. Internet: Diameter of the world-wide web. Nature, 401(6749):130, 1999.
OpenUrl CrossRef Web of Science
[10].↵
Zoltán Toroczkai and Kevin E Bassler. Network dynamics: Jamming is limited in scale-free systems. Nature, 428(6984):716, 2004.
OpenUrl CrossRef PubMed Web of Science
[11].↵
Romualdo Pastor-Satorras and Alessandro Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters, 86(14):3200, 2001.
OpenUrl CrossRef PubMed Web of Science
[12].↵
Duncan J Watts and Steven H Strogatz. Collective dynamics of “small-world” networks. Nature, 393(6684):440, 1998.
OpenUrl CrossRef PubMed Web of Science
[13].↵
Lazaros K Gallos, Reuven Cohen, Panos Argyrakis, Armin Bunde, and Shlomo Havlin. Stability and topology of scale-free networks under attack and defense strategies. Physical Review Letters, 94(18):188701, 2005.
OpenUrl CrossRef PubMed
[14].↵
Xuqing Huang, Jianxi Gao, Sergey V Buldyrev, Shlomo Havlin, and H Eugene Stanley. Robustness of interdependent networks under targeted attack. Physical Review E, 83(6):065101, 2011.
OpenUrl
[15].
Sebastian Neumayer, Gil Zussman, Reuven Cohen, and Eytan Modiano. Assessing the vulnerability of the fiber infrastructure to disasters. IEEE/ACM Transactions on Networking, 19(6):1610–1623, 2011.
OpenUrl
[16].
Dirk Helbing. Globally networked risks and how to respond. Nature, 497(7447):51, 2013.
OpenUrl CrossRef PubMed Web of Science
[17].↵
Stephen Eubank, Hasan Guclu, VS Anil Kumar, Madhav V Marathe, Aravind Srinivasan, Zoltan Toroczkai, and Nan Wang. Modelling disease outbreaks in realistic urban social networks. Nature, 429(6988):180–184, 2004.
OpenUrl CrossRef PubMed Web of Science
[18].↵
Dietrich Stauffer and Ammon Aharony. Introduction to percolation theory: revised second edition. CRC press, 2014.
[19].
H Eugene Stanley. Phase transitions and critical phenomena. Clarendon Press, Oxford, 1971.
[20].
Sergey V Buldyrev, Roni Parshani, Gerald Paul, H Eugene Stanley, and Shlomo Havlin. Catastrophic cascade of failures in interdependent networks. Nature, 464(7291):1025–1028, 2010.
OpenUrl CrossRef PubMed Web of Science
[21].
Mark EJ Newman, Steven H Strogatz, and Duncan J Watts. Random graphs with arbitrary degree distributions and their applications. Physical review E, 64(2):026118, 2001.
OpenUrl CrossRef
[22].
Jianxi Gao, Sergey V Buldyrev, H Eugene Stanley, and Shlomo Havlin. Networks formed from interdependent networks. Nature Physics, 8(1):40–48, 2012.
OpenUrl
[23].
Gaogao Dong et al. Robustness of network of networks under targeted attack. Physical Review E, 87(5):052804, 2013.
OpenUrl
[24].
Louis M Shekhtman, Michael M Danziger, and Shlomo Havlin. Recent advances on failure and recovery in networks of networks. Chaos, Solitons & Fractals, 90:28–36, 2016.
OpenUrl
[25].
Armin Bunde and Shlomo Havlin. Fractals and disordered systems. Springer Science & Business Media, 2012.
[26].
Gaogao Dong et al. Resilience of networks with community structure behaves as if under an external field. Proceedings of the National Academy of Sciences, 115(27):6911–6915, 2018.
OpenUrl Abstract/FREE Full Text
[27].
A Coniglio, Chiara Rosanna Nappi, Fulvio Peruggi, and L Russo. Percolation points and critical point in the ising model. Journal of Physics A: Mathematical and General, 10(2):205, 1977.
OpenUrl
[28].↵
Flaviano Morone and Hernan A Makse. Influence maximization in complex networks through optimal percolation. Nature, 524(7563):65–68, 2015.
OpenUrl CrossRef PubMed
[29].↵
Réka Albert, Hawoong Jeong, and Albert-László Barabási. Error and attack tolerance of complex networks. Nature, 406(6794):378, 2000.
OpenUrl CrossRef PubMed Web of Science
[30].↵
Reuven Cohen, Keren Erez, Daniel Ben-Avraham, and Shlomo Havlin. Resilience of the internet to random breakdowns. Physical Review Letters, 85(21):4626, 2000.
OpenUrl CrossRef PubMed Web of Science
[31].↵
Duncan S Callaway, Mark EJ Newman, Steven H Strogatz, and Duncan J Watts. Network robustness and fragility: Percolation on random graphs. Physical Review Letters, 85(25):5468, 2000.
OpenUrl CrossRef PubMed Web of Science
[32].↵
Reuven Cohen et al. Breakdown of the internet under intentional attack. Physical Review Letters, 86(16):3682, 2001.
OpenUrl CrossRef PubMed Web of Science
[33].↵
Gang Yan, Georgios Tsekenis, Baruch Barzel, Jean-Jacques Slotine, Yang-Yu Liu, and Albert-László Barabási. Spectrum of controlling and observing complex networks. Nature Physics, 11(9):779, 2015.
OpenUrl
[34].
Yang Yang, Jianhui Wang, and Adilson E. Motter. Network observability transitions. Phys. Rev. Lett., 109:258701, Dec 2012.
OpenUrl
[35].↵
Yang-Yu Liu, Jean-Jacques Slotine, and Albert-László Barabási. Observability of complex systems. Proceedings of the National Academy of Sciences, 110(7):2460–2465, 2013.
OpenUrl Abstract/FREE Full Text
[36].↵
Paul G Hoel et al. Introduction to mathematical statistics. Inc. New York, (2nd Ed), 1954.