Abstract
Since the first governmental recognitions of the pandemic character of SARS-Cov-2 infections, public health agencies have warned about the dangers of the virus to persons with a variety of underlying physical conditions, many of which are more common in persons older than 50 years. To investigate the statistical, rather than physiological, basis of such warnings, this study examines nation-by-nation correlations between covid-19 fatality statistics for the ninety-nine countries with the greatest number of SARS-Cov-2 infections and statistics on potential co-morbidities that may influence the severity of the infections. It examines reasons that may underlie the degree to which advanced age increases the risk of mortality from an infection and contrasts the risk factors of SARS-Cov-2 infections with those of influenzas and their associated pneumonias.
1. Introduction and Context
The SARS-Cov-2 virus spread around the world in only a few months and in late winter 2020 was designated a pandemic by the World Health Organization. The virus causes covid-19, an infection that can manifest in mild flu-like symptoms or, far more seriously, as a severe respiratory disease with pneumonia.
Since the outset of the covid-19 pandemic, the public has been treated to numerous speculations concerning the degree to which age or various underlying morbidities may amplify the severity of infection by SARS-Cov-2. Warnings have come from authoritative sources such as the U.S. Centers for Disease Control (CDC) [1]. Conditions cited by the CDC as increasing risk include cancer, chronic kidney disease, obesity, coronary disease, Type 2 diabetes mellitus, and sickle cell disease. The CDC also warns that asthma, hypertension, and liver disease, among others, might subject a person to increased risk. One notes that sickle cell disease is most commonly found among persons whose ancestors come from Africa and Mediterranean countries where malaria is a prevalent affliction.
As many of the diseases cited by the CDC are more common in persons in late middle age and older, a warning common early in the course of the pandemic was that SARS-Cov-2 presented a particular danger to persons over 50 years old. Indeed, very early, as in the initial wave of cases in China [2] and the strong wave of cases in Italy, the probability of death due to covid-19 was judged to be a strong function of a patient’s age, being only a few percent for those under 50 and rising to nearly 20% for patients over 80. Certainly the large number of fatalities [3] in care homes seen in New York, the United Kingdom and elsewhere has fueled speculations about the potential of co-morbidities frequently seen in the elderly to make contracting covid-19 fatal.
Why is covid-19 more dangerous to the elderly than to younger persons? Answering this question is complicated by the fact that the actual mortality rate of covid-19 is highly uncertain, as the prevalence of asymptomatic infections has been estimated to be 5 to 10 times that of infections with definitive symptoms. An exemplary source of testing-based data was provided by the passengers aboard the Princess Line cruise ship Diamond Princess, on which half of the passengers who tested positive for covid-19 were asymptomatic [4]. To some degree that uncertainty might explain the very wide range of reported (or apparent) covid-19 mortality rates among countries, from < 0.03% (Singapore) to almost 30% (Yemen).
For a less anecdotal (and less speculative) assessment of risk factors for serious consequences of covid-19, a data-driven examination of national statistics seems to be in order with the goal of identifying strong correlations of mortality due to covid-19 with other potential co-morbidities. This manuscript presents a set of calculations of such correlations.
2. Methods of Analysis
From the outset one must keep in mind that the following analysis is not based on clinical or physiological considerations but on national epidemiological statistics. Unless otherwise indicated, the following assumptions underlie the subsequent calculations:
The apparent mortality outcomes are a viable proxy for actual rates of infection, death, and correlation with co-morbidities; we define
The apparent mortality and case number data used in the following analysis are accurate as of 30 August 2020.
The sample of 99 countries across all continents is representative of potential correlations between covid-19 mortality and potential co-morbidities. The number of covid-19 cases in the remaining countries is not statistically significant. However, outliers with relatively small statistical significance can skew calculated correlations.
Linear correlations are examined on the basis of averaged national data. The sources that describe the prevalence of disease are the World Health Organization (footnote 1), Worldometer (footnote 2), and, for economic data, the World Bank as reported by Trading Economics (footnote 3). This analysis assumes that the WHO data concerning the fatalities ascribed to diseases in a given country constitute valid proxies for the prevalence of those maladies in national populations. In the case of obesity, the reported number is the percentage of the population with a BMI exceeding a WHO-established standard for a person of that sex.
The study examined the following factors:
Demographics: population, national median age;
SARS-Cov-2: number of covid-19 tests, confirmed cases of covid-19 as reported by government authorities, apparent mortality;
Other medical factors: incidence of flu, lung disease, asthma, obesity, heart disease, hypertension, diabetes, and malnutrition;
Economics: GDP-PPP, average household size, % population living in slums, health expenditures per capita, universal health coverage;
One random variable in the range from 0 to 100.
Examination of the data begins with a search for linear correlations between variables. Linear relationships between data sets are evaluated herein with the Pearson “product moment correlation” coefficient, r.
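For reference, the standard sample form of the Pearson coefficient for n paired national values can be written as follows. The symbols x_i, y_i, their means \bar{x} and \bar{y}, and n are notation assumed here for illustration.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The values of r quoted in this study are expressed as percentages, so that r = 0.625 appears as 62.5%.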
One may estimate the statistical significance of calculated correlations by computing r for two variables that are uncorrelated by construction; i.e., apparent covid-19 mortality and a random variable in the range from 1 to 100. Once linear correlations have been examined, the next step is evaluating cross-correlations among variables and performing a multivariate analysis.
The 99 countries sampled in this study were selected as having the largest number of reported covid-19 infections. The countries listed in Table 1 represent five regions, Americas, Asia, Europe, Africa, and Middle East plus Central Asia. Their combined population of nearly 5.5 billion accounts for the strong preponderance of all cases reported worldwide.
Figure 1 plots the two variables that are uncorrelated by construction. The calculated value of the Pearson coefficient for this set of 99 values is −3.4%.
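A single draw of the random variable gives only one sample of the noise level. As a minimal sketch of how that baseline might be estimated more fully, the following Python repeats the comparison against many fresh random variables and reports the spread of r. The mortality values are synthetic placeholders rather than the study's data, and the helper names `pearson` and `null_spread` are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def null_spread(values, trials=2000, seed=0):
    """Standard deviation of r between `values` and fresh uniform random draws."""
    rng = random.Random(seed)
    rs = []
    for _ in range(trials):
        noise = [rng.uniform(0, 100) for _ in values]
        rs.append(pearson(values, noise))
    mean_r = sum(rs) / trials
    return math.sqrt(sum((r - mean_r) ** 2 for r in rs) / (trials - 1))

# Synthetic placeholder mortality values for 99 countries (NOT the study's data).
rng = random.Random(1)
mortality = [rng.uniform(0.0, 10.0) for _ in range(99)]
spread = null_spread(mortality)
```

For n = 99 the spread of r between uncorrelated variables is close to 1/sqrt(n − 1), or about 10%. A single value of a few percent therefore lies well inside the noise.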
A possible limitation of this approach is that all mortality data are given equal weight in the calculation of correlation. One check of whether this Ansatz introduces a bias is computing the correlation between apparent national mortality rates and national populations. Doing so yields a value of 0.56%. Another possible way to attribute a weighting that is not arbitrary is to plot the variation of covid-19 deaths per capita against the possible risk factor. However, the number of covid-19 deaths per capita depends strongly on national public health policies, on national efforts to prevent spread of the SARS-Cov-2 virus, on GDP and on other considerations that are non-medical. The differences between Norway and Sweden are a case in point.
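A further way to gauge sensitivity to the equal-weight assumption, not used in this study and offered only as an illustrative alternative, is to recompute a correlation with each country weighted by its population. The sketch below assumes non-negative weights; the function name `weighted_pearson` and all numbers are hypothetical.

```python
import math

def weighted_pearson(x, y, w):
    """Pearson correlation with non-negative weights w (e.g. national population)."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)

# Illustrative values only (not taken from the study's tables).
median_age = [20.0, 30.0, 40.0, 45.0]
mortality = [1.0, 3.0, 2.0, 4.0]
population = [200.0, 50.0, 80.0, 10.0]

# Equal weights reproduce the ordinary Pearson coefficient.
r_equal = weighted_pearson(median_age, mortality, [1.0] * 4)
r_weighted = weighted_pearson(median_age, mortality, population)
```

A large difference between `r_equal` and `r_weighted` would suggest that a few populous or sparsely populated countries dominate the result.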
3. Examination of linear correlations
To gain confidence in this statistical approach one can plot two variables for which one may expect to see a correlation (Figure 2), here national median age and national wealth. The value is quite high, 62.5%. Examining Fig. 2 more closely suggests a limitation of the method. The countries circled in red show a strong correlation, while those in the green circle show scarcely any correlation of a nation’s wealth with the age of its population. A refinement of the statistical approach is needed. Identifying the data underlying each point with each country’s region reveals that median age and national wealth are essentially uncorrelated for European nations but strongly correlated for countries in Africa and Asia. Regional grouping was thus adopted throughout this study.
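The regional refinement can be sketched as computing r separately for each region alongside the pooled value. The rows below are synthetic and chosen only to mimic the qualitative pattern described (strong within one region, weak within another); the function name `correlation_by_region` is hypothetical.

```python
import math
from collections import defaultdict

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_by_region(rows, min_points=3):
    """rows: iterable of (region, x, y). Returns {region: r} plus a pooled 'All' entry."""
    groups = defaultdict(list)
    for region, x, y in rows:
        groups[region].append((x, y))
        groups["All"].append((x, y))
    return {
        name: pearson([p[0] for p in pts], [p[1] for p in pts])
        for name, pts in groups.items()
        if len(pts) >= min_points  # skip regions with too few countries
    }

# Synthetic rows: (region, GDP-PPP per capita in k$, median age). Not the study's data.
rows = [
    ("Africa", 2.0, 18.0), ("Africa", 5.0, 21.0),
    ("Africa", 9.0, 25.0), ("Africa", 13.0, 28.0),
    ("Europe", 30.0, 42.0), ("Europe", 40.0, 41.0),
    ("Europe", 50.0, 43.0), ("Europe", 60.0, 41.5),
]
by_region = correlation_by_region(rows)
```

In this toy sample the pooled coefficient is high even though one region shows almost no internal correlation, which is the effect the regional grouping is meant to expose.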
An illustration of the utility of this refinement (Figure 4) is the relation between deaths due to malnutrition per 100k of population and national wealth. One may expect, and one does see, a relatively strong (−45.5%) correlation driven by the high rates of malnutrition in Africa, Central America, and the poorer countries of Asia. Yet no such effect is apparent in European countries.
From the outset of the pandemic, national health authorities have warned the public about the increased risk of mortality for persons 60 years old and older. One can see in Figure 5 an example of the basis for such warnings in the data provided by the UK Office of National Statistics in September 2020 [5]. Again one asks: why should the seriousness of covid-19 be a function of age?
From such data, one might expect a very strong correlation between the national apparent mortality rate and the median age of a country’s population. Even accepting the hypothesis of universality for the data of Figure 5, one should first multiply these rates by the distribution of a nation’s population normalized to the U.K. population grouped in the same age bins. That plot (Figure 6a) shows a surprising result. The low correlation, 12.9%, is even less pronounced when one examines the data region by region.
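One reading of this normalization step is a population-weighted average of age-specific mortality rates. The sketch below assumes that reading; the age bins, rates and population shares are hypothetical and are not the ONS figures.

```python
def expected_mortality(age_rates, population_shares):
    """Population-weighted mean of age-specific mortality rates.

    age_rates:         {age_bin: mortality rate in that bin}
    population_shares: {age_bin: fraction of the national population in that bin}
    """
    if set(age_rates) != set(population_shares):
        raise ValueError("age bins must match")
    total = sum(population_shares.values())
    return sum(age_rates[b] * population_shares[b] for b in age_rates) / total

# Hypothetical bins, rates and shares chosen for illustration only.
rates = {"0-49": 0.001, "50-64": 0.01, "65+": 0.10}
young_country = {"0-49": 0.85, "50-64": 0.10, "65+": 0.05}
old_country = {"0-49": 0.60, "50-64": 0.20, "65+": 0.20}

m_young = expected_mortality(rates, young_country)
m_old = expected_mortality(rates, old_country)
```

If age-specific rates were universal, the older population in this toy case would show roughly three times the overall mortality of the younger one. That is why a strong correlation with median age would be expected.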
Instead of plotting mortality due to covid-19 versus national median age, one might have examined the dependence on the percentage of the population of age 65 or greater (Figure 6b). That correlation is somewhat larger (22.0%), consistent with reference [5], but is also the result of strong regional variations.
The national rate of confirmed cases of covid-19 with respect to the percentage of population older than 65 (Figure 7) displays a negative correlation (−14.8%) that is driven primarily by data from the Middle East. If one removes those countries with large numbers of young, foreign workers, that correlation drops to −0.03%.
One may hypothesize that the “care home effect,” i.e., the large numbers of deaths seen in nursing homes in Italy, the U.K. and N.Y., was more the result of poor hygiene practices than of an extreme dependence of the lethality of covid-19 infections on specific underlying disorders. That hypothesis will be investigated in the following analysis.
The linear correlations of age with various potential causal factors shown in Figure 8 suggest candidates to examine to explain the “care home effect.”
In addition to particular underlying factors, the “care home effect” also reflects a generally very weakened physical condition of many occupants of care homes that would render any pneumonia-inducing disease lethal. Finally, the data of Figure 8 show no evidence that age alone influences the probability of a person becoming infected by the SARS-Cov-2 virus.
As suggested by the results in Figure 8, an example (Figure 9) illustrates the utility of the statistical approach used herein. In contrast with infections due to SARS-Cov-2, the incidence of death from influenza-induced pneumonia is highly correlated (−65.2%) with the median age of the population. The correlation also displays a strong regional dependence.
As covid-19 often presents as a severe respiratory disease, and prompted by the results in Figure 9, one asks whether the seriousness of covid-19 infections is correlated with the incidence of asthma. As Figure 10 indicates, asthma neither increases the likelihood of SARS-Cov-2 infection nor seems to affect the seriousness of the disease in an infected patient.
In the case of asthma, the contrast with influenza-related pneumonia (Figure 11) is striking. The overall correlation of 46.2% is seen in all regions. Referring to covid-19 as a “flu-like” infection is definitely misleading.
Another early warning of the U.S. Centers for Disease Control was that obesity could represent a serious underlying factor that would lead to serious consequences of covid-19. However, once again the actual national data (Figure 12) display essentially no (1.9%) correlation.
The contribution of obesity to the outcome of other pulmonary disorders is significantly different as is displayed in Figure 13. Curiously, obesity has a significant correlation (nearly 40%) with the risk of contracting infection from SARS-Cov-2, although not with the apparent outcome of the infection. The observation of increased risk of infection (although not its outcome) has been previously reported in [6]. Unlike this study, reference [6] reports increased risk of infection (32.9%) for people with chronic kidney disease.
One might speculate that as a chronic respiratory disorder involving the airways in the lungs, asthma may increase the seriousness of consequences of covid-19 and its induced pneumonias, but Figure 14 shows no such significant correlation (2.6%). Examining the correlation of covid-19 mortality with other lung diseases (Figure 15) also shows minimal if any correlation (2.5%). In contrast, the relationship of influenza-induced pneumonias with asthma and other lung diseases presents a correlation that is quite high, 61.0% and 34.8% respectively. With respect to their effects on patients with underlying conditions, influenza and covid-19 are very different diseases.
An early warning to persons with underlying conditions concerned diabetes mellitus. That suspicion is echoed by the strong dependence on age shown in Figure 8. Whether one measures the incidence of diabetes by deaths due to diabetes or by the reported national rates of diabetes in adults (20 to 79 years of age), the correlation with covid-19 mortality is similarly low (∼11%). In otherwise healthy persons, diabetes does not appear to be a significant risk factor with respect to the seriousness of infection by SARS-Cov-2. Figure 16 and Table 2 summarize the linear correlations of apparent covid-19 mortality with several underlying medical and economic conditions considered herein.
As the conditions shown in Figure 8 to be most strongly correlated with age correlate at best weakly with covid-19 mortality, one may surmise that poor health care management played a very large role in the “care home effect.”
4. Cross-correlations and multivariate analysis
Before investigating cross-correlations as a way of searching for root causes, one might want to perform a multivariate analysis of covid-19 mortality against a trio of risk factors commonly found together in patients in nursing and convalescent homes, namely Diabetes mellitus, Hypertension, and Coronary disease (DHC). For that trio the multiple correlation coefficient is 17.1%, not negligible but unlikely to be the root cause of the “care home effect.” A similar computation of the correlation of DHC with deaths due to influenza and its associated pneumonia yields a stronger correlation of 35.9%. Replacing hypertension with asthma in the DHC trio reduces the coefficient of multivariate correlation for covid-19 mortality to 12.1%. The analogous calculation for influenza increases the multiple correlation coefficient to 62.7%. This multivariate analysis shows once again that influenza and covid-19 are very different diseases. The results are summarized in Table 3.
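The text does not state how the multiple correlation coefficient was computed. A minimal sketch using the standard definition, R = sqrt(c^T Rxx^{-1} c), is given below, where c holds the correlations of the outcome with each predictor and Rxx is the correlation matrix of the predictors. The data are synthetic and the helper names `solve` and `multiple_correlation` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def multiple_correlation(y, predictors):
    """Multiple correlation coefficient R of y with a list of predictor columns."""
    c = [pearson(p, y) for p in predictors]
    rxx = [[pearson(p, q) for q in predictors] for p in predictors]
    z = solve(rxx, c)
    r_squared = sum(ci * zi for ci, zi in zip(c, z))
    return math.sqrt(max(r_squared, 0.0))

# Synthetic national values (NOT the study's data) for the DHC trio and an outcome.
diabetes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
hypertension = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
coronary = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
mortality = [4.0, 5.0, 10.0, 9.5, 15.0, 17.0]

r_dhc = multiple_correlation(mortality, [diabetes, hypertension, coronary])
```

By construction R is never smaller than the magnitude of any single-variable correlation, so a low multivariate value such as that quoted for covid-19 implies that none of the three factors is individually strong.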
Other calculations of multivariate correlations with the apparent national mortality rates of covid-19 are presented in Table 4.
4.1. Cross-correlations
As the previous section argues and as Figure 17 illustrates, the contrast with covid-19 in the correlations of influenza/pneumonia with other potential underlying conditions is striking.
An examination of cross-correlations (Figure 18) is best displayed in plots ordered in strength of influence of given conditions considered for their possible correlation with the outcome of SARS-Cov-2 infections. In order of presentation these are national median age (Figure 8), obesity (Figure 13), asthma (Figure 18), and diabetes mellitus (Figure 22).
Although obesity appears correlated with SARS-Cov-2 contagion, it appears uncorrelated with the outcome of covid-19 infections. Figure 13 shows that obesity is, by contrast, correlated with influenza, malnutrition and asthma, although in those three cases the coefficient is negative. Understanding the correlations of obesity calls for a deeper look at the relationship of obesity with the conditions that show the most influence. Already in the case of contagion, regional differences account for a substantial fraction of the apparent effect (Figure 19).
These regional differences could be due to factors such as national median age (Figure 20) or may be influenced by national wealth reckoned in terms of per capita GDP-PPP (Figure 23).
Figures 19 through 21 do not resolve the means by which obesity may influence contagion of covid-19, but they do illustrate why cross-correlations are important to examine.
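A simple way to survey cross-correlations is to compute r for every pair of variables and flag the pairs that exceed a chosen threshold. The sketch below uses synthetic columns; the variable names mirror those of the study, but the values, the threshold and the helper names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """columns: {name: list of national values}. Returns {(a, b): r} for all pairs."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

def strongest_pairs(matrix, threshold):
    """Distinct variable pairs with |r| at or above the threshold, strongest first."""
    pairs = [(a, b, r) for (a, b), r in matrix.items()
             if a < b and abs(r) >= threshold]
    return sorted(pairs, key=lambda t: -abs(t[2]))

# Synthetic columns (NOT the study's data).
data = {
    "median_age": [18.0, 22.0, 30.0, 38.0, 42.0, 45.0],
    "obesity": [5.0, 9.0, 20.0, 24.0, 22.0, 27.0],
    "mortality": [2.0, 3.5, 1.0, 4.0, 2.5, 3.0],
}
matrix = correlation_matrix(data)
flagged = strongest_pairs(matrix, threshold=0.5)
```

In this toy sample only the pair of median age and obesity is flagged. An apparent link between obesity and a third quantity would then need to be checked against age before being read as a direct effect.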
As is the case with asthma, diabetes mellitus shows (Figure 22) significant correlations with several medical and economic conditions such as age, household size and mortality due to influenza/pneumonia. But once again no correlation with covid-19 mortality is evident.
5. Factors related to national economics and public health policies
The differences in the magnitude, outcomes, and characteristics of waves of infections among sub-national regions with roughly equivalent medical factors indicate that economics and public health policies make a significant difference in the severity of SARS-Cov-2 infections. This section examines dependencies on GDP-PPP, average household size, percentage of population living in slums, percentage of urban population, health expenditures per capita, and the WHO Universal Health Coverage (UHC) index.
Figure 4 has already shown an example of economic impact on medical outcomes; the per capita GDP (after correction for purchasing power) has a strong influence (−44.6%) on the rate of deaths due to malnutrition. That observation is hardly surprising. One may ask the same question with respect to mortality due to covid-19 infections. The plot of Figure 5 shows essentially no correlation (1.4%) of covid-19 mortality with national wealth; the politics of poverty does not explain the observed national rates of covid-19 mortality.
The influence on contagion of SARS-Cov-2 over the entire data set is noticeable and positive (29.4%). As shown in Figure 24a, that value is entirely driven by a strong dependence of rising contagion on rising income in African countries. If one removes the African countries from the sample, the correlation disappears (2.8%).
One could hypothesize that the strange behavior in Africa in Figure 24a is due to the increase in urbanization with increasing national wealth. One might further suspect that the increase in urbanization is also likely to increase the fraction of the national population living in slums. In fact, taken together Figure 24b and Figure 25 show that both of those suppositions are consistent with the data.
The correlation of economic and policy factors with contagion (measured in confirmed covid-19 cases per 1M of population) and with apparent covid-19 mortality is presented in Table 5. As the mortality rate varies in time and seems to decline as the pandemic progresses (at least in the Northern Hemisphere), the mortality rate has been benchmarked as of August 30, 2020.
The surprising negative correlation in contagion with the percentage of the urban population living in slums is due to the trend in Africa that the smaller the fraction of a nation’s population living in cities, the more likely it is that they live in slums. That characteristic is displayed in Figure 25. The correlation level with respect to GDP is explained by the correlation of GDP-PPP with percentage of population over the age of 65. The substantial correlation of contagion with testing is the result of the obvious fact that the more one looks, the more one sees. The correlation of contagion with percentage of urban population is due to the cross-correlation of GDP with percentage of urban population (64.8%) and the high cross-correlation of urban population with testing for covid-19 (49.7%). The values for average health care expenditures and the UHC index of the WHO are similarly explained. The data that underlie the value of mortality versus the percentage of the urban population that live in slums appear in Figure 26.
6. Summary and conclusions
This statistical study covering countries with ∼70% of the world’s population confirms the early clinical observation that infection by the SARS-Cov-2 virus presents a great risk to persons over the age of 65. However, it does not support the suggestions presented by government agencies early in the pandemic that the risks are much greater for persons with certain common potential co-morbidities. Many of the deaths of elderly patients early in the course of the pandemic took place in circumstances that likely promoted rather than impeded the spread of the virus among persons who were generally in a poor state of health.
A commonly heard claim by persons who object to strict measures to prevent the spread of the SARS-Cov-2 virus has been that the resulting disease is similar to influenza and should be treated in the same manner as influenza as a matter of public policy. The comparison of the severity of medical outcomes of covid-19 with those caused by influenza strains and their resulting pneumonias displays dramatic differences. Promulgating the idea that covid-19 is a “flu-like disease” spreads gross misinformation to the detriment of public health worldwide.
One may ask what governmental actions can reduce the seriousness of infections by SARS-Cov-2. A comparison of the cases of Germany and Italy may be instructive in this regard. The two countries have similar numbers of confirmed cases of covid-19; yet the death rate in Italy is roughly triple that in Germany. Germany put in place an extensive number of triage and early treatment centers outside of hospitals; it also moved quickly to secure adequate amounts of personal protective equipment. Infected patients were identified early in the course of the disease and were treated in a manner that did not overwhelm the central intensive care facilities in hospitals, as happened in the Italian region of Lombardia.
Perhaps a similar lesson comes from comparing the experience in the United States in California and New York. The early lockdown in California more than doubled the duration of the first wave of infections as compared with New York, leading to 60% more cases in California yet half the death rate of New York, in which medical resources were badly stressed.
Presently authoritative data on a worldwide country-to-country basis are not available to evaluate the effectiveness of prevention and treatment modalities.
Data Availability
All data in this article are in the public domain.
Acknowledgements
The author acknowledges his colleagues in the World Federation of Scientists for their encouragement to continue, expand, and report this research. The author’s work is completely self-supported without any outside funding.
Footnotes
1 World Health Rankings, https://www.worldlifeexpectancy.com/world-health-rankings
2 Worldometer, https://www.worldometers.info/
3 Trading Economics, https://tradingeconomics.com/
4 The GDP-PPP has a several percent systematic uncertainty that depends on the economic model used in making the PPP corrections.