## Abstract

**Background** COVID-19 was first identified in Wuhan, the capital city of the province of Hubei in China. Due to the presentation of multiple symptoms at the same time, it is clinically important to understand the probability of dying from COVID-19 vs. the probability of dying from other causes.

**Methods** Using data collected in Hubei that identified by age those who died of COVID-19 or its sequelae among the infected, we constructed a life table showing the conditional probability of dying at age x from COVID-19 and its sequela among those infected. Following the relative survival perspective, we also computed corresponding data for China that matched the format of the life table we constructed from the Hubei study. We then formed ratios of the 10-year conditional portability of dying at age x from COVID-19 for the Hubei COVID-19 victims to the ten-year conditional probability of dying at age x from all non-COVID-19 causes for those not infected by COVID-19 in China as a whole.

**Findings** At every age, the conditional probability of dying from COVID-19 among those infected in Hubei is higher than the conditional probability of dying from all non-COVID-19 causes for China as a whole. Following a general age-related mortality pattern, the conditional probability of dying from COVID-19 from age 20 onward increases monotonically for those who are infected. Relative to the probability of dying in China from all other causes for those not infected, however, it declines monotonically from age 20 to age 70.

**Interpretation** At younger ages the relative conditional probability of dying from CVOD-19 among the infected is substantially higher than it is for those infected who dying of all other causes and while staying higher at all ages, it declines monotonically with age. The monotonic decline in the ratio from age 20 to age 70 is a result of the age-related increase in the probability of dying from one or more of a number of competing causes, which, in the case at hand is manifested in the fact that non-COVID-19 deaths in China among the uninfected were generally increasing at a faster age-related rate than were the COVID-19 deaths to the infected in Hubei.

## Introduction

As of January 19^{th}, 2022 the total worldwide COVID-19 related deaths from the beginning of the pandemic in 2019 are around 5.55 million and total infections are around 334 million as of January 19^{th}, 2022 (Ritchie et al., 2022). Some researchers have measured the impact of these excess deaths on population life expectancy, while others argued that life expectancy is reduced due to COVID-19 deaths (Trias-Llimós and Bilal 2020, Andrasfay and Goldman 2021, Castro et al., 2021, Islam et al., 2021; Venkataramani, O’Brien and Tsai 2021; Woolf, 2021). In general, the approach considered in these studies is one of the commonly accepted techniques using period life tables (Chan, Cheng, and Martin 2021, Yusuf, Martins, and Swanson 2014). However, the middle and older-aged excess deaths during 2020-21 influence the computation of life expectancy of newborn babies of 2020-21 in all of them. In essence, these studies assume that the current mortality rates among adults will have the same pattern when a current newborn reaches adulthood, and, accordingly, survival probabilities were computed. However, a newborn in 2020 need not be influenced by COVID-19 excess adult deaths during 2020-21 because these deaths may not occur every year for the next 20-30 years (Rao, Swanson, Krantz, 2022). This problem can be avoided in several ways. One is to examine Person-years of Life Lost (Andersen, Canudas-Romo and Keiding, 2013; Pifarré i Arolas et al. 2021, Swanson et al. 2022), another is to compute the loss of expected remaining life (Goldstein and Lee 2020, Heuveline 2021, Rao, Swanson, Krantz (2022)). A third approach is to compare the probability of dying by age of those who have contracted COVID-19 to those who have not been subject to the COVID-10 pandemic, a “relative survival” technique (Stroup et al., 2014). In this paper we take this latter approach in the form of a case study of COVID-19 victims in Hubei, China.

## Background

Hubei is a province in China of 72,400 square miles and a population in excess of 57 million (Britannica, no date). COVID-19 symptoms such as viral sore throat, cough, pneumonia, fever, myalgia and fatigue were first clinically seen among hospitalized patients in its capital, Wuhan, during December 2019 (Chen et al., 2020). The first clinical studies were conducted at hospitals in Wuhan after ruling out common bacterial and viral pathogens that cause pneumonia (Huang et al. (2020). Clinical information such as chest images, complete blood count, sputum tests, coagulation profile, and liver tests were conducted. Due to multiple symptoms presenting at the same time, understanding the relative survival of diagnosed COVID-19 patients became clinically important (Stroup et al., 2014). In January 2020, the SARS-CoV-2 virus was officially confirmed and quarantine zones were implemented across the province, especially in the city of Wuhan, which was identified as the epicenter of outbreak (Verity et al. 2020). Given this history, it is not surprising that one of the first collections of death data by age for COVID-19 victims was collected in Hubei Province, a subject to which we now turn.

## Data and Methods

In this study we employ the COVID-19 death data by age for Hubei Province taken from Table 1 in Verity et al. (2020). The study population was confirmed to be COVID-19 positive and the deaths are directly due to COVID-19 and its immediate sequelae. The death and population data were collected such that each subject had less than one year of exposure. No adjustment was made for this so the assumption is that each subject had one year of exposure. The data are subject to right censoring and other issues. For details see Verity et al. (2020). The data are presented in a manner consistent with a clinical study, which along with the sample size (n= 44,672, with 1,023 recorded deaths by age) makes it feasible to construct a life table (Swanson, 2021; Swanson, Chow, and Bryan, 2020), a rarity so far among COVID-19 studies.

From these data, the life table shown as Table 1 was constructed using Fergany’s Method (Fergany 1971). As can be seen in Table 1. Fergany’s (1971) method is advantageous because only the age-specific death rates are needed to construct an abridged life table. “In addition to its simplicity, it is, in contrast to other methods, self-contained in the sense that beyond making only the assumption of approximating the force of mortality by a step function (which is all we observe anyway) no further assumptions, approximations, or parameter estimates are required to compute all the life table functions.” (Fergany 1971: 334).

From a 2016 abridged life table for China (Appendix Table 1) that was by gender (National Bureau of Statistics of China, no date), we computed the weighted number of survivors at age x (l_{x}) for both genders combined (Kintner, 2004: 318-319) using the 2020 population age distribution by gender taken from the National Bureau of Statistics of China as provided by Statistica (no date). We then found the l_{x} values at 20, 30,…70, and 80 for both genders combined (also known as “all sexes”) and computed the conditional probabilities of dying (_{10}q_{x}) in the next ten year period given that one had lived to age x. These are shown in Table 1. Because everybody who reached age 80 dies in the subsequent “open age” interval, the probability of dying is set to 1.00 at age 80.

The appendix shows the mathematics underlying the conditional probability of dying and surviving from one age to the next.

## Results

Starting at age 20, Table 2 shows the _{10}q_{x} values for the Hubei COVID-19 victims in 2020 and the _{10}q_{x} values derived from the 2016 life table for China. Figure 1 displays these same results in graphic form. Figure 3 shows the ratio of these same _{10}q_{x} values in graphic form.

At age 20, the probability of a person with Covid-19 dying from it (and no other causes) in Hubei before reaching age 30 is 4.27 times higher than the probability of a person aged 20 in China dying before reaching age 30 of any other cause other than Covid-19, which is 0.00449; at age 30, the probability is 3.35 times higher, age 40 it is 2.87 times higher, age 50, 2.80 times higher; and at age 60 the probability of a person with Covid-19 dying of it before reaching age 70 in Hubei is 1.63 times higher than the probability of a person age 70 in China dying of any cause other than Covid-19 before reaching age 80, which is 0.33607.

Given that the life table for China as a whole applies to Hubei Province, Table 2 shows the increased mortality risk generated by COVID-19 on Hubei’s population. Looking in the other direction, Table 2 shows the increased mortality risk generated by COVID-19 on China’s population given that the Hubei COVID-19 life table applies to China as a whole.

## Discussion

Table 2 shows that those aged 20 in Hubei Province who contracted COVID-19 were at a higher probability of dying from it than were non-infected people in China aged 20 dying from all other non-COVID-19 causes. This probability increases monotonically with age, but declines relative to the probability for those in China not exposed to COVID-19. As such, beyond age 50 it appears to be more deadly for them than was smallpox in the 18^{th} and 19^{th} centuries when Bernoulli presented a strong case that among those who are infected, about 12 percent of them will die in any given age group (Gehrman, 2014). Another way to look at the risk of dying is to construct a survival model. Using the data in Table 1 for the population aged 20 and over, we constructed a Gompertz model from the deaths found in column 7 using the survival analysis procedure found in the NCSS software package, release 12 (NCSS 2020). Exhibit 1 provides the details of the results.

(EXHIBIT 1 ABOUT HERE)

With a coefficient of determination equal to 0.95 and a slope parameter of 0.1273, we are presented with a well-fit model in which the “actuarial ageing rate” is very high - the rate of dying increases very rapidly with age (Kirkwood, 2014: 1). These comparisons and the ratios found in the last column of Table 2 are telling: Once infected, there is a high probability of dying that increases exponentially with age.

The Hubei data are not a random sample, which usually would mean that classical inferential statistics such as confidence intervals and margins of error are not appropriate. However, we can take the “superpopulation” perspective (Deming, 1953; Godambe and Thompson, 1986. Swanson and Beck, 1994; Swanson and Tayman, 2012: 171-173; Swanson, Tayman, and Beck 1995), which provides an alternative framework for inference, an approach often found in wildlife studies, where scientific sampling is extremely difficult (e.g., Durkin and Cohen, 2019; Wen et al., 2011). While not perfect, this perspective does provide some idea of the uncertainty affecting the Hubei _{10}q_{x} values. With the superpopulation perspective in mind, we use an approach developed by Chiang (1960) to generate uncertainty measures. Starting with the unbiased estimate of probability of death q^ = D/ N, where the number of deaths is denoted by D within a given age interval and N is the number at the beginning of the interval, Chiang (196)) showed that the standard deviation can be estimated as sqrt(q^*(1- q^))*N, which leads to the standard error (se) of death in the interval in question of sqrt(q^*(1- q^)/N). Translated into the terms we use in Table 1 and elsewhere in this paper, we have se = sqrt(_{10}q_{x}*(1-_{10}q_{x})/k), where k = number alive at age x. Table 3 provides the standard errors and 95% confidence intervals for the _{10}q_{x} values of the Hubei COVID-19 victims (Table 1). For age zero, there are no reported deaths so a confidence interval is a moot point as is the case for age 80 and over where the probability of dying is 1.00. For the other age groups, the estimated standard errors range from 0.00173 for age group 30-39 to 0.00795 for age group 70-79. Even where the standard error is the highest, the 95% confidence interval for age group 70-79 is relatively narrow, from 0.530 to 0.565. If nothing else, these results show the impact that a large sample can have in the reduction of statistical uncertainty.

## Concluding Remarks

An immediate question is how well the Hubei _{10}q_{x} values apply elsewhere. As a starting point, we note that Verity et al. (2020: 669) found that the case fatality ratios they estimated by age from the Hubei data were consistent with those international cases stratified by age. This consistency suggests that the COVID-19 life table (Table 1) may apply beyond Hubei Province. Another consistency is that the conditional probability of dying from COVID-19 from age 20 onward increases monotonically for those who are infected, which follows a general age-related mortality pattern (Kintner, 2004; Yusuf, Martins, and Swanson, 2014). Relative to the probability of dying in China from all other causes for those not infected, however, it declines with age, as is seen in the last column of Table 2. This decline is a result of the age-related increase in the probability of dying from one or more of a number of competing causes, which, in the case at hand is manifested in the fact that non-COVID-19 deaths in China among the uninfected were generally increasing at a faster age-related rate than were the COVID-19 deaths to the infected in Hubei.

If one is interested in comparing Hubei to a country other than China, one should start by comparing the _{10}q_{x} values for China shown in Table 2 with those for the country in question. If they have similar _{10}q_{x} values, the overall pattern will hold in that there will be a decline in the ratio in going from 20 to age 70 (for the reasons just stated), but it may not be as steep as the decline in the ratio seen for China, which would be the case with a country that has higher _{10}q_{x} values than China at the younger ages and lower ones at the older ages. The pattern also may not be monotonic, which would be the case for a country that had the same _{10}q_{x} values as China at every age except 50, where it was substantially lower, which would result in a “bump” in the trend at age 50.

As noted by Verity et al. (2020) and Tian et al, (202) initial analysis of the survival data of Wuhan needs revision after the epidemic unfolds. In the meantime, we trust that the results presented here will assist hospital and clinical settings in long-term and current management.

## Data Availability

All found in the paper or at publicly available online sources

## Acknowledgments

None

## Appendix

Let *l (x, t)* be the infected individual at age *x* and at time *t*. The probability of survival of infected who will live for *k* time units since acquiring the infection at *t* say, _{k}_{p(x, t)}, is computed as
and probability of survival for all infected at age *x* to survival until *k* is
The probability of survival for all infected during (*t, t* + *k*) will be
Then, the probability of dying of an individual who is at age *x* to die within (*t, t* + *k*) will be
The decadal probability of dying from (4) would be

## Footnotes

Email: d-poston{at}tamu.edu

Email: sk{at}math.wustl.edu

Email: arni.rao2020{at}gmail.com (or) arrao{at}augusta.edu

**Declarations:**Availability of data and materials: All found in the paper or at publicly available online sources cited in the paper

Competing interests: None

Funding: U.S. National Science Foundation for funding through the project DMS-2140493.