## Abstract

Several studies have found that black patients are more likely than white patients to test positive for or be hospitalized with COVID-19, but many of these same studies have found no difference in in-hospital mortality. These studies may have underestimated racial differences due to reliance on data from a single hospital system, as adequate control of patient-level characteristics requires aggregation of highly granular data from several institutions. Further, one factor thought to contribute to disparities in health outcomes by race is site of care. Several differences between black and white patient populations, such as access to care and referral patterns among clinicians, can lead to patients of different races largely attending different hospitals. We sought to develop a method that could study the potential association between attending hospital and racial disparity in mortality for COVID-19 patients without requiring patient-level data sharing among collaborating institutions. We propose a novel application of a distributed algorithm for generalized linear mixed modeling (GLMM) to perform counterfactual modeling and investigate the role of hospital in differences in COVID-19 mortality by race. Our counterfactual modeling approach uses simulation to randomly assign black patients to hospitals in the same distribution as those attended by white patients, quantifying the difference between observed mortality rates and simulated mortality risk following random hospital assignment. To illustrate our method, we perform a proof-of-concept analysis using data from four hospitals within the OneFlorida Clinical Research Consortium. Our approach can be used by investigators from several institutions to study the impact of admitting hospital on COVID-19 mortality, a critical step in addressing systemic racism in modern healthcare.

## Introduction

Throughout the COVID-19 pandemic, several studies have found that black patients are more likely than white patients to test positive for or be hospitalized with COVID-19 [1-5]. Many of these studies also found that there was no difference in in-hospital mortality for black and white patients after adjusting for patient-level sociodemographic and clinical characteristics [1-2, 4-5]. However, those studies may have underestimated racial differences because they relied on data from single hospital systems, which might have more homogeneous and less broadly representative populations. A much larger study reflecting data from approximately 1,200 US hospitals found residual racial differences [10]. That study was possible because of a single data source from a large insurance company, but many questions cannot be answered without data that are already aggregated and in a sufficiently granular level.

The challenges of aggregating highly granular data are critically relevant to investigations of racial disparities in health care because of the frequent need to control for patient-level characteristics. Those statistical adjustments often reveal mediating pathways because individual characteristics that seemingly confound associations by race might themselves be the products of past racial injustice. For example, making statistical adjustments for insurance type, area of residence, and comorbidities might seem to allow estimation of isolated effects of race, but if those factors are themselves the products of racial discrimination those adjustments can risk obscuring racial differences in health outcomes rather than identifying them.

Indeed, one factor thought to contribute to disparities in health outcomes by race is site of care [6]. Patients of different races tend to live in different areas and so their sources of care and referral patterns tend to differ. For example, a study examining hospital-level racial disparities in acute myocardial infarction treatment and outcomes found that black patients tend to receive care at different hospitals than white patients [7]. Another study found that black and white patients tend to be treated by different physicians, with physicians for black patients possibly less qualified and lacking access to critical clinical resources relative to physicians who tend to treat white patients [8]. These reported differences are likely associated with a number of differences in the black and white patient populations, including access to care, referral patterns among clinicians, and susceptibility to harmful social and environmental exposures. Repeatedly, racial differences in health outcomes have been attributed to differences in care site.

To investigate the potential association between admitting hospital and racial disparity in mortality for COVID-19 patients, a counterfactual modeling approach can be used [9]. If we assume that black and white patients each attend hospitals according to some underlying distribution, then using counterfactual modeling, we can simulate hospital assignments for black patients under the hypothetical scenario that they attend hospitals in the same distribution as white patients. Assuming the existence of hospital-level effects affecting in-hospital mortality, one may hypothesize that simulated mortality rate under this scenario would be lower than observed mortality rate for black patients if one assumes hospitals with predominately white patients have higher quality of care on average than hospitals with mostly black patients. This counterfactual modeling approach can aid in unveiling disparities in health outcomes associated with site of care, a crucial first step in addressing systemic racism in modern healthcare.

A recent study by Asch et al. (2020) explored the potential connection between admitting hospital and racial disparities in COVID-19 using counterfactual modeling, fitting a generalized linear mixed model (GLMM) to model log odds of mortality while adjusting for both common patient-level fixed effects as well as hospital-specific fixed and random effects [10]. Estimation of hospital-specific effects is the key component for counterfactual modeling, allowing for estimating patient-specific mortality risk as if the patient (counterfactually) attended a hospital different from the one they truly attended. While effective for this particular study, which featured a centralized data repository allowing direct access to all patient data, GLMM is not able to be used if data are not centralized. Patient-level data sharing among hospitals is often not possible due to regulations protecting patient privacy. If each participating institution can instead share only aggregate data, a method allowing for distributed estimation of GLMM is necessary.

In this work, we propose a novel application of a recently developed algorithm for performing GLMM estimation [11] to study hospital-associated racial disparity in COVID-19 mortality via counterfactual modeling. We refer to this approach as the distributed penalized quasi-likelihood (dPQL) algorithm. The dPQL is based on the penalized quasi-likelihood (PQL) algorithm, an iterative procedure for GLMM estimation [12-13] that has commonly been used to fit GLMM due to its ease of computation; it usually achieves convergence in several iterations. The dPQL algorithm is a distributed version of PQL, where in each iteration it requires participating hospitals to share aggregate, summary-level data rather than patient-level data. The result of the dPQL algorithm is lossless, as fixed-effect and random-effect estimates are identical to those produced by traditional PQL as if one had access to centralized patient-level data. After estimating the fixed and random effects via the distributed algorithm, individual hospitals within a multi-site study can further estimate counterfactual mortality risk for each of their own patients, quantifying the effect of their patients receiving care at another hospital in the study. Counterfactual mortality risk, as will be demonstrated, can also be obtained in a distributed manner (without sharing patient-level data). Thus, the above distributed analysis framework creates the potential for large-scale, multi-site studies for assessing hospital-associated racial disparities when sharing patient-level data is not possible.

The remainder of this work is structured as follows. In the Methods section, we first provide an overview of how to estimate patient-specific mortality risk at a given hospital and propose using the dPQL algorithm for these purposes. We then describe estimation of counterfactual mortality risks for each patient within each hospital before outlining a simulation approach which can be used to investigate hospital-associated racial disparities in COVID-19 mortality vis counterfactual modeling. Lastly, we offer an illustrative example of how our proposed application of the dPQL algorithm can be used in practice, using it to analyze patient data from four hospitals within the OneFlorida Clinical Research Consortium. In our analysis, we sought to determine whether observed 30-day in-hospital mortality rate for Non-Hispanic Black (NHB) patients differed from simulated overall mortality risk, calculated under the hypothetical scenario that NHB patients attended hospitals in the same distribution as Non-Hispanic White (NHW) patients. We present results for our proof-of-concept simulation before concluding with a discussion on the implications of our work.

## Methods

### Modeling Mortality Risk: Generalized Linear Mixed Model

Investigating the association between attending hospital and racial disparities in COVID-19 mortality via counterfactual modeling requires a statistical modeling approach that accounts for hospital-specific effects and is compatible with the data sharing arrangement in place among participating hospitals. Suppose *H* hospitals, each with *n*_{h} patients, are willing to participate in a multi-site analysis. Each hospital has direct access to its own relevant patient-level or hospital-level covariates *X*_{ih} for each patient *i* at hospital and outcome of interest *y*_{ih} for each patien In the context of our study question, suppose this outcome is binary. If hospitals agree to share patient-level data across hospitals, allowing for a centralized data repository, a generalized linear mixed modeling (GLMM) approach can be implemented using the pooled patient-level data. The mean and variance using GLMM are specified as
and
where *g* = *h*^{-1} is the link function that connects the conditional patient-level means *μ*_{ij} to the linear predictor *η*_{ij} and *v*(·) is the variance function. The random effects *α*_{h} are assumed to follow a normal distribution with mean 0 and variance θ. In our context, the GLMM method allows for modeling common fixed effects ** β** (

*β*

_{1},

*β*

_{2}, …,

*β*

_{p}) for

*p*covariates at either the patient or hospital level as well as hospital-specific random effects

**, such as a random intercept estimated for each hospital (**

*α**α*

_{1},

*α*

_{2}, …,

*α*

_{H}). In our modeling below, hospital-specific random intercepts are the only random effects modeled. Incorporating additional hospital-specific random effects is possible if desired. Estimated fixed and random effects, and , respectively, can then be used to estimate mortality risk for each patient in the sample.

### Distributed Penalized Quasi-Likelihood (dPQL) algorithm

In settings where patient-level data cannot be shared, collaborating hospitals in a multi-site study often agree to conduct a distributed data analysis. When analyzing data distributively, patient-level data remain within institution, with each hospital analyzing its own data directly before sending aggregate results to a coordinating center. Available methods for performing distributed data analysis vary in their accuracy relative to pooled analysis methods and communication required among participating institutions to obtain results. Most distributed analysis methods require a lead site to coordinate communication among the collaborating sites and aggregate hospital-specific results.

We propose using a recently developed distributed penalized quasi-likelihood (dPQL) algorithm [11] for performing distributed GLMM. Estimation via penalized quasi-likelihood (PQL) is one of several possible approaches for fitting a GLMM. The PQL algorithm iteratively fits the linear mixed model (LMM) with the working outcome and the weight The obtained effect and variance estimates are denoted as . The dPQL algorithm is a distributed version of PQL, where in each iteration participating hospitals are required to share only aggregate data with the lead site to fit the LMM. This is done by using a distributed linear mixed model (DLMM) algorithm [14]. Since the DLMM algorithm is lossless, resulting in estimates identical to those produced by the LMM, the dPQL algorithm also obtains the same estimates as those obtained by the PQL algorithm as if all patient-level data are available. The dPQL algorithm is summarized below.

### The dPQL algorithm

1. Initialization: The lead site sends an initial value for the fixed effects

*β*^{(0)}and random effects*α*^{(0)}= 0 to all participating sites*h*= 1,*2*, …,*H*(including the lead site).2. For iteration

*s*= 0, 1, …,*S*and within each site*h*= 1,*2*, …,*H*:2.1. Site

*h*calculates the working outcome , linear predictor*η*_{h}^{(s)}=*X*_{h}*β*^{(s)}+*α*_{h}^{(s)}, and weights*W*_{h}=*diag*{*g*′(*μ*_{h}^{(s)})^{-2}*v*(*μ*_{h}^{(s)})}.2.2. Site

*h*calculates the following aggregated data, communicating each measure to the lead site:-

*p*×*p*matrix:-

*p*-dimensional vector:- scalars: and sample size

*n*_{h}

2.3. Lead site fits weighted DLMM algorithm based on aggregated data from 2.2. to obtain updated

*β*^{(s+1)}and*α*^{(s+2)}, returning these to the collaborating sites to replace initial values.

3. 2.1. through 2.3. are repeated until convergence, e.g. ||

*η*^{(s+1)}−*η*^{(s)}‖ / ‖*η*^{(s)}‖ < 1e-6. The final estimates are and , all obtained in the final iteration*S*.

### Distributed Counterfactual Modeling

Given the distributively estimated fixed effect and random effects for hospital h, for patient *i* who attended hospital *h*, their estimated mortality risk can be calculated as
We refer to this as the factual mortality risk for patient *i*. Our primary interest for this analysis is to investigate the association between attending hospital and in-hospital mortality differences for Non-Hispanic Black (NHB) and Non-Hispanic White (NHW) patients. To do this, we use a counterfactual modeling approach to study whether the observed in-hospital mortality rate for NHB patients differs from their simulated in-hospital mortality rate given they (hypothetically) attended hospitals in the same distribution as NHW patients [9]. Counterfactual mortality risk estimates differ from the factual mortality risk estimates defined above in that they are purely hypothetical, with estimates calculated using an estimated random intercept for a hospital that a given patient did not truly attend. The difference between factual and counterfactual mortality risk estimates and how they are calculated are detailed in Figure 1. While denotes the factual mortality risk estimate for patient *i* at hospital *h*, we can calculate counterfactual mortality risk estimates for all patients as if they attended any of the hospitals participating in the study, denoting counterfactual mortality estimates using where *h* ≠ *h*^{*}. Using this notation, *h* denotes the hypothetical attending hospital, the hospital contributing the estimated random intercept, while *h*^{*} denotes the hospital that patient truly attended.

We next quantify racial disparity by conducting a simulation to investigate the counterfactual mortality rate of black patients had they been admitted to hospitals in the same distribution as white patients while retaining their sociodemographic and clinical characteristics.

Our simulation procedure is depicted at a high level in Figure 2. Suppose the number of black patients at each hospital *h, h* = 1, …, *H*, is *n*_{hB}. Each hospital shares the number of white patients at their respective hospitals, *n*_{hW} = *n*_{1W}, *n*_{2W}, …, *n*_{HW}, with the coordinating center. These are used to determine the relative proportion of total white patients at each hospital, *ŵ*_{h} = *ŵ*_{1},*ŵ*_{2}, …, *ŵ*_{H}, which are communicated to each hospital. Next, for replicate *s* = 1, …, *S* of the simulation within hospital *h*, a multinomial distribution with probabilities equal to *ŵ*_{1},*ŵ*_{2}, …, *ŵ*_{H}, is used to assign each black patient to a hospital in the study. Assume patient *i* is assigned to hospital *h*^{*}. Hospital assignments may be factual if *h* = *h*^{*} or counterfactual if *h* ≠ *h*^{*}. Using these hospital assignments, (counter)factual mortality risk estimates are calculated for each patient *i* as .

To preserve patient privacy, individual patient mortality risk estimates are averaged within hospital *h* as
and communicated to the coordinating site. Simulated overall mortality risk is then calculated using the following:
Since we are interested in whether hypothetical hospital assignment affects mortality risk for black patients, in each iteration of the simulation, we calculate the difference between observed black patient mortality rate and simulated black patient mortality risk . The mean difference across iterations can be reported along with an empirical 95% confidence interval (using the 2.5^{th} and 97.5^{th} percentiles of ) to quantify uncertainty in the resulting difference.

### Illustrative Example Using Multi-Hospital Real-World Data

We illustrate our application of the dPQL algorithm to study hospital-associated racial disparities in COVID-19 mortality using data from the OneFlorida Clinical Research Consortium, a centralized data repository comprising data for over 74% of Floridians.

OneFlorida data include records from Medicaid and Medicare claims, cancer registries, and electronic health records from various clinical partners in the state of Florida. As a proof-of-concept analysis, we conducted a counterfactual modeling simulation using patient data from four hospitals in the OneFlorida Clinical Research Consortium. Patients included in the analysis were required to be hospitalized with COVID-19 and have index hospitalization dates between 3/1/20 and 2/28/21 to ensure 30 days of follow up for each patient. Using the dPQL algorithm, we distributively modeled the log odds of 30-day COVID-19 in-hospital mortality as a function of various patient characteristics, including age, gender, a collection of nine comorbidities, and index quarter, defined as one of four three-month intervals when a given patient was admitted. Definitions for each quarter, as well as further details concerning the included comorbidities and descriptive statistics for each hospital, are presented in Table 1.

Our primary analysis in this proof-of-concept study concerned calculating the difference between observed mortality rates and average simulated mortality risk estimates for all non-Hispanic black (NHB) patients across the four hospitals. We also performed a series of secondary analyses stratified by index quarter; four sub-analyses were conducted, with each only including patients with an index date for COVID-19 hospitalization in a particular quarter. Due to the rapid changes in the COVID-19 landscape as the pandemic progressed, including potential improvements in patient care and increasing prevalence of rare variants, we were interested in exploring whether the difference between observed and simulated mortality varied by quarter and if a trend in either direction was evident.

In Figure 3, we depict the relative distributions of non-Hispanic white (NHW) and NHB patients across the four hospitals, both within specific quarters and overall. Overall and in each quarter except for Quarter 1, Hospital A was the most well-attended, with the percentage of NHW patients attending Hospital A ranging from 38% in Quarter 1 to 52.5% in Quarter 3. For NHB patients, Hospital B was attended most overall and in each quarter except for Quarter 3. The percentage of NHB patients attending Hospital B ranged from 31.7% in Quarter 3 to 49.7% in Quarter 1. While NHW and NHB patients differed in this respect, the distributions of hospitals attended for each set of patients was largely similar. Distributions of hospitals attended across quarters were also similar, indicating that the racial makeup of the patient population at each hospital remained relatively constant throughout the twelve months of the COVID-19 pandemic studied in this analysis (March 2020 to February 2021).

To conduct our simulation for both our primary and secondary (stratified) analyses, we first obtained the proportions of total NHW patients at each of the four hospitals, depicted in Figure 3. Then, within each hospital, we randomly assigned patients to one of the four hospitals using a multinomial distribution with probabilities equal to the four proportions found above. Simulated hospital assignments were used to calculate patient- and hospital-specific (counter)factual mortality risk estimates. Since OneFlorida data are centralized, we were able to average all mortality risk estimates from each hospital to calculate . without needing to communicate risk estimate averages from each hospital to a coordinating center. However, if data were not centralized, identical results could be obtained by following the procedure outlined in the previous subsection. We conducted *S* = 500 iterations of the simulation for each of the five analyses, reporting the average difference in observed mortality rate and mean simulated mortality risk across iterations along with a corresponding 95% empirical confidence interval.

## Results

Boxplots summarizing results from each of our simulation studies are presented in Figure 4. Recall that the results presented here are intended to be the product of a proof-of-concept analysis to demonstrate the utility of this simulation method for performing counterfactual modeling. These results therefore should not be interpreted as having clinical significance. Using the complete set of patient data across all index dates, the mean difference across 500 iterations between observed in-hospital mortality rate and simulated mortality risk for NHB patients was 0.0069, reflecting a hypothetical absolute decrease in mortality rate of 0.69% (95% confidence interval (CI): (0.55%, 0.84%)). In our stratified analyses done by index quarter, results varied substantially. In Quarter 1, which included patients with index dates between March 1, 2020 and May 31, 2020, the average difference between observed mortality rate and simulated mortality risk was significantly greater at 0.0161, reflecting a hypothetical absolute decrease in mortality rate of 1.61% (95% CI: (1.51%, 1.70%)). The difference found in Quarter 2 (index dates between June 1, 2020 and August 31, 2020) was closer to the overall difference, with simulated mortality risk lower than observed mortality rate by 0.0051 on average (0.51% absolute difference, 95% CI: (0.35%, 0.67%)). No statistically significant difference in mortality rate was found for Quarter 3 (index dates between September 1, 2020 and November 30, 2020), with an average difference between observed mortality rate and simulated mortality risk of -0.0015 (−0.14% absolute difference, 95% CI: (−0.56%, 0.26%)). In Quarter 4 (index dates between December 1, 2020 and February 28, 2021), the difference between observed mortality rate and simulated mortality risk was similar to that found in Quarter 1 with a mean difference of 0.0131 (1.31% absolute difference, 95% CI: (1.04%, 1.59%)). Differences between observed and simulated mortality rates across iterations varied most in Quarter 3 and least in Quarter 1.

## Discussion

In this work, we presented a novel application of a method for performing distributed generalized linear mixed modeling, the dPQL algorithm, to study the association between attending hospital and racial disparities in COVID-19 mortality. In real-world multi-site studies where hospitals are not able to share patient-level data with one another, this approach can be used to perform counterfactual modeling resulting in the exact same estimates that could be obtained if patient data were centralized, owing to the dPQL algorithm being lossless relative to traditional PQL. By not requiring patient-level data sharing, our proposed application could be used to study hospital-associated racial disparities using larger, more heterogeneous collections of patient data, allowing for more generalizable and clinically impactful conclusions. Our counterfactual modeling approach via simulation is also generalizable, able to be used in a variety of applied contexts beyond the application studied here. This approach could be used to investigate the association between any particular grouping of patients (e.g. at the hospital, state, or country level) and a non-continuous outcome of interest, with multinomial group assignments based on some exposure of interest.

Our real-world analysis of patient data from OneFlorida, while illustrative of our counterfactual modeling approach as a proof-of-concept analysis, is limited in several respects. Despite finding a slight improvement in mortality rate under the hypothetical scenario that NHB patients attend hospitals in the same distribution as NHW patients, we analyzed patient data from four Florida hospitals whose relative distributions of NHW and NHB patients did not differ substantially. The lack of patient heterogeneity among participating hospitals limited the ability of the GLMM fit by the dPQL algorithm to capture meaningful hospital-level differences with respect to in-hospital mortality risk. Our proposed application of the dPQL algorithm is best suited for real-world studies featuring large, heterogeneous collections of hospitals spanning a large geographic area, such as the study conducted in Asch et al. 2020 which included over 44,000 patients admitted to over 1,100 hospitals from 41 states [10]. Additionally, since our analysis was intended to be an illustrative example, there were no inclusion or exclusion criteria for covariates included in our fitted GLMM. In practice, if many candidate predictors are available, a variable section procedure should be used together with clinical expertise to select the final model and model fit should be assessed. Due to the limitations of our applied analysis, results presented should not be interpreted clinically. Rather, they are meant to illustrate the type of analysis that can be performed using our counterfactual modeling approach with multi-site data.

In future work, it would be worthwhile to compare different approaches for distributed generalized linear mixed modeling to counterfactually model in-hospital mortality. The dPQL method is one approach, but more are likely to become available in the near future. It could also be beneficial to continue to investigate whether the association between attending hospital and racial disparities in mortality has changed over time. Our limited analysis did not suggest a trend in either direction, but analysis of a more complete, heterogeneous collection of patient data would provide more convincing conclusions in this regard.

Despite the limitations of our own real-world analysis, we believe this novel application of the dPQL algorithm can be used by researchers as a tool for identifying hospital-level inequities in patient outcomes associated with race. In the event that inequities are present and thought to be related to quality of care, hospital-level interventions may be needed to help close gaps in performance between predominately white and predominately black hospitals. While this would likely take considerable time to accomplish, we hope our method can help to highlight underlying disparities and aid in the process of addressing systemic racism in healthcare.

## Data Availability

OneFlorida data can be requested at https://onefloridaconsortium.org/front-door/; Since OneFlorida data is a HIPAA limited data set, a data use agreement needs to be established with the OneFlorida network.