Application of BRACE Method to Address Treatment Selection Bias in Observational Data

Background: Cancer treatments can paradoxically appear to reduce the risk of non-cancer mortality in observational studies because of residual confounding from treatment selection bias. Here we apply a novel method, Bias Reduction through Analysis of Competing Events (BRACE), to reduce bias in the presence of residual confounding.

Methods: We studied 36,630 prostate cancer patients, 4,069 lung cancer patients, and 7,117 head/neck cancer patients, using the Veterans Affairs Informatics and Computing Infrastructure database. We estimated effects of intensive treatment (prostate: prostatectomy vs. radiotherapy; lung: lobectomy vs. sublobar resection or radiotherapy; head/neck: radiotherapy with concurrent cisplatin and/or multiagent induction vs. radiotherapy with or without alternative systemic therapy) on cancer-specific mortality, non-cancer mortality, and overall survival (OS), using both multivariable Cox (MVA) and propensity score (inverse probability treatment weighting (IPTW)) models. Next, we applied the BRACE method to adjust for residual confounding, based on the observed treatment effect on the competing event and the relative event hazards.

Results: For each cohort, intensive treatment was associated with significantly reduced hazards for cancer-specific mortality, non-cancer mortality, and OS. Compared with the MVA and IPTW results, hazard ratios (95% confidence intervals) for the effect of intensive treatment on OS were attenuated in each cohort after applying BRACE (prostate: MVA 0.75 (0.71, 0.80), IPTW 0.73 (0.66, 0.75), BRACE 0.98 (0.95, 1.00); lung: MVA 0.79 (0.68, 0.91), IPTW 0.79 (0.66, 0.89), BRACE 0.81 (0.65, 0.94); head/neck: MVA 0.71 (0.66, 0.76), IPTW 0.70 (0.66, 0.76), BRACE 0.81 (0.76, 0.86)). BRACE estimates were similar to findings from meta-analyses and randomized trials.

Conclusions: We found evidence of residual confounding in several observational cohorts after applying standard methods, which was mitigated after applying BRACE. Application of this method could provide more reliable estimates and inferences when residual confounding is identified, and it represents a novel approach to improving the validity of outcomes research.


INTRODUCTION
Bias due to residual confounding (often called treatment selection bias) is an important issue when drawing inferences from non-randomized comparative effectiveness studies [1,2]. In observational data, multivariable regression models and propensity scores are common approaches to reduce bias from measured confounders [3]. However, residual confounding from unmeasured or unknown confounders remains a pernicious problem that can undermine conclusions from such analyses and cannot be overcome by adjustment, scoring, or weighting methods [4-8]. Importantly, biased inferences from observational data can mislead the medical field, resulting in patients receiving toxic, costly, and ineffective therapies [9,10].

Competing event analysis allows for identification of residual confounding problems in observational data, particularly when the effect of a treatment on competing events can be bounded a priori [11]. For example, while the addition of a novel cancer treatment to a standard regimen may have no effect on or even increase mortality from non-cancer health events, such as cardiac disease, it should not intrinsically reduce the incidence of such non-cancer events. Despite this, in non-randomized data, competing event analysis can reveal a lower incidence of competing health events in the group receiving intensified treatment, due to unmeasured confounding by more favorable health characteristics in this group, even after appropriately controlling for measurable confounders [12]. This phenomenon typically indicates residual confounding, assuming that more intensive treatment does not truly reduce the risk for competing events.

While diagnosing residual confounding with a competing events analysis is helpful [12], there remains no consensus on how to address it [2,4]. Here we apply a novel method, Bias Reduction through Analysis of Competing Events (BRACE) [...]

Population and Sampling Methods
We applied BRACE to observational cohorts of patients treated for prostate cancer, lung cancer, and head/neck cancer, sampled from the Veterans Affairs (VA) Informatics and Computing Infrastructure (VINCI) database. VINCI contains detailed electronic medical records for veterans treated across the United States with tumor registry data collected by trained registrars according to standardized protocols [14]. Further details on each cohort are provided below. This study was approved by our institutional and local VA institutional review boards. Waiver of informed consent was obtained.

Outcomes
The primary outcome of interest was overall survival (OS). For competing risks analysis, we analyzed two events: cancer-specific mortality and non-cancer (competing) mortality. Patients with documented follow-up visits and no death event were coded as alive at last follow-up with event times censored. Date of death and cause of death were obtained via the National Death Index from the Department of Defense for deaths through 2014 and tumor registry data for deaths after 2014, which are linked to the VA data by social security numbers. Survival times (days) were measured from the date of diagnosis.
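The outcome coding described above can be sketched as follows. This is a minimal illustration, not the authors' code: the `Patient` record and its field names are hypothetical, and it assumes each death has already been classified as cancer-related or not.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    # Hypothetical record layout; field names are assumptions for illustration.
    diagnosis: date
    last_follow_up: date           # last documented follow-up visit
    death: Optional[date] = None   # None when no death event is recorded
    cancer_death: bool = False     # True when the cause of death is the index cancer


def code_events(p: Patient) -> dict:
    """Return survival time in days from diagnosis plus event indicators."""
    died = p.death is not None
    end = p.death if died else p.last_follow_up  # censor at last follow-up if alive
    return {
        "time_days": (end - p.diagnosis).days,
        "os_event": int(died),                         # death from any cause
        "csm_event": int(died and p.cancer_death),     # cancer-specific mortality
        "ncm_event": int(died and not p.cancer_death), # non-cancer (competing) mortality
    }
```

In a cause-specific analysis of one event type, patients who experience the other event type are usually treated as censored at that time, which is why the two mortality indicators are kept separate from the overall survival indicator.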

Prostate Cohort
The prostate cancer cohort included 36,630 patients with cT1-T2 cancer of the prostate, with prostate-specific antigen (PSA) < 20, who were diagnosed between 2000 and 2015 and received radical prostatectomy [...] therapy was given within 6 months of RT/surgery; duration of hormonal therapy could not be ascertained. Alcohol use was excluded from each model after backward selection.

For additional external validation (i.e., in a cohort where we did not determine the sampling methods), we applied the BRACE method to a cohort of patients from SEER data with low-risk prostate cancer (cT1-T2a, PSA <10, and Gleason 6) who received radical prostatectomy, brachytherapy, or external beam radiotherapy from 2005-2015, as described in detail elsewhere [15]. [...]

The lung cohort included 4,069 patients with biopsy-proven clinical stage I (T1 or T2a, N0) non-small cell lung cancer (NSCLC) diagnosed between 2006 and 2015 and treated definitively with surgery (lobectomy or sublobar resection) or RT, as previously described [16]. The primary treatment effect of interest for this cohort was lobectomy vs. sublobar resection or definitive RT. Missing values were imputed using iterative robust model-based imputation (IRMI) [17]. Covariates were age (categorical in 10-year increments), sex, race (White/Black/Other), smoking status (never/current/past), CCI (0/1/2/3+), pretreatment forced expiratory volume in one second (FEV1) (percent predicted, categorical: <30%, 31-50%, 51-80%, >80%), T category (T1a vs. T1b vs. T2a) [...]

The head and neck cohort included 7,117 patients with locoregionally advanced, non-metastatic (AJCC 7th edition stage III-IVB) squamous cell carcinoma of the oropharynx, oral cavity, larynx, and hypopharynx diagnosed between 2005 and 2015 and treated with definitive (at least 5 weeks of) radiation therapy (RT) with or without chemotherapy, as previously described [18]. The primary treatment effect of interest for this cohort was intensive therapy (defined as RT with concurrent cisplatin or with multiagent induction chemotherapy) vs. [...]

Unadjusted and multivariable Cox proportional hazards models (MVA) were fit for each outcome and each cohort. Adjustment variables were determined using backward selection, retaining covariates found to be associated with OS (threshold: p < 0.20). The proportional hazards assumption was checked (cox.zph function in the survival package in R), and when violated, treatment was modeled as a time-varying covariate. For propensity score adjustment, we implemented inverse probability treatment weighting (IPTW) with multivariable Cox models using stabilized weights derived from the same sets of covariates. An average treatment effect (ATE) approach was used for estimation of treatment effects. Each IPTW model was checked for covariate balance [...]

This preprint is made available under a CC-BY 4.0 International license; the author/funder has granted medRxiv a license to display the preprint in perpetuity.
BRACE was then applied to the IPTW model estimates to obtain bias-corrected estimates ($\hat{\Theta}$). Bootstrapped confidence intervals for $\hat{\Theta}$ were estimated with 500 replicates. Monte Carlo estimates of $\hat{\Theta}$ were obtained by randomly co-sampling values of $\hat{\Theta}_2$ and $\hat{\omega}$ from their respective distributions (1000 replicates). Confidence intervals were defined by the 2.5th and 97.5th percentiles of the sampling distributions. The BRACE method does not generate p-values; confidence intervals were compared between methods. More details regarding the BRACE derivation were described previously [13].
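The percentile interval described above can be illustrated with a generic bootstrap sketch. It does not implement the BRACE estimator, whose derivation is given elsewhere [13]; the `estimator` argument is a placeholder for any function that maps a resampled data set to a single estimate, and `mean` is only a toy stand-in. The function names and the linear-interpolation percentile rule are assumptions of this sketch.

```python
import math
import random


def percentile(sorted_vals, q):
    """Percentile (q in [0, 100]) of a sorted list, using linear interpolation."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def bootstrap_ci(data, estimator, n_rep=500, seed=0):
    """95% percentile interval from n_rep resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(data)
    estimates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_rep)
    )
    return percentile(estimates, 2.5), percentile(estimates, 97.5)


def mean(xs):
    """Toy estimator used only to exercise bootstrap_ci."""
    return sum(xs) / len(xs)


# Hypothetical values, for illustration only.
toy_data = [0.1, 0.4, 0.35, 0.8, 0.6, 0.2, 0.9, 0.5]
lower, upper = bootstrap_ci(toy_data, mean)
```

Because the interval is read directly from the empirical distribution of the resampled estimates, it yields no p-value, which matches the comparison of confidence intervals across methods described above.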

RESULTS
There were 36,630 patients in the prostate cohort (Table 1), 4,069 patients in the lung cohort (Table 2), and 7,117 patients in the head and neck cohort (Table 3). On the balance check for each IPTW model, all covariates had a mean difference of < 0.05, indicating appropriate balance after weighting by propensity score (Supplemental Figure 1). Results for standard approaches using either multivariable Cox models or IPTW models were largely similar and are presented in tabular form, with comparison of IPTW vs. BRACE-corrected estimates emphasized in the text.
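As a rough illustration of the weighting and balance check reported above, the sketch below computes stabilized IPTW weights from propensity scores that are assumed to be already estimated, and then a weighted standardized mean difference for one covariate. The function names, the toy data, and the pooled-variance form of the standardized difference are assumptions of this sketch; the source does not specify the exact balance statistic beyond "mean difference".

```python
import math


def stabilized_weights(treated, ps):
    """Stabilized IPTW weights: P(T=1)/e(x) if treated, P(T=0)/(1-e(x)) otherwise."""
    p_treat = sum(treated) / len(treated)
    return [
        p_treat / e if t == 1 else (1 - p_treat) / (1 - e)
        for t, e in zip(treated, ps)
    ]


def weighted_mean(x, w):
    return sum(xi * wi for xi, wi in zip(x, w)) / sum(w)


def weighted_var(x, w):
    m = weighted_mean(x, w)
    return sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w)) / sum(w)


def standardized_mean_difference(x, treated, w):
    """Weighted difference in group means divided by the pooled standard deviation."""
    xt = [xi for xi, t in zip(x, treated) if t == 1]
    wt = [wi for wi, t in zip(w, treated) if t == 1]
    xc = [xi for xi, t in zip(x, treated) if t == 0]
    wc = [wi for wi, t in zip(w, treated) if t == 0]
    pooled_sd = math.sqrt((weighted_var(xt, wt) + weighted_var(xc, wc)) / 2)
    return (weighted_mean(xt, wt) - weighted_mean(xc, wc)) / pooled_sd


# Toy data (hypothetical): a binary covariate that strongly predicts treatment.
covariate = [0, 0, 0, 0, 1, 1, 1, 1]
treatment = [1, 0, 0, 0, 1, 1, 1, 0]
propensity = [0.25] * 4 + [0.75] * 4  # assumed known; normally estimated by a model

weights = stabilized_weights(treatment, propensity)
smd_weighted = standardized_mean_difference(covariate, treatment, weights)
smd_unweighted = standardized_mean_difference(covariate, treatment, [1.0] * 8)
```

Weighting can only balance covariates that enter the propensity model, so a small standardized difference on measured covariates says nothing about unmeasured ones. That gap is the residual confounding that the competing event analysis is meant to detect.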

medRxiv preprint (not certified by peer review); this version posted May 18, 2021; doi: https://doi.org/10.1101/2021.05.17.21257332

For prostate cancer patients, prostatectomy was associated with significantly reduced cancer-specific mortality (CSM) compared to RT after adjusting for covariates using IPTW (hazard ratio (HR) 0.82, 95% confidence interval (CI): 0.64, 0.92; p=0.02) (Figure 1; Table 4). Prostatectomy was also associated with significantly reduced non-cancer mortality (HR 0.71, 95% CI: 0.65, 0.74; p<0.001) and improved OS (HR 0.73, 95% CI: 0.66, 0.75; p<0.001) with IPTW (Figure 1; Table 4). After correction using BRACE, however, the effect on OS was attenuated substantially (HR 0.98; 95% CI: 0.95, 1.00), with the upper bound of the 95% CI including the null [...] (Figure 1; Table 4).

When the proportional hazards assumption was not met, we modeled treatment as a time-varying covariate. Supplemental Table 1 shows results for time periods over which the assumption held. In general, the corrected OS estimates were nearly identical, but in the prostate cancer cohort, the null hypothesis was not rejected using BRACE, and in the lung cancer cohort, the standard and BRACE estimates diverged more, indicating sensitivity to the proportional hazards assumption.

[...]

Typical strategies to address treatment selection bias in observational data include multivariable Cox proportional hazards regression and propensity score modeling [26-30]. While valuable, these methods may be insufficient to eliminate residual confounding, leading to erroneous inferences [1-8]. Competing risks analysis can diagnose residual confounding by identifying mechanistically implausible effects of treatment on competing health events [11,12]. In each cohort we examined, while there were strong associations between intensive treatment and improved survival, competing risks analysis revealed these effects were driven in part by associations with reduced non-cancer mortality, even after adjusting for numerous measurable confounders. The likely explanation for this is the selective use of intensive treatment in patients with more favorable baseline health characteristics, thus leading to reduced non-cancer mortality, rather than effects on the competing event per se. However, simply identifying this problem does not inherently provide a method to address it.

Here we applied a recently described method to attenuate bias, which was previously shown to result in lower model error compared to standard approaches in simulated data [13]. While true effects are not directly observable in non-randomized studies, multiple applications of BRACE to clinical cohorts yielded attenuated treatment effect estimates more consistent with high-level evidence than uncorrected estimates. Such comparisons should be viewed with caution, given methodological and population differences across studies, but they can lend insight when comparing potentially biased results.

For example, the randomized ProtecT trial found no difference in OS by treatment in 1,643 patients with predominantly low-risk prostate cancer [31], which our results generally support. While ProtecT did not directly quantify the effect of prostatectomy vs. radiotherapy, both were compared to a common control group (active monitoring), with nearly identical effects (Table 4). Similarly, the MACH-NC meta-analysis, which investigated the effect of chemotherapy in addition to RT for 16,485 patients across 87 randomized trials, reported an effect of chemotherapy on survival (HR 0.88), close to our BRACE-adjusted estimate [32]. While evidence regarding the comparative effectiveness of treatments for early stage NSCLC is conflicting [33-41], our results indicated a survival advantage to lobectomy after BRACE correction. Though large randomized trials are lacking, in the meta-analysis by Zheng et al., the effect was attenuated in trials with higher levels of evidence [33]. Of note, in some studies, lobectomy has been associated with higher 90-day mortality compared to stereotactic ablative radiotherapy (SABR) [41]; if true, BRACE would under-correct for treatment selection bias.

Our findings thus have important implications regarding analyses using observational data. For example, the National Cancer Database lacks cause-specific event data, precluding the application of BRACE and leaving many analyses vulnerable to undiagnosed bias. Missing or inadequate comorbidity data may also contribute to residual confounding; application of BRACE in databases that include cause-specific event data could help mitigate this problem proactively.

This study has several limitations. Notably, the proportional relative hazards model treats several key quantities as independent, a strong condition that is not always verifiable. In our analysis of clinical data, the corrected estimates could still be biased. Furthermore, inferences can be sensitive to the proportional hazards assumption or to the method used to estimate confidence intervals, especially when close to the null. Moreover, gains using BRACE depend on leveraging a critical assumption: namely, that treatment does not reduce the hazard for non-cancer events. While this is generally valid when comparing more vs. less intensive treatments (e.g., A vs. A+B designs), in other contexts, such as when comparing two systemic therapies, it may not be possible to bound the effects of a treatment on competing events.
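The bounding assumption discussed above can be written out in a short worked form. The notation below is introduced here for illustration and is not taken from the BRACE derivation [13]: it assumes two mutually exclusive causes of death and a constant treatment hazard ratio for each cause.

```latex
% Illustrative notation (assumptions of this sketch, not the source's own symbols):
%   \lambda_{1}, \lambda_{2}   cause-specific hazards for cancer and non-cancer mortality
%   \lambda_{k,0}              the corresponding hazards in the less intensive (control) arm
%   \theta_{1}, \theta_{2}     treatment hazard ratios for each cause
%   w_{0}(t)                   share of the control-arm hazard due to cancer mortality
\begin{align*}
\lambda_{\mathrm{OS}}(t) &= \lambda_{1}(t) + \lambda_{2}(t), \\
w_{0}(t) &= \frac{\lambda_{1,0}(t)}{\lambda_{1,0}(t) + \lambda_{2,0}(t)}, \\
\theta_{\mathrm{OS}}(t) &= \frac{\theta_{1}\,\lambda_{1,0}(t) + \theta_{2}\,\lambda_{2,0}(t)}
                                {\lambda_{1,0}(t) + \lambda_{2,0}(t)}
  = w_{0}(t)\,\theta_{1} + \bigl(1 - w_{0}(t)\bigr)\,\theta_{2}, \\
\text{bounding assumption:}\quad \theta_{2} &\ge 1 .
\end{align*}
```

Read this way, an adjusted estimate of $\theta_{2}$ below 1, such as the IPTW non-cancer mortality HR of 0.71 in the prostate cohort, is the signal of residual confounding, and it pulls the overall survival estimate down in proportion to the share of deaths from non-cancer causes. As rough arithmetic only, the prostate IPTW estimates (0.82, 0.71, and 0.73 for cancer-specific mortality, non-cancer mortality, and OS) are consistent with $w_{0} \approx (0.73 - 0.71)/(0.82 - 0.71) \approx 0.18$ under this simplified reading.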
In summary, we present the clinical application of a novel method (BRACE) to mitigate bias from residual confounding. Appropriate application in observational, non-randomized data would likely improve effect estimation and inferences.

[...] Administration and SEER, respectively, but restrictions apply to the availability of these data, which were used [...]

Table 4. Effects of intensive treatment approaches on cancer-specific mortality, non-cancer mortality, and overall survival for each clinical cohort. Results are presented from unadjusted, Cox multivariable (MVA), and inverse probability treatment weighting (IPTW) models. The Bias Reduction through Analysis of Competing Events (BRACE) correction was applied to IPTW estimates. *statistically significant with p < 0.05, **statistically significant with p < 0.001. †Note that the BRACE method does not result in a p value. ⱡReference values were [...]