Abstract
Objective To develop an index to evaluate the bias attributable to composite outcomes (BACO) in comparative clinical studies.
Study design and Setting I defined the BACO index as the ratio of the logarithm of the association measure (e.g., relative risk) of the composite outcome to that of its most relevant component endpoint (e.g., mortality). Methods to calculate the confidence intervals and test the null hypotheses (BACO index = 1) were described and applied in systematically selected clinical trials. Two other preselected trials were included as “positive controls” for being examples of primary composite outcomes disregarded due to inconsistency with the treatment effect on mortality.
Results The BACO index values different from one were classified according to whether the use of composite outcomes overestimated (BACO index >1), underestimated (BACO index between zero and <1), or inverted (BACO index <0) the association between exposure and prognosis. In three of 23 systematically selected trials and the two positive controls, the BACO indices were significantly lower than one (p <0.005).
Conclusion BACO index can warn that the composite outcome association is stronger, weaker, or even opposite than that of its most critical component.
What is new?BACO index is the ratio of the logarithm of the association measure (e.g., RR) of the composite outcome to the logarithm of the association measure of the component endpoint that represents the study target (e.g., mortality).
In comparative studies, the BACO index can be used to evaluate the correspondence between the effect on a composite outcome and that on its most critical component.
This index could help to preset rules to make decisions for interpretation of clinical studies.
A significant BACO should lead to the caution that the association of the composite is stronger (BACO index >1), weaker (BACO index between zero and <1), or even opposite (BACO index <0) than that of its most critical component.
1. Introduction
Composite outcomes, particularly those that define an event when at least one of a group of component endpoints occurs, are increasingly used in clinical research [1–3]. In the last two decades, publications using terms referring to composite outcomes increased progressively in PubMed, from less than one hundred per year, before 2004, to more than one thousand per year since 2018. About a third of these publications are associated with the terms “clinical” and “trial” (Figure 1). These figures highlight the importance of this kind of outcome to evaluate interventions in clinical practice.
The use of a composite outcome can have different methodological purposes [3,4]. When several potentially eligible endpoints exist, a composite outcome can avoid the need to choose a simple one and prevent problems associated with multiple comparisons [1,4]. Moreover, it can be a way to deal with competing risks [5]; or deliberately integrate events of different nature, such as indicators of the effectiveness and safety of an intervention [6].
In many other instances, composite outcomes are used to represent severity more broadly than with a single event (e.g., death). This practice facilitates a higher number of events for analysis, which increases the power of the study, reduces its costs, and provides a faster response to a research question [1,2,4]. However, the association measures are often misinterpreted as the effect of exposure on each of the elements of the composite outcome [2,7]. This is problematic when the overall effect on the composite outcome does not follow the same direction as its most critical components.
In some cases, effects on primary composite outcomes have been ruled out because they differ greatly from those on mortality [8–10]. This may occur because the composite outcome includes events generated by different mechanisms. For example, some components may be directly related to severity (heart attacks, death), while others may be more influenced by medical decisions and resource availability (hospitalization, catheterization). Therefore, some component endpoints can introduce bias that affects the estimation of the overall effect.
Despite their importance, little guidance is available on how composite outcomes should be interpreted, especially in situations of varied direction in the association across the event subtypes [4,6,11]. Moreover, there is a lack of statistical tools to support the decision to accept or rule out the use of the composite outcome. Furthermore, because composite outcomes are inherently correlated with their components [7], comparative estimates to quantify and test biases associated with the use of this type of endpoint are challenging to make.
In this paper, I proposed an index to evaluate the bias attributable to the composite outcome (BACO), which is simply the ratio of the logarithm of the association measure of the composite outcome to that of the component endpoint that represents the study target (e.g., mortality). For this purpose, I described how to calculate confidence intervals and perform hypothesis tests. Then, I applied these procedures in a group of clinical trials recently published in major medical journals.
2. Methods
2.1. BACO Index
I aimed to compare the association measure of a composite outcome with that obtained with a component endpoint, which indisputably represents the study target. Here, I chose the any-cause mortality as the target [12], and defined the BACO index as follows:
Where φc and φd are the natural logarithms (ln) of ratio-based association measures for a composite outcome and death, respectively. In this paper, I mean relative risk (RR) as the association measure; however, the concept could be applied to other measures (e.g., hazard ratio or incidence rate ratio).
A BACO index equal to one indicates that there is no bias attributable to the use of a composite outcome. A BACO index higher than one indicates that the association with the composite outcome is stronger than that with death. A value between zero but less than one suggests that the association with the composite outcome is biased towards nullity. On the other hand, a negative value would result when the bias leads to an inversion of the association.
These interpretations can be applied regardless of the reference group in the comparative study. Because if the comparison groups were inverted, the signs of both the numerator and denominator would also be inverted.
Since the BACO index is a ratio between two correlated variables, calculating its variance, υ(θ), requires considering the covariance between φc and φd. Specifically, by applying the Taylor series estimator [13], we have that
Where υ(φc) and υ(φd) are the variances of φc and φd, respectively, and c(φc, φd), is their covariance. This covariance is usually calculated through complex matrix operations [13,14]. However, in the case of the BACO index, we analyze effects on two outcomes, one of which is completely embedded within the other. Therefore, φc can be disaggregated as a sum of two effects: on death and on the other component endpoints, namely, φc.= φd + φc−d. As detailed in the Appendix A, with this approach, I deduced that c(φc, φd) = υ(φc). Consequently, the formula of υ(θ) can be simplified to:
A 95% confidence interval for the BACO index is
And the statistic with normal distribution to test the null hypothesis (BACO index = 1) is
Thus, the BACO index can be estimated and tested directly from contingency tables (Appendix A). Alternatively, we can combine regression parameters, which may be useful to estimate a BACO index from adjusted association measures. For example, the ln(RR) can be estimated as the coefficient of a Poisson regression, which is an alternative to avoid the problems of convergence of the Log-Binomial regression [15]. After, we can combine the results into one parameter vector and simultaneous (co)variance matrix of the sandwich/robust type, which is appropriate even if the estimates were obtained with the same or overlapping data [16]. Then, using the non-linear combination of parameters, the BACO indices can be calculated as the ratio of the regression coefficient of the composite outcome to that of mortality (see example in Appendix B). These procedures lead to results virtually identical to those obtained directly from contingency tables (Appendix C).
2.2. Estimation with simulated data
To illustrate different results of the BACO index, I simulated a comparative study in which a group of a thousand people was exposed to an experimental intervention and presented mortality of 4.8% during the follow-up. This group was compared with a reference group of another thousand people, which showed an 8% mortality. Consequently, the RR for death was 0.6 (4.8%/8%), in other words, the intervention had an efficacy of 40% for reducing mortality.
Now consider four composite outcomes, all of which included death within their components. The first composite outcome maintained the same RR as observed for mortality. The second outcome led to an overestimation of the treatment effect with a RR of 0.3 (efficacy of 70%). On the contrary, the third outcome led to an underestimation of the effect with a RR of 0.8 (efficacy of 20%). Finally, the fourth composite outcome led to an inversion of the association measure, suggesting that the intervention duplicates the risk of the event (Table 1).
Figure 2 illustrates the point estimates and the corresponding 95% confidence intervals (95% CI). For the first composite outcome (unbiased), the BACO index was equal to one (95% CI: 0.46 to 1.54). For the second composite outcome, which overestimated the effect of the intervention on prognosis, the BACO index was 2.36 (95% CI: 1.14 to 3.58). The third outcome, which underestimated the effect, presented a BACO index of 0.44 (95% CI: 0.11 to 0.76). For the last simulated composite outcome, which had an inverted association measure, the BACO index was negative: -1.36 (95% CI: -2.48 to -0.24).
2.3. Application
I searched PubMed to identify 2-group, parallel-design clinical trials published in the journal groups of JAMA, NEJM, and Lancet in 2019 (updated on 07/01/2020), using the following word combination: composite primary (endpoint OR outcome OR (“end-point”)) (mortality OR death) (randomised OR randomized) (trial) (JAMA OR NEJM OR Lancet). Next, I selected those studies whose primary outcome was a composite, binary, and included all-cause mortality within their components. Secondary subgroup analyses and studies with five or fewer fatal events were excluded. A study was also excluded whose outcome results were mainly based on imputations because it presented substantial losses during the follow-up [17].
I reviewed each article and built a database to reproduce individual information for the variables of intervention, composite outcome, and death. When necessary, I contacted the corresponding author to ask for data not provided in the article. For each trial, I calculated the RR for both the composite outcome (RRc) and death from any cause (RRd), using the corresponding intervention as the independent variable. After, I estimated the BACO index as a ratio of regression coefficients and performed Wald-type tests, based on the delta method, for the null hypothesis that the BACO index is equal to one. These analyses were conducted using Stata software (version 15.0, Stata Corp LP, College Station, TX, USA), and the commands employed are described in the Appendix B.
This work intended to reproduce the independent application of the BACO index in each of the studies. Therefore, I did not consider the number of trials to adjust the level of significance. However, I preset 0.005 as a level to define a statistically significant BACO [18]. A higher level (e.g., 0.05) was not chosen because I assumed that the composite outcomes were purposefully defined to be consistent with mortality, thus expecting a low pre-test probability that the BACO index is different from one. However, I used the term “suggestive” for p values between 0.005 and 0.05 [19].
Besides the selected studies from 2019, I also analyzed the clinical trials CAPRICORN and EXPEDITION [20,21]. CAPRICORN investigated carvedilol in patients with left-ventricular dysfunction after acute myocardial infarction [20]; and EXPEDITION evaluated intravenous caripode in high-risk coronary artery bypass graft surgery patients [21]. These two studies were included as “positive controls” because they represented examples of primary composite outcomes that were disregarded for being inconsistent with the treatment effect on mortality [9,10].
3. Results
From 82 articles, I selected 23 clinical trials, most of which were about cardiovascular diseases. Besides mortality, the composite outcomes integrated diverse components, often including cardio-cerebrovascular events, such as myocardial infarction or stroke, and other related to the use of health services, such as hospital admission and vascular interventions. The sample sizes of these trials ranged from 240 to 536,233. While the number of patients evolving composite outcomes ranged from 31 to 4,067, the number of deaths ranged from six to 1,140 (Table 2).
In six studies [22–27], the RRs of the composite outcomes were further from the null value than the corresponding RRs of death (Table 3). Consequently, their BACO indices were greater (although not statistically different) than one. One study had the same association measure for both the composite outcome and any-cause death (BACO index = 1) [28].
BACO index was lower than one in the other 16 studies [29,30,39–44,31–38], and the BACO was statistically significant in three of them. These included the study by Yasuda et al., in which a monotherapy with rivaroxaban was compared to a combination therapy with rivaroxaban plus a single antiplatelet agent [32]. The apparent effect of the monotherapy on the composite outcome (RRc: 0.74; 95% CI: 0.57 to 0.96) was lower than the effect on mortality (RRd: 0.56; 95% CI: 0.39 to 0.82), in patients with atrial fibrillation and coronary disease. Thus, the BACO index was 0.53 (95% CI: 0.21 to 0.86; p=0.0048).
The second study with statistically significant BACO was that by Lanz et al., which compared a self-expanding versus a balloon-expandable bioprosthesis for transcatheter aortic valve replacement in patients with symptomatic severe aortic stenosis [37]. The self-expanding bioprosthesis group presented an incidence of the composite outcome about 43% higher than observed in the balloon-expandable group (RRc: 1.43; 95% CI: 1.06 to 1.92). However, mortality in the first group was about three-times that of the second one (RRd: 2.96; 95% CI: 0.81 to 10.8). In this trial, the BACO index was 0.33 (95% CI: -0.1 to 0.76; p=0.002).
In the other study, Onland et al. evaluated the effect of systemic hydrocortisone compared with placebo on a composite outcome of death or bronchopulmonary dysplasia in very preterm infants [38]. The intervention was not associated with a significant change in the composite outcome incidence (RRc= 0.95; 95%CI: 0.84 to 1.08). However, the hydrocortisone group exhibited lower mortality compared to placebo (RRd: 0.65; 95% CI: 0.42 to 0.99). BACO index in this case was 0.11 (95% CI: -0.17 to 0.39; p<0.001).
Three other works were suggestive of a significant BACO, in which the RR of the composite outcome was closer to null value than the RR of mortality [33,34,39]. These studies included the clinical trials by Schuetz et al. (BACO index: 0.52; 95%CI: 0.07 to 0.98; p=0.04); Stone et al. (BACO index: 0.52; 95% CI: 0.07 to 0.98, p=0.03); and Nagel et al., which exhibited a negative point estimate of the BACO index (−0.06; 95% CI: -1.1 to 0.98, p=0.04).
In the positive controls, the CAPRICORN showed that carvedilol, compared with placebo, was not significantly associated with the composite outcome (RR = 0.94; 95% CI: 0.84 to 1.06) but was associated with a 22% reduction in mortality (BACO index: 0.24; 95% CI: 0.15 to 0.64). On the other hand, in the EXPEDITION study, the use of cariporide was associated with an 18% lower incidence of the composite outcome, but with a 53% higher mortality, compared with placebo (BACO index: -0.48; 95% CI: -1.03 to 0.08). The BACO indices of these two studies were significantly lower than one (p <0.001).
4. Discussion
Composite outcomes can make the interpretation of results challenging [1,8]. Differential effects on their less critical but more frequent components may result in a misleading impression about the impact of a treatment [2,45]. Therefore, it has been recommended that if there is a great variation between the effects on the components, the composite outcome should be abandoned [8].
However, assessing this is difficult, considering the asymmetric distribution of association measures and random variations. Some authors have evaluated the differences between the associations of both the composite outcome and mortality based on disagreement in statistical significance [1]. This type of comparison is biased because of the fatal outcome is a sub-element of the composite and will always have fewer events [4]. Therefore, there will be less power to evaluate an association with mortality.
The BACO index summarizes the relationship between the associations of the composite outcome and its most critical endpoint. Being based on the logarithms, the comparison is more consistent with the association measure distributions. On the other hand, the integration into a single index allows a unique statistical test for the null hypothesis.
Other authors stated they had planned to calculate a ratio between the efficacy for the composite outcome and that for mortality [7]. However, they considered that it would be problematic because the observations were not independent, as the death contributes to the composite. In that sense, the methodology proposed to obtain the BACO index solves this problem by considering overlapping observations [16,46,47].
Despite this, a limitation for an index based on the ratio of efficacy measures is that denominators close to zero lead to unstable or seemingly inflated results [7,48]. Hence, a BACO index should not be computed together with another whose reference effect is different. In other words, comparisons between BACO indices only make sense to contrast two or more composite outcomes when they consider the same reference value (e.g., of RR for mortality). In other circumstances, it is prudent to interpret the BACO indices only by classifying them into descriptive categories (overestimation, underestimation, and effect inversion) and considering the null hypothesis test.
In practice, the BACO index proved to be a simple measure to validate the composite outcome in clinical trials. Three of the analyzed studies had an index significantly lower than one, suggesting that the composite outcome underestimated the association between the intervention and the prognosis. Additionally, three other studies were suggestive of a similar trend, i.e., the composite outcomes seemed to dilute the associations that were stronger for mortality.
In the positive controls, the BACO index was significantly different from one. These studies have been well recognized as examples of bias associated with the composite outcome. After many discussions about the results of the CAPRICORN and EXPEDITION trials, their composite outcomes were disregard and the decisions based on the effects on all-cause mortality [9,10]. This would be consistent with the results of the BACO index.
Facing the need for objective guidelines, the BACO index may be a statistical tool to help interpret composite outcomes. Moreover, its application could be adapted according to the research context. For example, the level of significance could be adjusted depending on the desired sensitivity to identify a BACO. Furthermore, we can calculate the index from adjusted regression coefficients in observational studies that require controlling confusion.
This work is based on the expectation that the composite outcome must be in the same direction as its most relevant component. This is important when what is sought is that the composite outcome offers more events to increase power. In other cases, the composite outcome may not necessarily have a similar association magnitude as mortality. For example, when an intervention is expected to improve the quality of life without affecting total survival. However, even in these cases, a significant BACO could lead to reconsidering the necessity of a composite outcome or disaggregating the estimations for the component endpoints adequately.
5. Conclusions
The BACO index calculation could be incorporated into the analysis plan of clinical studies. Thus, based on a predefined rule, researchers could make impartial decisions about maintaining or replacing a composite outcome as the primary endpoint. Even if the researchers decide to base their conclusions on the composite outcome, a significant BACO should lead to caution that the association of the composite is stronger, weaker, or even opposite than that of its most critical component.
Data Availability
All data necessary to verify the results were included in the manuscript or are available in the papers cited in the references.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. FADQ was granted a fellowship for research productivity from the Brazilian National Council for Scientific and Technological Development – CNPq, process/contract identification: 312656/2019-0. Conflicts of interest: None.
Acknowledgments
I thank my children, Alejandro and Ana Maria, for helping me organize the bibliographic material. I also thank Dr. Kai Arzheimer, for his guidance on Stata commands; and Alene Alder-Rangel, for her help with the English edition of the article.
Appendix A. On the formula to estimate the variance of the BACO index
Let φc and φb be natural logarithms (ln) of the relative risks for the composite (RRc) and the target outcome (RRd), respectively. According the following contingency table:
We have that:
Consider the formula of the Taylor series estimator of the variance of an estimator of a ratio, υ(Ṙ), of random variables (Y and X, being Ṙ = Ẏ/Ẋ), according to which:
Where υ(Ẏ) and υ(Ẋ) are variances of the estimators of Y and X, respectively, and Ẏ, Ẋ is the covariance between them. Similarly, to calculate the variance BACO index,, we have:
Where υ(φc) and υ(φd) are the variances of φc and φd, respectively, and c(φc, φd), is their covariance.
Now, let a term φc−d such as:
Consequently, variance estimators of φc, φd, and φc−d would be respectively:
By considering the additive properties of variance and covariance, we have that
Moreover, because
Then,
Consequently, the estimator of the variance of the BACO index can be
A 95% confidence interval for the BACO index is
And the statistic Z with normal distribution to test the null hypothesis (BACO index = 1) is
Appendix B. Stata code to estimate and test the BACO index by combining regression parameters
Suppose we have a dataset with exposure or treatment variable “t”, composite outcome variable “c”, and the most critical endpoint variable “d” (i.e., death). For this example, take the data of the CAPRICORN study, which compared 975 patients allocated to experimental intervention vs. 984 patients in the control group [20]. The composite outcome occurred in 340 and 365 patients, including 116 and 151 deaths, respectively. Consequently, the RRs were 0.94 (95% CI: 0.84 – 1.06) for the composite outcome and 0.78 (95% CI: 0.62 – 0.97) for death. Before calculating the BACO index based on the relative risks, we should store and integrate the regression coefficients as described in the code that follows:
poisson c t
est store coefcomp
poisson d t
est store coefdeath
suest coefcomp coefdeath
In this example, the coefficients generated by the regression models were stored under the names “coefcomp” and “coefdeath”. Then, by using “suest” (seemingly unrelated estimation) command, we combine the stored parameter estimates and associated (co)variance matrices. Now we can use “nlcom” as described in the code that follows. The output (presented in the box) will include the estimate of the BACO index, its standard error, and a 95% confidence interval.
Please note that the previous p-value (0.23) refers to the null hypothesis of the BACO index equal to zero (therefore, it should not be considered). To calculate the p-value for the null hypothesis of the BACO index equals one, we can use the “testnl” command:
Appendix C. Comparison of methods to calculate the standard error of the BACO index
Consider the four hypothetic composite outcomes described in the simulation (section 2.2). Table C.1 presents the standard error of each BACO index, , calculated both “directly” from the contingency tables, using formulas described in Appendix A, and by the nonlinear combination of regression parameters (NLCRP), using the tools described in Appendix B.
In these examples, the two methods were virtually equivalent, and differences between them were observed only from the fourth decimal place (at the order of ten thousandths).
Footnotes
The manuscript went through a peer-review process and important elements were added, including formulas to estimate and test the BACO index.