Introduction
What is new?
- Most statistically significant results from meta-analyses of clinical trials are more likely to reflect truly nonnull effects than false-positive results.
- When meta-analyses are updated, their credibility is more likely to increase than to decrease.
- Data added to existing meta-analyses in a 5-year window (2005–2010) indicated less prominent effects than did the summary estimates in 2005.
- The median fold change in these summary estimates was 0.85, but the reduction was greater for meta-analyses with less cumulative data (median fold change of 0.67).
Meta-analyses are often considered the highest level of evidence for evaluating interventions in health care [1], [2] and are very influential in the literature and in practice [3]. However, there has been some debate about whether meta-analyses provide reliable evidence. For example, in an analysis that stirred intense discussion and criticism, LeLorier et al. [4] evaluated 19 meta-analyses and pointed out that they had only modest ability to predict the results of subsequent large clinical trials. Meta-analyses with limited evidence, biased studies, and poor-quality trials are considered more prone to unreliable results [5], [6], [7], [8], [9], [10]. Other investigators have pointed out that the current interpretation of statistically significant results in meta-analyses ignores the fact that studies are added one at a time, so more conservative rules are needed to claim statistical significance [7], [10]. When corrections for sequential testing are made, many statistically significant meta-analyses lose their nominal significance [11].
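The concern about adding studies one at a time can be illustrated with a small simulation. The sketch below is only an illustration under simplifying assumptions that are not taken from the cited work: the true effect is exactly null, every study has the same precision, and the pooled z-score is recomputed with an unadjusted threshold of 1.96 after each new study. The function name `ever_significant_rate` and its parameters are hypothetical.

```python
import math
import random


def ever_significant_rate(n_studies=10, n_sims=20000, seed=1):
    """Share of simulated null meta-analyses that cross |z| > 1.96 at any update.

    Assumption: each study contributes an independent standard-normal z-score
    with equal weight, so the pooled z after k studies is sum(z_i) / sqrt(k).
    """
    rng = random.Random(seed)
    z_crit = 1.96  # nominal two-sided 5% threshold, applied at every look
    hits = 0
    for _ in range(n_sims):
        running_sum = 0.0
        for k in range(1, n_studies + 1):
            running_sum += rng.gauss(0.0, 1.0)  # new study under the null
            pooled_z = running_sum / math.sqrt(k)
            if abs(pooled_z) > z_crit:
                hits += 1  # declared "significant" at this update
                break
    return hits / n_sims


if __name__ == "__main__":
    for looks in (1, 5, 10):
        print(looks, ever_significant_rate(n_studies=looks))
```

With a single look, the false-positive rate stays near the nominal 5%; as the number of unadjusted looks grows, the share of null meta-analyses that are significant at some point rises above it. This is the reason the text gives for needing more conservative rules.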
Given these concerns, clinicians, patients, and policy makers are left uncertain about how to interpret a meta-analysis that has a P-value < 0.05 and a 95% confidence interval (CI) that excludes the null. How likely is it that there is some genuine treatment effect rather than a “false positive”? Moreover, if there is some effect, is the statistically significant meta-analysis estimate reliable or inflated and, if inflated, by how much? Clinicians and policy makers often use nominal statistical significance as a first prerequisite before even considering an intervention for implementation. They may then also ask for a sufficiently large treatment effect size. However, there is evidence from diverse fields that, when one focuses on statistically significant results that pass a given threshold of significance (e.g., P < 0.05), some of them are false positives [5] and effect size estimates are inflated on average because of the winner’s curse phenomenon [12]. The winner’s curse arises when results are selected because they cross a threshold of significance and the same results are then used to estimate the effect size. It is then mathematically expected that, on average, these estimates are exaggerated [12]. The extent of inflation of effect sizes varies substantially across studies and scientific fields and is more prominent when the sample size is smaller [12], [13], [14]. False positives and inflation of effects in meta-analyses of clinical trials require more systematic study, because both could create misleading impressions about an intervention and lead to wrong treatment choices.
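The winner's curse described above can be sketched with a short simulation. This is a minimal illustration under assumed values, not an analysis from this study: estimates are drawn from a normal distribution around a hypothetical true effect of 0.2, with a standard error that stands in for sample size, and only estimates with |z| > 1.96 are kept. The function name `simulate_winners_curse` and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_winners_curse(true_effect=0.2, se=0.1, n_sims=20000, seed=0):
    """Compare the mean of all estimates with the mean of the significant ones.

    Assumption: each estimate is Normal(true_effect, se); a smaller `se`
    corresponds to a larger sample size.
    """
    rng = random.Random(seed)
    z_crit = 1.96  # nominal two-sided 5% threshold
    all_estimates = [rng.gauss(true_effect, se) for _ in range(n_sims)]
    # Selection step: keep only estimates that reach nominal significance.
    significant = [est for est in all_estimates if abs(est / se) > z_crit]
    return {
        "mean_all": statistics.fmean(all_estimates),
        "mean_significant": statistics.fmean(significant),
        "share_significant": len(significant) / n_sims,
    }


if __name__ == "__main__":
    for se in (0.1, 0.05):
        res = simulate_winners_curse(se=se)
        inflation = res["mean_significant"] / 0.2  # ratio to the true effect
        print(f"se={se}: inflation among significant results = {inflation:.2f}")
```

The mean over all simulated estimates stays close to the true effect, whereas the mean over the significant subset exceeds it. The gap narrows when the standard error is smaller, which matches the statement that inflation is more prominent with smaller sample sizes.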
Here, we evaluated empirically whether nominally statistically significant results in meta-analyses of clinical trials are credible and whether the effect sizes from such meta-analyses are inflated. We estimated the credibility (the posterior probability of a true-positive result) of independent meta-analyses in the Cochrane Database of Systematic Reviews (CDSR) that were nominally statistically significant in late 2005. Then, we evaluated the change in credibility for those meta-analyses to which data from additional trials had been added by early 2010. Moreover, we estimated whether the added data suggested smaller effects than the initial meta-analyses.