Plea for routinely presenting prediction intervals in meta-analysis

Joanna IntHout; John P A Ioannidis; Maroeska M Rovers; Jelle J Goeman

doi:10.1136/bmjopen-2015-010247

Joanna IntHout¹,
John P A Ioannidis²,³,⁴,⁵,
Maroeska M Rovers¹,
Jelle J Goeman¹

¹Radboud University Medical Center, Radboud Institute for Health Sciences (RIHS), Nijmegen, The Netherlands
²Department of Medicine, Stanford Prevention Research Center, Stanford University School of Medicine, Stanford, California, USA
³Department of Health Research and Policy, Stanford University School of Medicine, Stanford, California, USA
⁴Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, California, USA
⁵Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, California, USA

Correspondence to Dr Joanna IntHout; Joanna.IntHout@radboudumc.nl

Abstract

Objectives Evaluating the variation in the strength of the effect across studies is a key feature of meta-analyses. This variability is reflected by measures like τ² or I², but their clinical interpretation is not straightforward. A prediction interval is less complicated: it presents the expected range of true effects in similar studies. We aimed to show the advantages of having the prediction interval routinely reported in meta-analyses.

Design We show how the prediction interval can help understand the uncertainty about whether an intervention works or not. To evaluate the implications of using this interval to interpret the results, we selected the first meta-analysis per intervention review of the Cochrane Database of Systematic Reviews Issues 2009–2013 with a dichotomous (n=2009) or continuous (n=1254) outcome, and generated 95% prediction intervals for them.

Results In 72.4% of 479 statistically significant (random-effects p<0.05) meta-analyses in the Cochrane Database 2009–2013 with heterogeneity (I²>0), the 95% prediction interval suggested that the intervention effect could be null or even be in the opposite direction. In 20.3% of those 479 meta-analyses, the prediction interval showed that the effect could be completely opposite to the point estimate of the meta-analysis. We demonstrate also how the prediction interval can be used to calculate the probability that a new trial will show a negative effect and to improve the calculations of the power of a new trial.

Conclusions The prediction interval reflects the variation in treatment effects over different settings, including what effect is to be expected in future patients, such as the patients that a clinician is interested to treat. Prediction intervals should be routinely reported to allow more informative inferences in meta-analyses.

Meta-analysis
Prediction interval
Heterogeneity
Random effects
Clinical trial
Cochrane Database of Systematic Reviews

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/

Strengths and limitations of this study

In many meta-analyses, there is large variation in the strength of the effect.
The prediction interval helps in the clinical interpretation of the heterogeneity by estimating what true treatment effects can be expected in future settings.
In case of heterogeneity, prediction intervals will show a wider range of expected treatment effects than CIs, and thus may lead to different conclusions. This occurred in over 70% of statistically significant meta-analyses with heterogeneity of the Cochrane Database of Systematic Reviews. Completely opposite effects were not excluded in over 20% of those meta-analyses.
Prediction intervals should be routinely reported to allow more informative inferences in meta-analyses.
Limitations are that the calculations and inferences for the prediction interval are based on the normality assumption, which is difficult to ensure. Further, the interval will be imprecise if the estimates of the summary effect and the between-study heterogeneity are imprecise, for example, if they are based on only a few, small studies. Inferences based on the prediction interval are only valid for settings that are similar (exchangeable) to those on which the meta-analysis is based.

Introduction

Interventions may have heterogeneous effects across studies because of differences in study populations, interventions, follow-up length or other factors like publication bias.1 Nevertheless, the usual reporting of a meta-analysis is focused on the summary effect size combined with a CI and p value. Typically, some measure of the between-study heterogeneity is also presented, such as τ² or the inconsistency measure I².2,3 However, neither of these two metrics can readily point to the clinical implications of the observed heterogeneity. Our objective in the current article is to show the potential advantages of obtaining and reporting the prediction interval routinely in meta-analyses because its clinical meaning is much more straightforward. The prediction interval presents the heterogeneity in the same metric as the original effect size measure, in contrast to τ² or I². Reporting a prediction interval in addition to the summary estimate and CI will illustrate which range of true effects can be expected in future settings. We describe its merits and provide working examples to show how it can be calculated.

Methods

Interpretation of heterogeneity

Between-study variation in the magnitude of treatment effects cannot be neglected. One of the main merits of a meta-analysis may even be that it reveals the variation of effects in different studies.4 Therefore, summarising the findings of a meta-analysis in a single summary value sacrifices potentially informative variation.5 However, the information that can be directly retrieved from τ² and I² with respect to the variation in the effects is limited. The clinical interpretation of I² is ambiguous: a high I² does not necessarily imply that the study effects are dispersed over a wide range6 and a low I² might correspond to high dispersion,7 because I² depends on the sample size of the included studies.8 With very large (highly precise) studies, even tiny differences in effect size may result in a high I², while with small (imprecise) studies, very different treatment effects can yield an I² of 0. Dispersion in treatment effects is better reflected by τ because τ is the SD of the between-study effects. One could, for example, estimate the ratio of the effect size over τ, which can convey how many times larger the treatment effect is compared with the SD of the effect across studies.9 But this may still not be very intuitive to a clinical reader. Another popular way to express variation in effect sizes is the CI, for example, the 95% CI. The CI in a random-effects model contains highly probable values for the summary treatment effect. However, it does not convey what range of treatment effects is likely to be seen in other patients, for example, in the next study or in the patients a clinician wants to treat in her clinic.
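The dependence of I² on study precision can be illustrated with a minimal sketch. It uses the standard definitions of Cochran's Q and I² = max(0, (Q − df)/Q); the five effect sizes and the two sets of within-study variances are hypothetical values chosen for illustration, not data from the article.

```python
def q_and_i2(effects, variances):
    """Return Cochran's Q and I² (as a fraction) for inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect (inverse-variance) pooled estimate
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


# Hypothetical study effects: identical in both scenarios
effects = [0.10, 0.20, 0.30, 0.40, 0.50]

# Scenario 1: very large, precise studies (small within-study variance)
q_large, i2_large = q_and_i2(effects, [0.001] * 5)

# Scenario 2: small, imprecise studies (large within-study variance)
q_small, i2_small = q_and_i2(effects, [0.25] * 5)

print(f"precise studies:   Q = {q_large:.1f}, I2 = {i2_large:.0%}")
print(f"imprecise studies: Q = {q_small:.1f}, I2 = {i2_small:.0%}")
```

With the same observed effects, I² is close to 100% for the precise studies and 0 for the imprecise ones, so I² alone says little about how widely the effects are dispersed.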

Prediction intervals

Not so often reported but much more insightful is the prediction interval.10 A prediction interval always presents the heterogeneity on the same scale as the original outcomes, in contrast to τ (eg, in case of ORs), τ² or I². A 95% prediction interval estimates where the true effects are to be expected for 95% of similar (exchangeable) studies that might be conducted in the future.4 Therefore, it is well suited to evaluate the variability of the effect of an intervention over different settings. For example, in a meta-analysis on sedentary time in adults and the association with diabetes, cardiovascular disease and death, CIs were thought to represent the different study populations insufficiently. Therefore, prediction intervals were also reported.11 In the absence of between-study heterogeneity, the prediction interval coincides with the respective CI. However, in case of heterogeneity, a prediction interval covers a wider range than a CI. Consequently, in case of a statistically significant effect (where all values of the 95% CI are on the same side of the null), the corresponding 95% prediction interval may indicate that values are possible on both sides of the null. This means that there will be settings where conclusions based on CIs will not hold. In the same framework, one can also calculate the probability that the true effect will be harmful (on the other side of the null) in a next study. Table 1 presents an overview of measures of between-study heterogeneity.

Table 1. Some frequently used measures for heterogeneity

Example: topical steroids for nasal polyps

A 2012 review on the use of topical steroids for treatment of chronic rhinosinusitis with nasal polyps, based on seven randomised studies, found a larger decrease in overall symptom scores with steroids than with placebo.12 This is reflected by a standardised mean difference (SMD) of −0.51, with a 95% CI of −0.96 to −0.07 (figure 1). The I² is 73.9% (95% CI 44.2% to 87.8%), which can be considered substantial heterogeneity,13 and the estimated τ² is 0.148. Notwithstanding these numbers, it is difficult to evaluate what the clinical consequences of this heterogeneity may be for future settings.

Figure 1

Forest plot of the standardised mean difference (SMD) in symptom scores in nasal polyps. Steroids versus placebo, analysis 1.1 in Cochrane Review CD006549.12 Note that our results differ from the original analysis, as we used a random-effects analysis with the Hartung-Knapp/Sidik-Jonkman adjustment16 and the empirical Bayes estimator for τ².

In order to estimate the prediction interval for the SMD, we need the point estimate of the SMD, its SE and the estimated τ². We derive the SE from the 95% CI of the SMD (see online supplementary appendix formula 1), which results in an SE of 0.227. We can calculate the SD of the prediction interval as SD_PI = √(0.148 + 0.227²) ≈ 0.447 and the lower and upper limits of the 95% prediction interval as −0.51 ± 2.45 × SD_PI. The value 2.45 is the 1−0.05/2 quantile of the t-distribution with 6 degrees of freedom. Prediction intervals with a different coverage could be calculated by using a different t-value, for example, the 1−0.20/2 quantile of the t-distribution with 6 degrees of freedom for an 80% prediction interval (see online supplementary appendix formula 1).

supplementary appendix

[bmjopen-2015-010247supp_appendix.pdf]

The resulting prediction interval, ranging from −1.60 to 0.58, can be interpreted as the 95% range of true SMDs to be expected in similar studies. We present it in figure 1 as a rectangle below the diamond for the 95% CI.14 The prediction interval contains values below zero, which correspond to a decrease in symptom scores of at best ∼1.6 SD after steroid use compared with placebo. But it also contains values above zero, which means that the steroids may exhibit no effect or even a harmful effect (SMD>0) in some settings, with a (95%) worst case increase in SMD of 0.58. Consequently, the effect in a new study may even be the exact opposite of the summary point estimate of the meta-analysis, that is, an increase of 0.51 instead of a decrease of 0.51 may occur. The estimated probability that the true effect of the steroids will be null or higher in a new study is equal to 14.7%, based on the t-distribution with 6 degrees of freedom (see online supplementary appendix formula 2).
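The calculation for the nasal polyps example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the supplementary formulas are not reproduced here, so the sketch assumes that the SE is recovered from the 95% CI with the normal quantile 1.96 (which reproduces the reported 0.227) and that k − 1 = 6 degrees of freedom are used, as in the text. The helper names (t_pdf, t_cdf, t_quantile, se_from_ci, prediction_interval) are hypothetical, and the t-distribution is evaluated numerically to avoid external dependencies.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x), by Simpson's rule on [0, |x|] plus symmetry around 0."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p, df):
    """Quantile of the t-distribution by bisection on the CDF."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def se_from_ci(lower, upper, z=1.959964):
    """SE implied by a 95% CI, assuming a normal-quantile CI (assumption)."""
    return (upper - lower) / (2 * z)


def prediction_interval(estimate, se, tau2, k, level=0.95):
    """Prediction interval with k - 1 degrees of freedom, as in the example."""
    sd_pi = math.sqrt(tau2 + se ** 2)
    t_crit = t_quantile(1 - (1 - level) / 2, k - 1)
    return estimate - t_crit * sd_pi, estimate + t_crit * sd_pi, sd_pi


# Values reported in the text for the nasal polyps meta-analysis
se = se_from_ci(-0.96, -0.07)
low, high, sd_pi = prediction_interval(-0.51, se, tau2=0.148, k=7)

# Probability that the true effect in a new study is null or higher (SMD >= 0)
p_null_or_higher = 1 - t_cdf((0 - (-0.51)) / sd_pi, 6)

print(f"SE = {se:.3f}, 95% PI = {low:.2f} to {high:.2f}")
print(f"P(true SMD >= 0) = {p_null_or_higher:.3f}")
```

With the rounded inputs from the text, the sketch returns an SE of 0.227 and a 95% prediction interval of −1.60 to 0.58. The probability of a null or harmful true effect comes out at roughly 15%, close to the 14.7% reported; the small gap presumably reflects rounding of the inputs.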

Cochrane database

In order to investigate how often conclusions based on prediction intervals differ from those based on CIs, we evaluated this in statistically significant meta-analyses (p<0.05 by random-effects calculations) of the Cochrane Database of Systematic Reviews Issues 2009–2013, kindly provided by the UK Cochrane Editorial Unit. To avoid subjectivity in the selection, we used the first meta-analysis in the data and analyses section with a dichotomous or continuous outcome that was based on at least two studies, provided these studies were also combined in the original review, as we wanted to reflect the status quo as precisely as possible. Details can be found in another paper.15 In brief, of a total of 3263 meta-analyses, 920 were statistically significant: 479 with an estimated I²>0 and 441 with an estimated I²=0.

Calculations

We used the Hartung-Knapp/Sidik-Jonkman16 (HKSJ) random-effects meta-analysis approach combined with the empirical Bayes estimator for τ². We estimated τ² for all meta-analyses, even when the authors originally performed a fixed-effects analysis. Prediction intervals were calculated according to online supplementary appendix formula 1. We categorised the statistically significant meta-analyses with heterogeneity (τ²>0) by number of studies (2–6 studies or >6) and heterogeneity (I²<30%, 30% to 60% or >60%, based on the Cochrane Handbook13 stating that an I² between 30% and 60% corresponds to moderate heterogeneity). For significant meta-analyses where the heterogeneity estimate was zero, we assessed the impact of possibly low but non-zero heterogeneity by assuming an I² of 20%, calculating prediction intervals using online supplementary appendix formula 3. Categorical outcomes were compared between groups by means of the χ² test. We used R software V.3.1.2 (R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2014. http://www.R-project.org/) and the R packages metafor17 V.1.9-5 and meta V.4.1-0 (meta: General Package for Meta-Analysis, 2015. http://CRAN.R-project.org/package=meta).

Results

Overall, 132 (27.6%) of the 479 statistically significant meta-analyses with an I²>0 had both the 95% CI and the 95% prediction interval excluding the null effect (table 2). Consequently, almost three-quarters (347, 72.4%) had a prediction interval that contained the null effect. This means that, for these comparisons, some patient populations might well experience null effects or effects in the opposite direction, that is, a treatment might be more harmful than the comparator even though the point estimate suggests benefit (or vice versa). Not surprisingly, significant meta-analyses with low heterogeneity more often had prediction intervals that excluded the null than meta-analyses with high heterogeneity. The percentage of prediction intervals containing the null effect was higher for meta-analyses with a continuous outcome (80.4%) than for those with a dichotomous outcome (65.8%; p<0.001), but did not differ significantly between meta-analyses based on more than six studies (74.1%) and those with at most six studies (69.1%; p=0.25; web table W1).

Table 2. Proportion of statistically significant meta-analyses where both the 95% CIs and PIs excluded the null

supplementary table

[bmjopen-2015-010247supp_table.pdf]

Of the 347 meta-analyses with a prediction interval that contained the null or opposite effect, 199 (57.3%) had also at least one study with an opposite effect. This happened more often in meta-analyses with more than six studies (181/235, 77.0%) than in those based on at most six studies (18/102, 17.6%). Especially in meta-analyses with few studies and substantial heterogeneity, the prediction interval was wider than the range of study outcomes. The opposite (ie, a smaller prediction interval) occurred in meta-analyses based on many studies and with low estimated heterogeneity. Results for meta-analyses with dichotomous and continuous outcomes were not notably different.

Prediction intervals containing the opposite effect

If the prediction interval just includes the null effect, this may be less worrying than when it contains the exact opposite effect of the pooled summary effect, for example, if it contains an OR of 0.5 when the meta-analysis summary estimate is an OR of 2, or if it contains an SMD of −0.7 when the summary estimate was 0.7. Of the 479 significant meta-analyses with an I²>0, 97 (20.3%) had a prediction interval that contained the opposite effect. This percentage was higher for the meta-analyses with a continuous outcome (65/219, 29.7%) than for those with a dichotomous outcome (32/260, 12.3%; p<0.001). It also occurred more frequently in meta-analyses with more than six primary studies (57/139, 41.0% and 30/178, 16.9% for meta-analyses with a continuous or dichotomous outcome, respectively) than in those based on at most six studies (8/80, 10.0% and 2/82, 2.4%; p<0.001 and p=0.001, respectively).

Meta-analyses with estimated I²=0

A substantial part of meta-analyses have an estimated I² of 0. However, there is typically very large uncertainty about the exact amount of heterogeneity, and this is demonstrated by very large 95% CIs for the values of I².18 The same applies to τ: an estimate of 0 is often accompanied by large uncertainty. The true I² and τ are unlikely to ever be exactly 0, although low values are possible. To assess the impact of possibly low but non-zero heterogeneity among the 441 Cochrane meta-analyses with estimated I²=0 and statistically significant results, we imputed an I²=20% (suggestive of low between-study heterogeneity). Under this assumption, in 329 (74.6%) of these 441 meta-analyses the 95% prediction interval would span both sides of the null (table 2), similar for meta-analyses with a dichotomous (74.7%) or continuous (74.4%) outcome (web table W1). This is a sensitivity analysis that is useful to perform to see whether the inferences of a meta-analysis that seemingly does not have detectable heterogeneity may be influenced by even a small amount of heterogeneity.
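One way to carry out such a sensitivity analysis is sketched below. The supplementary formula 3 used in the article is not reproduced here, so this sketch assumes the common relation I² = τ²/(τ² + s²), where s² is the "typical" within-study variance of Higgins and Thompson, and it recomputes the SE of the summary effect with standard inverse-variance random-effects weights rather than with the HKSJ adjustment. The function names and the three within-study variances are hypothetical.

```python
import math


def typical_within_variance(variances):
    """'Typical' within-study variance s² (Higgins and Thompson definition)."""
    k = len(variances)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    return (k - 1) * sum_w / (sum_w ** 2 - sum(x * x for x in w))


def tau2_from_i2(variances, i2):
    """Between-study variance implied by an assumed I² (as a fraction)."""
    return i2 / (1.0 - i2) * typical_within_variance(variances)


# Hypothetical within-study variances of a meta-analysis with estimated I² = 0
variances = [0.04, 0.05, 0.06]

s2 = typical_within_variance(variances)
tau2 = tau2_from_i2(variances, 0.20)  # impute I² = 20%

# SE of the summary effect under the imputed tau² (plain random-effects
# weights; an assumption, the article itself uses the HKSJ approach)
weights_re = [1.0 / (v + tau2) for v in variances]
se_summary = math.sqrt(1.0 / sum(weights_re))

# SD of the prediction interval, as in the nasal polyps example
sd_pi = math.sqrt(tau2 + se_summary ** 2)

print(f"s2 = {s2:.4f}, imputed tau2 = {tau2:.4f}")
print(f"SE(summary) = {se_summary:.3f}, SD_PI = {sd_pi:.3f}")
```

The resulting SD_PI can then be multiplied by the appropriate t-value to obtain the prediction interval under the assumed heterogeneity.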

Discussion and outlook

In meta-analyses, a CI is inadequate for clinical decision-making because it only summarises the average effect for the average study. The prediction interval is more informative as it shows the range of possible effects in relation to harm and clinical benefit thresholds. While we have focused on the situation where the separating threshold is the null, a different threshold may be considered. For example, in the prediction interval framework, one can calculate the probability that an effect is larger than B, where B may be a clinically meaningful effect (if the treatment benefit is less than B, then it is felt not to be worth it). A narrow prediction interval that lies completely on the beneficial side of a clinically relevant threshold increases confidence in an intervention. A broad prediction interval may indicate the existence of settings where the treatment has a suboptimal and possibly even harmful effect. In more than 70% of statistically significant meta-analyses of the Cochrane Database with some estimated or assumed between-study heterogeneity, the prediction intervals crossed the no-effect threshold, indicating that there are settings where those treatments will have no effect or even an effect in the opposite direction. In 20.3% of those meta-analyses, the prediction interval even contained the opposite effect of the summary estimate, for example, an OR of 0.5 when the summary point estimate was an OR of 2. This occurred most frequently for meta-analyses with a continuous outcome, probably because heterogeneity can be more prominent in many topics where outcomes are assessed on continuous scales; higher heterogeneity for the continuous outcomes was also observed in the full set of 3263 meta-analyses.15 It was also more common for meta-analyses based on more than six studies, probably because such meta-analyses have more power to detect smaller effects, which means that the opposite effects will also be smaller.

Graham and Moran19 evaluated prediction intervals in 72 meta-analyses with a dichotomous outcome in critical care published between 2002 and 2010. They found a higher percentage of significant meta-analyses (50/72, 69.4%), compared with 28.5% (572/2009) in our set of meta-analyses with an OR outcome. The difference may be caused by publication bias, the higher number of primary studies in their sample (median 9 vs 4 in our set15) and by their use of the DerSimonian-Laird approach which can result in too many statistically significant findings, whereas we used the HKSJ approach.16 However, results with respect to the prediction interval were remarkably similar. In 32 (64.0%) of their 50 significant meta-analyses, the 95% prediction interval included the null, similar to 65.8% in our data set. Seven (14.0%) of their 50 meta-analyses suggested a high probability of exact reversal of the efficacy or harm, similar to 12.3% of our meta-analyses where the prediction interval contained the opposite effect, despite the fact that they used a different definition for possible ‘harm’ and that they did not mention whether there was positive between-study heterogeneity in their significant meta-analyses.

It is straightforward to calculate a prediction interval if we can assume that the effects are normally distributed and that τ² is known and stable across studies. However, one should realise that the prediction interval is dependent on this assumption and on the precisions of the estimated τ² and study effect, and will be imprecise if the number of studies in the meta-analysis is small. If the number of studies is large, estimates will be more precise and the normality of the distribution of the effects can be empirically evaluated. A final caveat is that the uncertainty conveyed by the prediction interval pertains to the uncertainty about the extent to which future studies are similar (exchangeable) to those that have already been done, but this applies to all inferences from a meta-analysis. If the future studies evaluate patients and settings that are entirely different from what was evaluated in past studies, this exchangeability is questionable and uncertainty may be even more prominent than what the prediction interval conveys. In practical terms, if the patients treated by a physician are considered to be very different from the patients seen in all studies that have been done in the past, even the prediction interval cannot tell us what we might expect for these patients.

Power calculations for a future study

Meta-analysis results can also be used for power calculations for a new study. However, the expected true effect in a new study is not necessarily equal to the point estimate of the meta-analysis: it can be any of the values in the prediction interval. In case of heterogeneity, the probability of a statistically significant result in a new study may differ substantially from an apparent power of 80% based on the point estimate. The latter will be overly optimistic because the power function is asymmetric. If the true study effect is larger than the point estimate, the real probability of a significant study will be higher, up to a maximum of 100%, but if the effect is smaller, the probability may decrease substantially, even to 5% or less in case of a null effect. Consequently, the expected probability of a significant new study in case of heterogeneity will be lower than 80% (online supplementary appendix formula 4). For example, if the prediction interval shows that 30% of future studies may have a true null or negative effect, the probability of a significant new study can never be much larger than 70%. The sample size should be increased to compensate for this loss; see also Roloff et al.20
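The loss of power under heterogeneity can be illustrated with a simplified sketch. It is not the supplementary formula 4, which is not reproduced here. It assumes that the true effect in the new study is normally distributed around the summary estimate with SD τ, ignores the uncertainty in the summary estimate and in τ, uses a two-sided z-test at α=0.05, and counts only significant results in the direction of the summary effect. Under these assumptions the new estimate is marginally normal with variance τ² + SE², which gives a closed form. The function names are hypothetical; the inputs 0.51 and τ²=0.148 are taken from the nasal polyps example.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def se_for_nominal_power(effect, power=0.80, alpha=0.05):
    """SE a new study needs for the nominal power if the true effect were fixed."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return abs(effect) / (z_alpha + z_power)


def expected_power(effect, tau, se_new, alpha=0.05):
    """Probability of a significant result in the expected direction when the
    true effect of the new study is drawn from N(effect, tau²)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Marginally, the new estimate has mean |effect| and variance tau² + se_new²;
    # it is significant when it exceeds z_alpha * se_new.
    return STD_NORMAL.cdf(
        (abs(effect) - z_alpha * se_new) / math.sqrt(tau ** 2 + se_new ** 2)
    )


# Study sized for 80% power at the summary estimate of the example (|SMD| = 0.51)
se_new = se_for_nominal_power(0.51)

power_no_heterogeneity = expected_power(0.51, tau=0.0, se_new=se_new)
power_with_heterogeneity = expected_power(0.51, tau=math.sqrt(0.148), se_new=se_new)

print(f"no heterogeneity:   {power_no_heterogeneity:.2f}")
print(f"with tau^2 = 0.148: {power_with_heterogeneity:.2f}")
```

Without heterogeneity the sketch returns the nominal 80%; with the heterogeneity of the example, the expected power of the same study drops well below that, which is why the sample size would need to be increased.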

Summarising, the prediction interval reflects the variation in true treatment effects over different settings, including what effect is to be expected in future patients, such as the patients that a clinician is interested to treat. Therefore, it should be routinely reported in addition to the summary effect and its CI, and used as a main tool for interpreting evidence, to enable more informed clinical decision-making.

References

1. Riley RD, Higgins JPT, Deeks JJ. Interpretation of random effects meta-analyses. BMJ 2011;342:d549. doi:10.1136/bmj.d549
2. Thompson SG, Higgins JPT. How should meta-regression analyses be undertaken and interpreted? Stat Med 2002;21:1559–73. doi:10.1002/sim.1187
3. Higgins JP, Thompson SG, Deeks JJ, et al. Measuring inconsistency in meta-analyses. BMJ 2003;327:557–60. doi:10.1136/bmj.327.7414.557
4. Higgins JP, Thompson SG, Spiegelhalter DJ. A re-evaluation of random-effects meta-analysis. J R Stat Soc Ser A Stat Soc 2009;172:137–59. doi:10.1111/j.1467-985X.2008.00552.x
5. Saha S, Chant D, McGrath J. Meta-analyses of the incidence and prevalence of schizophrenia: conceptual and methodological issues. Int J Methods Psychiatr Res 2008;17:55–61. doi:10.1002/mpr.240
6. Borenstein M, Hedges LV, Higgins JPT, et al. Introduction to meta-analysis. Chichester, UK: Wiley, 2009.
7. Melsen WG, Bootsma MCJ, Rovers MM, et al. The effects of clinical and statistical heterogeneity on the predictive values of results from meta-analyses. Clin Microbiol Infect 2014;20:123–9. doi:10.1111/1469-0691.12494
8. Rücker G, Schwarzer G, Carpenter JR, et al. Undue reliance on I² in assessing heterogeneity may mislead. BMC Med Res Methodol 2008;8:79. doi:10.1186/1471-2288-8-79
9. Moonesinghe R, Khoury MJ, Liu T, et al. Required sample size and nonreplicability thresholds for heterogeneous genetic associations. Proc Natl Acad Sci USA 2008;105:617–22. doi:10.1073/pnas.0705554105
10. Chiolero A, Santschi V, Burnand B, et al. Meta-analyses: with confidence or prediction intervals? Eur J Epidemiol 2012;27:823–5. doi:10.1007/s10654-012-9738-y
11. Wilmot EG, Edwardson CL, Achana FA, et al. Sedentary time in adults and the association with diabetes, cardiovascular disease and death: systematic review and meta-analysis. Diabetologia 2012;55:2895–905. doi:10.1007/s00125-012-2677-z
12. Kalish L, Snidvongs K, Sivasubramaniam R, et al. Topical steroids for nasal polyps. Cochrane Database Syst Rev 2012;12:CD006549. doi:10.1002/14651858.CD006549.pub2
13. Higgins JPT, Green S, Cochrane Collaboration. Cochrane handbook for systematic reviews of interventions. Wiley Online Library, 2008.
14. Guddat C, Grouven U, Bender R, et al. A note on the graphical presentation of prediction intervals in random-effects meta-analyses. Syst Rev 2012;1:34. doi:10.1186/2046-4053-1-34
15. IntHout J, Ioannidis JPA, Borm GF, et al. Small studies are more heterogeneous than large ones: a meta-meta-analysis. J Clin Epidemiol 2015;68:860–9. doi:10.1016/j.jclinepi.2015.03.017
16. IntHout J, Ioannidis JP, Borm GF. The Hartung-Knapp-Sidik-Jonkman method for random effects meta-analysis is straightforward and considerably outperforms the standard DerSimonian-Laird method. BMC Med Res Methodol 2014;14:25. doi:10.1186/1471-2288-14-25
17. Viechtbauer W. Conducting meta-analyses in R with the metafor package. J Stat Software 2010;36:1–48. doi:10.18637/jss.v036.i03
18. Ioannidis JPA, Patsopoulos NA, Evangelou E. Uncertainty in heterogeneity estimates in meta-analyses. BMJ 2007;335:914–16. doi:10.1136/bmj.39343.408449.80
19. Graham PL, Moran JL. Robust meta-analytic conclusions mandate the provision of prediction intervals in meta-analysis summaries. J Clin Epidemiol 2012;65:503–10. doi:10.1016/j.jclinepi.2011.09.012
20. Roloff V, Higgins JP, Sutton AJ. Planning future studies based on the conditional power of a meta-analysis. Stat Med 2013;32:11–24. doi:10.1002/sim.5524

Footnotes

Contributors JI originated the idea for this study together with JJG. JI drafted the manuscript and conducted the data analysis. All authors read and critically revised the manuscript for important intellectual content and approved the final manuscript.
Funding This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement Data sets are available on request from the corresponding author.
