Elsevier

Controlled Clinical Trials

Volume 22, Issue 2, February 2001, Pages 102-110
Calculating Confidence Intervals for the Number Needed to Treat

https://doi.org/10.1016/S0197-2456(00)00134-3

Abstract

The number needed to treat (NNT) has gained much attention in recent years as a useful way of reporting the results of randomized controlled trials with a binary outcome. Defined as the reciprocal of the absolute risk reduction (ARR), NNT is the estimated average number of patients who need to be treated to prevent an adverse outcome in one additional patient. As with other estimated effect measures, it is important to document the uncertainty of the estimate by means of an appropriate confidence interval. Confidence intervals for NNT can be obtained by inverting and exchanging the confidence limits for the ARR, provided that the NNT scale ranging from 1 through ∞ to −1 is taken into account. Unfortunately, the only method used in practice to calculate confidence intervals for ARR seems to be the simple Wald method, which yields confidence intervals that are too short in many cases. In this paper it is shown that the application of the Wilson score method improves the calculation and presentation of confidence intervals for the number needed to treat. Control Clin Trials 2001;22:102–110

Introduction

The number needed to treat (NNT) has gained much attention in recent years as a useful way of reporting the results of randomized controlled trials with a binary outcome [1, 2, 3]. Defined as the reciprocal of the absolute risk reduction (ARR), the number needed to treat is the estimated average number of patients who need to be treated to prevent an adverse outcome in one additional patient. A negative NNT is the estimated average number of patients who need to be treated with the new rather than the standard treatment for one additional patient to be harmed. While clinicians and patients often understand this measure better than risk ratios or risk reductions, the NNT has undesirable mathematical and statistical properties, and the confidence interval for NNT is not straightforward to interpret. However, an excellent explanation was recently given by Altman [4]. The mathematical and statistical properties of the NNT statistic are described in more detail by Lesaffre and Pledger [5].

The key to understanding the confidence interval for NNT is that, in principle, the domain of NNT is the union of 1 to ∞ and −∞ to −1. The best value of NNT, indicating the largest possible beneficial treatment effect, is 1; the NNT value indicating no treatment effect (ARR = 0) is ±∞; and the worst NNT value, indicating the largest possible harmful effect, is −1. Thus, the result NNT = 10 with confidence limits 4 and −20 means that the two regions 4 to ∞ and −20 to −∞ together form the confidence interval. Altman proposed two new abbreviations, namely number needed to treat for one patient to benefit (NNTB) or be harmed (NNTH) [4]. This concept avoids the awkward term “number needed to harm” (NNH), which is used, for example, in the journal Evidence-Based Medicine. The result of an estimated NNT with confidence interval can then be presented as NNTB = 10 (NNTB 4 to ∞ to NNTH 20) [4].
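The inversion of ARR confidence limits into the NNTB/NNTH presentation can be sketched in a few lines of Python. This is only an illustration, not code from the paper: the function name `nnt_interval_text` and the one-decimal formatting are assumptions made here. The ARR limits −0.05 and 0.25 are simply the reciprocals of the NNT limits −20 and 4 from the example above.

```python
def nnt_interval_text(arr_lower, arr_upper):
    """Describe the NNT confidence interval implied by ARR confidence limits.

    Hypothetical helper: inverts and exchanges the ARR limits and labels
    each limit as NNTB (benefit) or NNTH (harm).
    """
    if arr_lower > arr_upper:
        raise ValueError("arr_lower must not exceed arr_upper")

    def limit(arr):
        # ARR = 0 corresponds to NNT = ±∞ (no treatment effect).
        if arr == 0:
            return "∞"
        kind = "NNTB" if arr > 0 else "NNTH"
        return f"{kind} {abs(1 / arr):.1f}"

    if arr_lower < 0 < arr_upper:
        # The ARR interval contains 0, so the NNT interval passes through ∞.
        return f"{limit(arr_upper)} to ∞ to {limit(arr_lower)}"
    if arr_lower >= 0:
        # Benefit only: the larger ARR limit gives the smaller NNTB.
        return f"{limit(arr_upper)} to {limit(arr_lower)}"
    # Harm only: the more negative ARR limit gives the smaller NNTH.
    return f"{limit(arr_lower)} to {limit(arr_upper)}"


print(nnt_interval_text(-0.05, 0.25))  # NNTB 4.0 to ∞ to NNTH 20.0
```

Returning a labelled string rather than a pair of numbers avoids the misleading impression that the interval is the set of values between −20 and 4.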

Altman recommended that a confidence interval should always be given when an NNT is reported as a study result [4]. However, the usual Wald method for calculating such confidence intervals is frequently inappropriate. By using examples from the literature and artificial examples, it is shown that the application of the Wilson score method [6] improves the calculation and presentation of confidence intervals for the number needed to treat.
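To make the contrast between the two methods concrete, the following Python sketch computes a Wald interval and a Wilson-score-based interval for the ARR. It assumes that the Wilson score method for a difference of proportions takes the hybrid form described by Newcombe (separate Wilson intervals for each group, combined by a square-and-add step). The function names and the event counts in the usage lines are illustrative choices made here, not taken from this paper.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_arr_ci(e1, n1, e2, n2, z=Z95):
    """Simple Wald interval for ARR = p1 - p2 (control risk minus treatment risk)."""
    p1, p2 = e1 / n1, e2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    arr = p1 - p2
    return arr - z * se, arr + z * se


def wilson_ci(e, n, z=Z95):
    """Wilson score interval for a single proportion e/n."""
    p = e / n
    centre = (2 * e + z * z) / (2 * (n + z * z))
    half = z * math.sqrt(z * z + 4 * e * (1 - p)) / (2 * (n + z * z))
    return centre - half, centre + half


def score_arr_ci(e1, n1, e2, n2, z=Z95):
    """ARR interval built from the two Wilson intervals (assumed hybrid form)."""
    p1, p2 = e1 / n1, e2 / n2
    l1, u1 = wilson_ci(e1, n1, z)
    l2, u2 = wilson_ci(e2, n2, z)
    arr = p1 - p2
    lower = arr - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = arr + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper


# Illustrative counts: 56/70 events in the control group, 48/80 under treatment.
print(wald_arr_ci(56, 70, 48, 80))
print(score_arr_ci(56, 70, 48, 80))
```

For these counts the Wald limits are roughly 0.058 and 0.343 and the score-based limits roughly 0.052 and 0.334. Taking reciprocals of either pair and exchanging them gives the corresponding NNT limits.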

Section snippets

Methods to calculate confidence intervals for NNT

Let π1 and π2 be the true probabilities (risks) of an adverse event in the control group (group 1) and the treatment group (group 2), respectively. The true ARR is the difference of the two risks π1 − π2. The true NNT is the reciprocal 1/(π1 − π2) of the true ARR. To estimate these measures a randomized clinical trial can be performed. Let n1 and n2 be the number of patients randomized in the control group and the treatment group, respectively, and let e1 and e2 be the number of patients having

Shortcomings of the simple Wald method

In principle, the shortcomings of the Wald confidence intervals carry over from ARR to NNT. However, for interpretation the NNT scale has to be taken into account. In the following, the confidence intervals for NNT based on Wilson scores are compared with the Wald confidence intervals by means of published and artificial examples. The published examples are estimated NNT values found in the journal Evidence-Based Medicine [18, 19, 20, 21]. Here, we concentrate on the comparison of the confidence

Using NNT for equivalence trials

The possible aberrations of the simple Wald method for calculating confidence intervals for ARR and NNT are especially relevant for equivalence trials [22]. To demonstrate equivalence in therapeutic clinical trials, the use of confidence intervals with a coverage probability of 95% or more is recommended [23]. Frequently, the objective of a study is to show that the new treatment is not inferior to the standard treatment. In such trials, one possibility to demonstrate equivalence between treatments

Discussion and conclusion

NNT has become a popular summary statistic to describe the absolute effect of a given treatment in comparison to a standard treatment or control. It was first introduced for use in randomized placebo-controlled clinical trials [24], then adopted as the primary outcome measure for systematic reviews such as meta-analyses [25], extended to the statistic “number needed to screen” to compare strategies for disease screening [26], and is now applied also in epidemiology to express the magnitude of

Acknowledgements

I thank Robert G. Newcombe for his valuable and helpful comments, which improved the paper considerably.

References (34)

  • E. Lesaffre et al.

    A note on the number needed to treat

    Control Clin Trials

    (1999)
  • R.J. Cook et al.

    The number needed to treat: A clinically useful measure of treatment effect

    BMJ

    (1995)
  • D.L. Sackett

    On some clinically useful measures of the effects of treatment

    Evidence-Based Med

    (1996)
  • G. Chatellier et al.

    The number needed to treat: A clinically useful nomogram in its proper context

    BMJ

    (1996)
  • D.G. Altman

    Confidence intervals for the number needed to treat

    BMJ

    (1998)
  • R.G. Newcombe

    Interval estimation for the difference between independent proportions: Comparison of eleven methods

    Stat Med

    (1998)
  • L.E. Daly

    Confidence limits made easy: Interval estimation using a substitution method

    Am J Epidemiol

    (1998)
  • O.S. Miettinen et al.

    Comparative analysis of two rates

    Stat Med

    (1985)
  • S.L. Beal

    Asymptotic confidence intervals for the difference between binomial parameters for the use with small samples

    Biometrics

    (1987)
  • S. Wallenstein

    A non-iterative accurate asymptotic confidence interval for the difference between two proportions

    Stat Med

    (1997)
  • I.E. Buchan

    Computer software that can calculate confidence intervals is now available (letter)

    BMJ

    (1995)
  • M.J. Gardner et al.

    Confidence intervals rather than P values: Estimating rather than hypothesis testing

    BMJ

    (1986)
  • C.R. Mehta et al.

    StatXact 4 for Windows. Statistical Software for Exact Nonparametric Inference

    (1999)
  • A. Agresti et al.

    Approximate is better than “exact” for interval estimation of binomial proportions

    Am Statistn

    (1998)
  • S.E. Vollset

    Confidence intervals for a binomial proportion

    Stat Med

    (1993)
  • R.G. Newcombe

    Two-sided confidence intervals for the single proportion: Comparison of seven methods

    Stat Med

    (1998)
  • SAS/IML User's Guide, Version 5 Edition

    (1985)