Elsevier

Measurement

Volume 38, Issue 1, July 2005, Pages 61-66
A united interpretation of different uncertainty intervals

https://doi.org/10.1016/j.measurement.2005.01.001

Abstract

The frequentist and Bayesian philosophies of statistical inference require different approaches to the calculation of an interval of uncertainty for a measurand. A frequentist (or classical) interval will have an associated confidence level, p, that is the probability of generating an interval enclosing the value of the measurand. A Bayesian interval will have an associated credible level, p, that is a ‘degree-of-belief’ that the value of the measurand subsequently lies within the interval. Since potential users are not primarily concerned with the method of analysis, a shared interpretation of the information given to them seems desirable. We obtain such an interpretation by recognising that in either case p is the proportion of independent intervals calculated over time that contain their respective measurands. This interpretation is also useful in explaining an interval calculated according to the procedure of the Guide to the Expression of Uncertainty in Measurement.

Introduction

Different understandings of probability can lead to different statistical analyses of measurement data, and hence to different interpretations of the measurement uncertainty. This paper is primarily concerned with two such understandings, as represented by the frequentist (or classical) and Bayesian frameworks of statistical inference.

In the frequentist framework, the quantities of interest are regarded as fixed statistical parameters to be estimated using data, and not as quantities possessing probability density functions (pdfs). In uncertainty analysis, this approach asks the question “what range of (fixed) values for the measurand could have given rise to the data?” The result is a 100p% confidence interval for the measurand, typically with p = 0.95. Several recent papers in the metrological literature have adopted this approach, e.g., [1], [2].

In Bayesian statistics, pdfs are attached to all quantities of interest. These pdfs are first constructed from informed opinions, subsequently updated using the sampling model and data, and finally combined appropriately to obtain the distribution of the measurand. An interval containing 100p% of the mass of this distribution is often called a 100p% credible interval for the measurand [3], [4]. There are also several recent papers written within this framework, e.g., [5], [6].

The frequentist approach deals adequately with so-called ‘random’ errors. However, Bayesian methodology seems preferable for the incorporation of ‘systematic’ errors. In practice, a measurement involves both types of error. The procedure for uncertainty analysis advocated in the Guide to the Expression of Uncertainty in Measurement [7], which results in an interval of ‘expanded uncertainty’, seems neither strictly frequentist nor strictly Bayesian [4]. Partly because of the utility of this procedure, a shared interpretation is required for confidence intervals and credible intervals, both of which might be called intervals of uncertainty. This paper presents such an interpretation designed for clear communication to those who will make use of the interval.

The suggestion that there is no obvious existing shared interpretation may appear surprising. However, the paragraphs above indicate that the level of certainty p relates to different concepts in the two cases given. So what should be a client’s understanding of a figure, say p = 0.95, associated with the uncertainty interval on a calibration certificate? And if that figure needs no communicable meaning then why not prefer p = 0.90, because this would give intervals that were narrower and hence more informative to the client about the actual value of the measurand? So, to what meaning of p is the analysis accountable?

We do not discuss the superiority of either the frequentist or Bayesian framework over the other. Surely, both have advantages and disadvantages; else there would be a unique way of evaluating the uncertainty of measurement. Rather, our purpose is to point out a single interpretation of the corresponding intervals to assist the users of measurement results and to encourage their confidence in measurement processes. We suggest that both a 100p% confidence interval and a 100p% credible interval may be interpreted as an interval the like of which contains the value of the relevant measurand on 100p% of independent occasions. (Alternative wordings are given later.) In our experience this simple interpretation is not often made explicit.
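The long-run reading of a credible interval can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes a normal prior for the measurand, normally distributed readings with a known standard deviation, and, importantly, that the prior accurately describes how measurand values arise from occasion to occasion. The function names `credible_interval` and `credible_interval_coverage` and all numerical values are hypothetical choices made for illustration.

```python
import random
import statistics


def credible_interval(readings, prior_mean, prior_sd, sigma, p=0.95):
    """Central 100p% credible interval under a normal prior and a normal
    sampling model with known standard deviation sigma (assumed model)."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = len(readings) / sigma ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (
        prior_precision * prior_mean
        + data_precision * statistics.fmean(readings)
    )
    z = statistics.NormalDist().inv_cdf(0.5 + p / 2.0)
    half_width = z * post_var ** 0.5
    return post_mean - half_width, post_mean + half_width


def credible_interval_coverage(n_occasions=20000, seed=2):
    """Proportion of independent occasions on which the credible interval
    contains the value of its own measurand."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_occasions):
        # Each occasion has its own prior and its own measurand.
        prior_mean = rng.uniform(-50.0, 50.0)
        prior_sd = rng.uniform(0.5, 2.0)
        measurand = rng.gauss(prior_mean, prior_sd)
        readings = [rng.gauss(measurand, 1.0) for _ in range(3)]
        low, high = credible_interval(readings, prior_mean, prior_sd, 1.0)
        hits += low <= measurand <= high
    return hits / n_occasions


if __name__ == "__main__":
    print(credible_interval_coverage())
```

Under these assumptions the returned proportion should settle close to 0.95 even though the measurand, the prior, and the interval differ on every occasion. If the priors were poorly calibrated, the proportion would generally differ from p.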

We begin by considering some different definitions of probability, including a seemingly natural way of quantifying the concept of degree-of-belief.

Interpretation of probability

The concept of probability has been the subject of much debate; see e.g., [8], [9], [10] for interesting discussions and [11], [12], [13] for works of well-known authors. Let us consider three possible loose definitions (cf. [8], [14], [15], [16]):

  • A. The probability of an event E is the limiting proportion of times that E occurs from a long series of independent identical opportunities.

  • B. The probability of an event E is the reciprocal of the number of equally-likely events that are mutually

Interpretation of interval statements

We are now in a position to examine how uncertainty intervals may be interpreted using option A above.

First, consider a classical analysis that results in a confidence interval [a, b] for the value of the measurand with a confidence level of 95%. We may say that “there is 95% confidence that [a, b] encloses the value of the measurand”.
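The relative-frequency reading of such a statement can be illustrated by simulation. The sketch below is an assumption-laden example rather than the analysis of the paper: it supposes normally distributed readings with a known standard deviation, uses the usual z-interval for the mean, and gives each occasion a different fixed measurand. The function name `confidence_interval_coverage` and its parameter values are hypothetical.

```python
import random
import statistics


def confidence_interval_coverage(n_occasions=20000, n_readings=5,
                                 sigma=1.0, seed=1):
    """Proportion of independent occasions on which a 95% confidence
    interval [a, b] encloses the value of its own measurand."""
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.975)
    half_width = z * sigma / n_readings ** 0.5
    hits = 0
    for _ in range(n_occasions):
        # A different fixed (non-random in the analysis) value each occasion.
        measurand = rng.uniform(-100.0, 100.0)
        readings = [rng.gauss(measurand, sigma) for _ in range(n_readings)]
        mean = statistics.fmean(readings)
        a, b = mean - half_width, mean + half_width
        hits += a <= measurand <= b
    return hits / n_occasions


if __name__ == "__main__":
    print(confidence_interval_coverage())
```

The occasions here need not be repetitions of the same measurement. Each involves its own measurand, and the proportion of successful intervals should still approach 0.95 under the stated assumptions.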

On independence

We have indicated that a p-level uncertainty interval is one the like of which will be successful, on average, on 100p out of every 100 independent occasions. Despite the usage of terms like ‘independent occasions’, the requirement of independence actually relates to the errors incurred in the measurement processes. If these unknowable errors are realizations of variables that are statistically independent from occasion to occasion then the long-run proportion of successful intervals will be p.
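The role of independence can be made concrete with a small simulation. This is a minimal sketch under assumed conditions, not a model taken from the paper: each result is the measurand plus a 'systematic' error and a 'random' error, both normal with hypothetical standard uncertainties `u_sys` and `u_rand`, and the interval is the result plus or minus 1.96 times the combined standard uncertainty. Passing `shared_error` forces the same systematic error on every occasion, which breaks independence between occasions.

```python
import random
import statistics


def long_run_success(shared_error=None, n_occasions=20000,
                     u_sys=1.0, u_rand=0.2, seed=3):
    """Long-run proportion of successful 95% intervals.

    shared_error=None: the systematic error is drawn independently on
    every occasion. Otherwise the given value is reused on every
    occasion, so the errors are dependent across occasions.
    """
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.975)
    half_width = z * (u_sys ** 2 + u_rand ** 2) ** 0.5
    hits = 0
    for _ in range(n_occasions):
        measurand = rng.uniform(0.0, 10.0)
        if shared_error is None:
            sys_err = rng.gauss(0.0, u_sys)
        else:
            sys_err = shared_error
        result = measurand + sys_err + rng.gauss(0.0, u_rand)
        hits += abs(result - measurand) <= half_width
    return hits / n_occasions


if __name__ == "__main__":
    print(long_run_success())                  # independent errors
    print(long_run_success(shared_error=2.0))  # one error shared by all
```

With independent errors the proportion should be close to 0.95. With a single shared systematic error the proportion depends on the value that error happens to take, and can be far from p in either direction.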

Conclusion

The result of an uncertainty analysis for a measurand is often in the form of an interval estimate, obtained using either the classical or Bayesian statistical framework, or using the methodology advocated in the Guide to the Expression of Uncertainty in Measurement [7], which mixes elements of both frameworks. Though the concepts of probability in the classical and Bayesian frameworks differ, the corresponding intervals can be given the same interpretation using the concept of relative

Acknowledgements

This work was supported by a grant from the RSNZ Bilateral Research Activities Programme under contract 02-BRAP-54-WILL. The authors are also grateful to L. Christian of Measurement Standards Laboratory (MSL) of New Zealand for helpful suggestions in the course of this work.

References (17)

  • B.D. Hall et al., Does “Welch-Satterthwaite” make a good uncertainty estimate?, Metrologia (2001)
  • R. Willink et al., A classical method for uncertainty analysis with multidimensional data, Metrologia (2002)
  • W. Edwards et al., Bayesian statistical inference for psychological research
  • L.J. Gleser, Assessing uncertainty in measurement, Statistical Science (1998)
  • I. Lira et al., Bayesian inference from measurement information, Metrologia (1999)
  • I. Lira et al., Bayesian evaluation of standard uncertainty and coverage probability in a simple measurement model, Meas. Sci. Technol. (2001)
  • Guide to the Expression of Uncertainty in Measurement, Geneva, International Organization for Standardization,...
  • B. de Finetti, Probability: interpretations

There are more references available in the full text version of this article.
