Assessing calibration of prognostic risk scores

Stat Methods Med Res. 2016 Aug;25(4):1692-706. doi: 10.1177/0962280213497434. Epub 2013 Jul 30.

Abstract

Current methods used to assess calibration are limited, particularly in the assessment of prognostic models. Methods for testing and visualizing calibration (e.g. the Hosmer-Lemeshow test and calibration slope) have been well thought out in the binary regression setting. However, extension of these methods to Cox models is less well known and could be improved. We describe a model-based framework for the assessment of calibration in the binary setting that provides natural extensions to the survival data setting. We show that Poisson regression models can be used to easily assess calibration in prognostic models. In addition, we show that a calibration test suggested for use in survival data has poor performance. Finally, we apply these methods to the problem of external validation of a risk score developed for the general population when assessed in a special patient population (i.e. patients with particular comorbidities, such as rheumatoid arthritis).
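As a rough illustration of the Poisson approach mentioned above, the simplest case is an intercept-only Poisson model with offset log(expected events). Its fitted intercept is the log of the standardized incidence ratio (observed events divided by expected events). The sketch below is not the authors' code. It assumes each subject's expected count comes from the risk score being validated (for a Cox model, the predicted cumulative hazard over that subject's follow-up), and the function name and data are hypothetical.

```python
import math

def calibration_in_the_large(events, expected):
    """Closed-form fit of an intercept-only Poisson model with offset log(expected).

    events   : 0/1 event indicator per subject (observed)
    expected : model-based expected event count per subject (assumed to be
               supplied by the risk score under validation)
    Returns (sir, lower, upper), where sir = observed / expected and the
    interval is a Wald 95% interval computed on the log scale.
    """
    observed = sum(events)
    total_expected = sum(expected)
    if observed <= 0 or total_expected <= 0:
        raise ValueError("need at least one event and a positive expected count")

    # For this model the MLE of the intercept is log(O / E).
    log_sir = math.log(observed / total_expected)
    # Standard error of the intercept is 1 / sqrt(O).
    se = 1.0 / math.sqrt(observed)
    z = 1.959963984540054  # 97.5th percentile of the standard normal
    return (math.exp(log_sir),
            math.exp(log_sir - z * se),
            math.exp(log_sir + z * se))

# Hypothetical data, for illustration only.
events = [0, 1, 0, 0, 1, 0, 1, 0]
expected = [0.10, 0.40, 0.05, 0.20, 0.30, 0.15, 0.25, 0.05]

sir, lower, upper = calibration_in_the_large(events, expected)
print(f"SIR = {sir:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A ratio near 1 would suggest the score predicts about the right number of events overall, while values above or below 1 would suggest under- or over-prediction. Adding the linear predictor as a covariate in the same Poisson model is one way to examine a calibration slope, although that requires an iterative fit rather than this closed form.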

Keywords: Cox model; Poisson; calibration; prognostic risk scores; standardized incidence ratio; survival.

MeSH terms

  • Adult
  • Arthritis, Rheumatoid
  • Calibration*
  • Humans
  • Prognosis*
  • Proportional Hazards Models*
  • Reproducibility of Results
  • Risk