Stepwise model fitting and statistical inference: turning noise into signal pollution

Am Nat. 2009 Jan;173(1):119-23. doi: 10.1086/593303.

Abstract

Statistical inference based on stepwise model selection is applied regularly in ecological, evolutionary, and behavioral research. In addition to fundamental shortcomings with regard to finding the "best" model, stepwise procedures are known to suffer from a multiple-testing problem, yet the method is still widely used. As an illustration of this problem, we present results of a simulation study of artificial data sets of uncorrelated variables, with two to ten predictor variables and one dependent variable. We then compared results from stepwise regression with those from a regression model in which all predictor variables were entered simultaneously. These analyses clearly demonstrate that significance tests based on stepwise procedures lead to greatly inflated Type I error rates (i.e., the probability of erroneously rejecting a true null hypothesis). By using a simple simulation design, our study amplifies previous warnings about using stepwise procedures, and we follow others in recommending that biologists refrain from applying these methods.
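
The multiple-testing problem described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code and does not reproduce their design. It is a simplified stand-in that assumes the following: only the first step of a forward selection is mimicked (the predictor with the smallest p-value is picked and then tested as if it had been chosen in advance), p-values come from the Fisher z approximation for a correlation coefficient rather than from regression F-tests, and the sample size, number of simulations, and function names (`pearson_r`, `fisher_p_value`, `simulate`) are hypothetical choices made for illustration. All variables are independent standard normal noise, so every rejection is a Type I error.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p_value(r, n):
    """Two-sided p-value for H0: rho = 0, using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate(n_predictors, n_obs=30, n_sims=1000, alpha=0.05, seed=1):
    """Estimate Type I error rates on pure-noise data.

    Returns (prespecified_rate, selected_rate):
      prespecified_rate: how often one predictor fixed in advance is significant
      selected_rate:     how often the best of all predictors is significant,
                         i.e. the test that a selection step would report
    """
    rng = random.Random(seed)
    prespecified_hits = 0
    selected_hits = 0
    for _ in range(n_sims):
        # Dependent variable: noise, unrelated to any predictor.
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        p_values = []
        for _ in range(n_predictors):
            # Each predictor is independent noise as well.
            x = [rng.gauss(0, 1) for _ in range(n_obs)]
            p_values.append(fisher_p_value(pearson_r(x, y), n_obs))
        if p_values[0] < alpha:    # predictor chosen before seeing the data
            prespecified_hits += 1
        if min(p_values) < alpha:  # predictor chosen because it looked best
            selected_hits += 1
    return prespecified_hits / n_sims, selected_hits / n_sims


if __name__ == "__main__":
    for k in (2, 5, 10):
        pre, sel = simulate(k)
        print(f"{k:2d} predictors: prespecified={pre:.3f}  selected={sel:.3f}")
```

If the k tests were exactly independent, the chance that at least one falls below alpha would be 1 - (1 - alpha)^k, which is about 0.40 for k = 10 and alpha = 0.05. The simulated rate for the selected predictor should land in that neighborhood, while the rate for the prespecified predictor should stay near the nominal 0.05.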

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biology / methods*
  • Computer Simulation
  • Linear Models*
  • Models, Biological*
  • Statistics as Topic / methods