Boosted trees for ecological modeling and prediction

Ecology. 2007 Jan;88(1):243-51. doi: 10.1890/0012-9658(2007)88[243:btfema]2.0.co;2.

Abstract

Accurate prediction and explanation are fundamental objectives of statistical analysis, yet they seldom coincide. Boosted trees are a statistical learning method that attains both of these objectives for regression and classification analyses. They can deal with many types of response variables (numeric, categorical, and censored), loss functions (Gaussian, binomial, Poisson, and robust), and predictors (numeric and categorical). Interactions between predictors can also be quantified and visualized. The theory underpinning boosted trees is presented, together with interpretive techniques. A new form of boosted trees, namely, "aggregated boosted trees" (ABT), is proposed and, in a simulation study, is shown to reduce prediction error relative to boosted trees. A regression data set is analyzed using ABT to illustrate the technique and to compare it with other methods, including boosted trees, bagged trees, random forests, and generalized additive models. A software package for ABT analysis using the R software environment is included in the Appendices together with worked examples.
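
As a rough illustration of the idea of boosting trees and then aggregating them, the following Python sketch fits boosted one-split trees (stumps) under squared-error (Gaussian) loss and averages several such models. It is not the R package described in the Appendices. The abstract does not specify how ABT aggregates its trees, so averaging models fitted to cross-validation training subsets is an assumption here, and all function names, parameter values, and the simulated data are hypothetical.

```python
import math
import random


def fit_stump(x, r):
    """Least-squares one-split tree on a single numeric predictor.

    Returns (threshold, left_value, right_value).
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    total = sum(r)
    left_sum = 0.0
    best = None
    for k in range(1, n):
        left_sum += r[order[k - 1]]
        if x[order[k]] == x[order[k - 1]]:
            continue  # cannot split between identical predictor values
        right_sum = total - left_sum
        # Maximizing this quantity is equivalent to minimizing squared error.
        gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
        if best is None or gain > best[0]:
            thr = (x[order[k - 1]] + x[order[k]]) / 2
            best = (gain, thr, left_sum / k, right_sum / (n - k))
    if best is None:  # all predictor values identical: predict the mean
        mean = total / n
        return (float("inf"), mean, mean)
    return best[1:]


def boost(x, y, n_trees=100, rate=0.1):
    """Boosted stumps: each tree is fitted to the current residuals."""
    f0 = sum(y) / len(y)
    pred = [f0] * len(y)
    stumps = []
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        thr, lv, rv = fit_stump(x, resid)
        stumps.append((thr, lv, rv))
        # Shrink each tree's contribution by the learning rate.
        pred = [p + rate * (lv if xi <= thr else rv) for xi, p in zip(x, pred)]
    return f0, rate, stumps


def predict(model, xi):
    f0, rate, stumps = model
    return f0 + rate * sum(lv if xi <= thr else rv for thr, lv, rv in stumps)


def aggregated_boost(x, y, n_folds=5, seed=0, **kwargs):
    """Assumed aggregation scheme: one boosted model per CV training subset."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    fitted = []
    for held_out in folds:
        out = set(held_out)
        train = [i for i in idx if i not in out]
        fitted.append(boost([x[i] for i in train], [y[i] for i in train], **kwargs))
    return fitted


def predict_aggregated(fitted, xi):
    """Average the predictions of the individual boosted models."""
    return sum(predict(m, xi) for m in fitted) / len(fitted)


def mse(y, yhat):
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)


# Simulated regression data (hypothetical, for illustration only).
rng = random.Random(1)
x = [rng.uniform(0, 6) for _ in range(80)]
y = [math.sin(xi) + rng.gauss(0, 0.2) for xi in x]

single = boost(x, y)
models = aggregated_boost(x, y)

mse_mean = mse(y, [sum(y) / len(y)] * len(y))
mse_single = mse(y, [predict(single, xi) for xi in x])
mse_abt = mse(y, [predict_aggregated(models, xi) for xi in x])
print(mse_mean, mse_single, mse_abt)
```

The printed values are training errors only; comparing prediction error between the two approaches, as the simulation study does, would require held-out data. In practice the number of trees and the learning rate would be tuned rather than fixed as above.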

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Ecology* / methods
  • Models, Statistical
  • Statistics as Topic* / methods