PERT: a method for expression deconvolution of human blood samples from varied microenvironmental and developmental conditions

PLoS Comput Biol. 2012;8(12):e1002838. doi: 10.1371/journal.pcbi.1002838. Epub 2012 Dec 20.

Abstract

The cellular composition of heterogeneous samples can be predicted using an expression deconvolution algorithm to decompose their gene expression profiles based on pre-defined reference gene expression profiles of the constituent populations in these samples. However, the expression profiles of the actual constituent populations are often perturbed relative to the reference profiles due to gene expression changes in cells associated with microenvironmental or developmental effects. Existing deconvolution algorithms do not account for these changes and give incorrect results when benchmarked against cell frequencies measured by well-established flow cytometry, even after batch correction. We introduce PERT, a new probabilistic expression deconvolution method that detects and accounts for a shared, multiplicative perturbation in the reference profiles when performing expression deconvolution. We applied PERT and three other state-of-the-art expression deconvolution methods to predict cell frequencies within heterogeneous human blood samples that were collected under several conditions (uncultured mono-nucleated and lineage-depleted cells, and culture-derived lineage-depleted cells). Only PERT's predicted proportions of the constituent populations matched those assigned by flow cytometry. Genes associated with cell cycle processes were highly enriched among those with the largest predicted expression changes between the cultured and uncultured conditions. We anticipate that PERT will be widely applicable to expression deconvolution strategies that use profiles from reference populations that differ from the corresponding constituent populations in cellular state but not in cellular phenotypic identity.
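To make the problem described above concrete, the following is a minimal sketch of why a shared, multiplicative perturbation biases reference-based deconvolution. It is not the PERT algorithm, which is a probabilistic model that estimates the perturbation from the data. Here the mixture is modelled as a non-negative linear combination of reference profiles and solved by plain non-negative least squares with multiplicative updates, and the per-gene perturbation `rho` is assumed to be known in advance. The reference matrix, the proportions, the perturbation values and the function name `deconvolve` are all hypothetical and chosen only for illustration.

```python
def deconvolve(mixture, reference, n_iter=2000):
    """Estimate population proportions from a mixed expression profile.

    Illustrative baseline only (not PERT): non-negative least squares
    solved with multiplicative updates, then normalised to sum to one.
    `reference[g][k]` is the expression of gene g in population k.
    """
    n_genes, n_pops = len(reference), len(reference[0])
    # Precompute R^T x and R^T R.
    rtx = [sum(reference[g][k] * mixture[g] for g in range(n_genes))
           for k in range(n_pops)]
    rtr = [[sum(reference[g][j] * reference[g][k] for g in range(n_genes))
            for k in range(n_pops)] for j in range(n_pops)]
    p = [1.0 / n_pops] * n_pops  # start from uniform proportions
    for _ in range(n_iter):
        denom = [sum(rtr[k][j] * p[j] for j in range(n_pops))
                 for k in range(n_pops)]
        # Multiplicative update keeps every entry non-negative.
        p = [p[k] * rtx[k] / denom[k] for k in range(n_pops)]
    total = sum(p)
    return [v / total for v in p]


# Hypothetical reference profiles: 6 genes (rows) x 3 populations (columns).
reference = [
    [10.0, 1.0, 1.0],
    [9.0, 2.0, 1.0],
    [1.0, 10.0, 2.0],
    [2.0, 8.0, 1.0],
    [1.0, 1.0, 10.0],
    [1.0, 2.0, 9.0],
]
true_props = [0.5, 0.3, 0.2]

# Hypothetical shared perturbation: genes 2 and 3 are 3x higher in every
# population of the sample than in the reference (e.g. a culture effect).
rho = [1.0, 1.0, 3.0, 3.0, 1.0, 1.0]

# Observed mixture: each population's profile is scaled gene-wise by rho.
mixture = [rho[g] * sum(reference[g][k] * true_props[k] for k in range(3))
           for g in range(6)]

# Ignoring the perturbation overestimates the population marked by genes 2, 3.
naive = deconvolve(mixture, reference)

# Oracle correction (assumes rho is known): undo the gene-wise scaling first.
corrected = deconvolve([mixture[g] / rho[g] for g in range(6)], reference)

print("naive:    ", [round(v, 3) for v in naive])
print("corrected:", [round(v, 3) for v in corrected])
```

In this toy setting the naive estimate is pulled strongly toward the second population, while the oracle-corrected estimate recovers the simulated proportions. In practice the perturbation is unknown, which is the gap a method that infers it jointly with the proportions is meant to fill.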

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Fetal Blood / metabolism*
  • Flow Cytometry
  • Gene Expression Profiling / methods*
  • Humans
  • Least-Squares Analysis
  • Likelihood Functions
  • Oligonucleotide Array Sequence Analysis

Grants and funding

This work was funded by Natural Science and Engineering Research Council operating grants to QM and PWZ, a Canadian Stem Cell Network grant to PWZ, an Early Researcher Award to QM, and an Ontario Graduate Scholarship to WQ. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.