Elsevier

Water Research

Volume 44, Issue 16, September 2010, Pages 4726-4735

Improved strategies and optimization of calibration models for real-time PCR absolute quantification

https://doi.org/10.1016/j.watres.2010.07.066

Abstract

Real-time PCR absolute quantification applications are becoming more common in the recreational and drinking water quality industries. Many methods rely on the use of standard curves to make estimates of DNA target concentrations in unknown samples. Traditional absolute quantification approaches dictate that a standard curve must accompany each experimental run. However, the generation of a standard curve for each qPCR experiment set-up can be expensive and time-consuming, especially for studies with large numbers of unknown samples. As a result, many researchers have adopted a master calibration strategy where a single curve is derived from DNA standard measurements generated from multiple instrument runs. However, a master curve can inflate uncertainty associated with intercept and slope parameters and decrease the accuracy of unknown sample DNA target concentration estimates. Here we report two alternative strategies termed ‘pooled’ and ‘mixed’ for the generation of calibration equations from absolute standard curves which can help reduce the cost and time of laboratory testing, as well as the uncertainty in calibration model parameter estimates. In this study, four different strategies for generating calibration models were compared based on a series of repeated experiments for two different qPCR assays using a Markov chain Monte Carlo (MCMC) method. The hierarchical Bayesian approach allowed for the comparison of uncertainty in intercept and slope model parameters and the optimization of experiment design. The data suggest that the ‘pooled’ model can reduce uncertainty in both slope and intercept parameter estimates compared to the traditional single curve approach. In addition, the ‘mixed’ model achieved uncertainty estimates similar to the ‘single’ model while increasing the number of available reaction wells per instrument run.

Introduction

Real-time quantitative PCR (qPCR) represents a powerful tool for the detection and quantification of DNA. qPCR applications are becoming more widespread in clinical, forensic, food safety, and environmental applications due to high levels of sensitivity, specificity, and precision. These methods typically consist of several protocols linked in succession including processes such as sample collection, sample preparation, nucleic acid purification, target amplification, and data interpretation. Each of these steps can introduce uncertainty into the final concentration estimate from an unknown test sample. Recently, qPCR applications designed to estimate fecal bacteria concentrations in ambient water systems have gained widespread attention due to the rapid nature of these methodologies (same day results), reports linking the occurrence of qPCR enumerated fecal bacteria genetic markers to recreational bather health risk (Haugland et al., 2005, Wade et al., 2008, Wade et al., 2006), and the availability of assays that can discriminate between different animal sources of fecal pollution (Caldwell et al., 2007, Kildare et al., 2007, Kirs and Smith, 2007, Layton et al., 2006, McQuaig et al., 2006, Okabe et al., 2007, Reischer et al., 2006, Seurinck et al., 2005, Shanks et al., 2008, Shanks et al., 2009).
As a result, numerous studies have been conducted addressing issues such as loss of target during nucleic acid recovery (Haugland et al., 2005, Mumy and Findlay, 2004, Rajal et al., 2007, Stoeckel et al., 2009), sample matrix interference during amplification (Leach et al., 2008, Rajal et al., 2007, Shanks et al., 2010, Volkmann et al., 2007), density and distribution of target genetic markers in primary and secondary sources (Dick et al., 2005, Kildare et al., 2007, Rajal et al., 2007, Shanks et al., 2010, Silkie and Nelson, 2009), correction models for non-specific amplification (Leach et al., 2008, Rajal et al., 2007), and estimating decay of nucleic acids in various environmental matrices (Bae and Wuertz, 2009, Bell et al., 2009, Okabe and Shimazu, 2007, Walters et al., 2009). However, the selection of a mathematical model to transform qPCR raw data from a cycle threshold (CT) or crossing point (Cp) to an estimate of concentration from an unknown sample is often overlooked. Model selection will not only determine what sources of uncertainty are incorporated into unknown concentration estimates, but can also impact the amount of time and money spent to complete a study.

Currently available models are classically organized into two general strategies: relative and absolute approaches (Applied Biosystems, 2006). A relative quantification approach measures the change in DNA target concentration relative to another reference target. This approach is ideal for gene expression studies where the goal is to measure the regulation of a gene in response to a particular treatment. However, a relative approach can be limiting for many environmental applications where the DNA target of interest has no clear connection to a reference target, such as assays where the DNA target is from an uncharacterized microorganism. A relative approach can also be difficult to implement on an inter-laboratory scale for applications with a low-abundance DNA target from poorly described, complex sample matrices such as those found in sediments, ambient waters, and other environmental samples.

Absolute quantification is another widely used strategy and is achieved by developing a calibration model based on repeated CT or Cp measurements of a series of DNA target molecules with known concentrations. Once a calibration model is developed, a CT or Cp value obtained from an unknown sample can be used to estimate the initial concentration of the DNA target of interest. However, a calibration curve must be thoroughly validated because the accuracy of the quantification estimate is entirely dependent on the accuracy of the known DNA standards. Factors such as DNA standard source (e.g., recombinant plasmid, genomic DNA, or synthesized oligonucleotide), nucleotide base composition, exact concentration determination, dilution preparation, and stability during storage can all introduce uncertainty into a qPCR calibration model. Traditional absolute quantification approaches dictate that a standard curve must accompany each instrument run. However, the generation of a standard curve for each qPCR experiment set-up can be expensive and time-consuming, especially for studies with large numbers of unknown samples. For example, a typical standard curve requires 15–18 reaction wells to generate triplicate CT or Cp measurements for five or six different standard concentrations. This can be a substantial proportion of the total number of available reaction wells for qPCR instruments with 96 or fewer reaction wells (≥19%). The number of unknown samples that can be included in each run is even further reduced when other experiment controls are considered such as extraction blanks, positive controls, and no template controls.
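The calibration step described above can be sketched in a few lines. The example below is a minimal illustration, not the authors' procedure: it assumes the common log-linear form CT = intercept + slope × log10(concentration), fits it by ordinary least squares with NumPy, and inverts the fitted line for an unknown sample. The standard concentrations, the intercept of 40, the slope of -3.3, and the function names are all hypothetical.

```python
import numpy as np

def fit_standard_curve(log10_conc, ct):
    """Least-squares fit of CT = intercept + slope * log10(concentration)."""
    # np.polyfit returns coefficients from the highest power down: [slope, intercept]
    slope, intercept = np.polyfit(log10_conc, ct, deg=1)
    return intercept, slope

def estimate_log10_conc(ct_unknown, intercept, slope):
    """Invert the calibration line to recover log10 concentration from a CT value."""
    return (ct_unknown - intercept) / slope

# Hypothetical standards: five concentrations (log10 copies), triplicate wells each
log10_standards = np.repeat([1.0, 2.0, 3.0, 4.0, 5.0], 3)
# Hypothetical noise-free CT values generated from intercept 40 and slope -3.3
ct_standards = 40.0 - 3.3 * log10_standards

intercept, slope = fit_standard_curve(log10_standards, ct_standards)
log10_unknown = estimate_log10_conc(30.1, intercept, slope)
print(f"intercept={intercept:.2f}, slope={slope:.2f}, "
      f"unknown log10 concentration={log10_unknown:.2f}")
```

Real standards carry measurement noise, so the fitted intercept and slope have their own uncertainty, which is the quantity the calibration strategies in this paper are designed to manage.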

Several mathematical models have been proposed to estimate the DNA concentration from an unknown sample where an absolute calibration curve model is developed from a series of DNA standards with known concentrations (e.g., recombinant plasmid copy number) and their associated CT or Cp measurements (Ibekwe et al., 2002, Martin et al., 2006). The most common approach employs a ‘single’ set of DNA standard CT or Cp measurements for each instrument run. In this case, a series of known DNA concentration standards and several unknown DNA samples are analyzed in the same instrument run. Therefore, if the unknown samples are from the ith run, then the calibration curve data corresponding to the ith run are used to estimate concentrations in the unknown samples. Another popular strategy utilizes a ‘master’ calibration curve derived from DNA standard CT or Cp measurements generated from multiple instrument runs. These data are then used to generate a calibration curve for the estimate of unknown DNA target concentrations analyzed over the course of a study without the prerequisite that unknown test samples and corresponding calibration curve data originate from the same instrument run.
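The difference between the ‘single’ and ‘master’ strategies can be illustrated with a short sketch. It assumes a log-linear calibration fitted by ordinary least squares, which is a simplification of the Bayesian analysis used in the study. The three runs, their intercepts, and the function names are hypothetical.

```python
import numpy as np

def fit_line(log10_conc, ct):
    """Return (intercept, slope) of a least-squares line through the standards."""
    slope, intercept = np.polyfit(log10_conc, ct, deg=1)
    return intercept, slope

def single_curve(runs, i):
    """'Single' strategy: use only the standards measured in run i."""
    log10_conc, ct = runs[i]
    return fit_line(log10_conc, ct)

def master_curve(runs):
    """'Master' strategy: combine the standards from every run into one fit."""
    log10_conc = np.concatenate([r[0] for r in runs])
    ct = np.concatenate([r[1] for r in runs])
    return fit_line(log10_conc, ct)

# Hypothetical data: three runs sharing a slope of -3.3 but with shifted intercepts
standards = np.repeat([1.0, 2.0, 3.0, 4.0, 5.0], 2)
runs = [(standards, a - 3.3 * standards) for a in (39.5, 40.0, 40.5)]

ct_unknown = 30.0  # hypothetical unknown sample measured in run 0
for label, (a, b) in (("single (run 0)", single_curve(runs, 0)),
                      ("master", master_curve(runs))):
    print(f"{label}: intercept={a:.2f}, slope={b:.2f}, "
          f"log10 concentration={(ct_unknown - a) / b:.3f}")
```

In this toy data set the master fit averages the run-specific intercepts, so the same CT value maps to a different concentration estimate depending on which curve is used.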

Previous studies of repeated instrument runs of the same calibration curve report that there are often minor variations in the slope (<3%), but significant differences between intercept values (p < 0.05) (Pfaffl and Hageleit, 2001, Sivaganesan et al., 2008). The intra-assay variability in intercept values is most likely due to the accumulation of minor variations in day-to-day reagent mixing, reagent lot, solution pipetting, thermal cycling, fluorescence detection, as well as quantification, dilution preparation, and storage of calibration curve standards. Thus, a ‘single’ curve model ignores sources of run-to-run variability and can potentially lead to different estimates of unknown DNA target concentrations from one instrument run to another. In contrast, a ‘master’ curve model incorporates run-to-run variability, typically increasing the uncertainty in unknown sample concentration estimates compared to the ‘single’ curve approach. Alternative models for the generation of a calibration curve using CT or Cp measurements that do not require a complete set of calibration curve standards included on each instrument run or that incorporate standards data from all instrument runs over the course of a study may be able to increase confidence in unknown sample DNA concentration estimates and substantially reduce the cost and time of analysis.
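A short worked equation shows why intercept differences between runs matter even when the slope is stable. The derivation below assumes a generic log-linear calibration with a run-specific intercept and a shared slope. The symbols and the numerical values are illustrative assumptions, not the authors' notation or data.

```latex
% Assumed calibration for run i: \alpha_i is the run-specific intercept,
% \beta the shared slope, and x the DNA target concentration.
\[
C_T = \alpha_i + \beta \log_{10} x
\quad\Longrightarrow\quad
\log_{10} \hat{x}_i = \frac{C_T - \alpha_i}{\beta}
\]
% Reading the same C_T against two runs whose intercepts differ by
% \Delta\alpha = \alpha_2 - \alpha_1:
\[
\log_{10} \hat{x}_2 - \log_{10} \hat{x}_1 = -\frac{\Delta\alpha}{\beta}
\quad\Longrightarrow\quad
\frac{\hat{x}_2}{\hat{x}_1} = 10^{-\Delta\alpha/\beta}
\]
% Hypothetical values: \Delta\alpha = 0.5 cycles, \beta = -3.3 cycles per log10 unit.
\[
\frac{\hat{x}_2}{\hat{x}_1} = 10^{0.5/3.3} \approx 1.42
\]
```

Under these assumed values, a half-cycle shift in intercept changes the estimated concentration by roughly 40%, which is why the treatment of run-to-run intercept variability drives the choice of calibration model.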

Here we report two alternative models for the generation of absolute calibration curves from standards, designated ‘pooled’ and ‘mixed’. These methods are described in detail in the following sections. The alternative strategies were compared with the traditional ‘single’ and ‘master’ curve models using data generated from two qPCR assays targeting ribosomal RNA genes from Escherichia coli and Clostridia spp. Duplicate CT measurements were made at five known DNA concentrations to generate a standard calibration curve. This experiment was performed 13 times for each assay, resulting in data for 13 independently generated standard curves. A fourteenth run was then performed including six replicate CT measurements at the same five known DNA concentrations. A series of unknown samples was also tested in the last run to compare the estimation of DNA target concentrations using the ‘single’, ‘master’, ‘pooled’, and ‘mixed’ models. The incorporation of uncertainty introduced from replicate CT measurements and calibration standards preparation was also explored. A Bayesian statistical modeling technique integrating a Markov chain Monte Carlo (MCMC) simulation method was used to generate ‘single’, ‘pooled’, ‘master’, and ‘mixed’ calibration curves for both assays. The mean and the percentiles of the posterior distribution are used as point and interval estimates of unknown parameters such as intercepts, slopes, and DNA concentrations. The software WinBUGS (http://www.mrc-bsu.cam.ac.uk/bugs) was used to perform all simulations and to generate the posterior distributions of all the unknown parameters of interest (Lunn et al., 2000).
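The use of posterior means and percentiles as point and interval estimates can be sketched without WinBUGS. The example below is not the authors' hierarchical model: it assumes a single run, flat priors on intercept and slope, and a known measurement standard deviation, and it uses a plain random-walk Metropolis sampler written in NumPy. All data values, tuning settings, and function names are hypothetical.

```python
import numpy as np

def log_posterior(theta, xc, y, sigma):
    """Log posterior (up to a constant) with flat priors; xc is centered log10 conc."""
    a_c, b = theta
    resid = y - (a_c + b * xc)
    return -0.5 * np.sum(resid ** 2) / sigma ** 2

def sample_calibration(x, y, sigma, n_draws=20000, burn_in=2000,
                       step=0.1, start_slope=-3.0, seed=0):
    """Random-walk Metropolis draws of (intercept, slope) for CT = a + b * x."""
    rng = np.random.default_rng(seed)
    x_mean = x.mean()
    xc = x - x_mean  # centering decorrelates intercept and slope, which helps mixing
    theta = np.array([y.mean(), start_slope])
    current = log_posterior(theta, xc, y, sigma)
    draws = np.empty((n_draws, 2))
    for it in range(burn_in + n_draws):
        proposal = theta + rng.normal(0.0, step, size=2)
        candidate = log_posterior(proposal, xc, y, sigma)
        # Accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
        if it >= burn_in:
            draws[it - burn_in] = theta
    # Convert the centered intercept back to the intercept at log10 conc = 0
    intercept = draws[:, 0] - draws[:, 1] * x_mean
    return np.column_stack([intercept, draws[:, 1]])

def summarize(samples):
    """Posterior mean plus 2.5th and 97.5th percentiles along the first axis."""
    mean = samples.mean(axis=0)
    lower, upper = np.percentile(samples, [2.5, 97.5], axis=0)
    return mean, lower, upper

# Hypothetical run: duplicate CT values at five standard concentrations
data_rng = np.random.default_rng(1)
x = np.repeat(np.arange(1.0, 6.0), 2)
y = 40.0 - 3.3 * x + data_rng.normal(0.0, 0.3, size=x.size)

samples = sample_calibration(x, y, sigma=0.3)
mean, lower, upper = summarize(samples)
print("intercept:", mean[0], (lower[0], upper[0]))
print("slope:", mean[1], (lower[1], upper[1]))

# Propagate parameter uncertainty to an unknown sample with a hypothetical CT of 30
log10_conc_draws = (30.0 - samples[:, 0]) / samples[:, 1]
print("log10 concentration:", summarize(log10_conc_draws))
```

Because every posterior draw yields its own concentration estimate, the interval for the unknown sample reflects the uncertainty in the calibration parameters, which is the property the study uses to compare calibration strategies.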


Sample collection and preparation

Secondary predisinfected wastewater samples (unknowns) were collected from several publicly owned treatment works. Ten milliliters of each sample were filtered through a 0.4 μm pore size (47 mm in diameter) polycarbonate membrane filter (GE Osmonics, Minnetonka, MA). Filters were placed into a 2 ml screw cap tube containing 0.3 g of glass beads and extracted with the addition of 600 μl AE buffer (Qiagen, Valencia, CA) containing 0.2 μg ml−1 salmon DNA as described elsewhere (Haugland et al., 2005

Comparison of calibration curve models

‘Single’, ‘pooled’, ‘master’, and ‘mixed’ models were compared with data generated from 14 repeated experiments of Cperf and uidA qPCR assays. For the first 13 experiments, duplicate CT measurements were generated at five DNA standard concentrations. On the fourteenth run, six CT measurements for each DNA standard concentration along with six unknown samples were analyzed to compare the four calibration curve strategies. Because the unknown samples were only analyzed in the last run,

Selecting the best model for a qPCR study

Results of our study suggest that the number of instrument runs needed to analyze all unknown test samples in a particular study should dictate the calibration curve model used. Typically in small-scale qPCR studies (one instrument run), calibration curve DNA standards and unknown samples will be analyzed in the same run (Applied Biosystems, 2006, Rasmussen, 2001). In this case, ‘master’, ‘pooled’, or ‘mixed’ calibration curve approaches cannot be generated. However for medium-scale studies,

Conclusions

Selection of the optimal calibration model and the generation of associated calibration curve data for a particular study will depend on the number of instrument runs needed to complete the study and the level of precision required to address the respective project research goals. Although these trends were demonstrated with data from only two qPCR assays, the ‘pooled’ model should be applicable across a wide range of qPCR chemistries and genetic markers. The ‘mixed’ model should also be robust

Acknowledgements

The U.S. Environmental Protection Agency, through its Office of Research and Development, funded and managed, or partially funded and collaborated in, the research described herein. It has been subjected to the Agency’s peer and administrative review and has been approved for external publication. Any opinions expressed in this paper are those of the author(s) and do not necessarily reflect the official positions and policies of the U.S. EPA. Any mention of trade names or commercial products

References (36)

  • A. Bell et al. Factors influencing the persistence of fecal Bacteroides in stream water. Journal of Environmental Quality (2009)
  • J.M. Caldwell et al. Mitochondrial multiplex real-time PCR as a source tracking method in fecal-contaminated effluents. Environmental Science and Technology (2007)
  • E.C. Chern et al. Comparison of fecal indicator bacteria densities in marine recreational waters by qPCR. Water Quality, Exposure and Health (2009)
  • L.K. Dick et al. Microplate subtractive hybridization to enrich for Bacteroidales genetic markers for fecal source identification. Applied and Environmental Microbiology (2005)
  • A.W. Ibekwe et al. Multiplex fluorogenic real-time PCR for detection and quantification of Escherichia coli O157:H7 in dairy wastewater wetlands. Applied and Environmental Microbiology (2002)
  • M. Kirs et al. Multiplex quantitative real-time reverse transcriptase PCR for F+-specific RNA coliphages: a method for use in microbial source tracking. Applied and Environmental Microbiology (2007)
  • A. Layton et al. Development of Bacteroides 16S rRNA gene TaqMan-based real-time PCR assays for estimation of total, human, and bovine fecal pollution in water. Applied and Environmental Microbiology (2006)
  • M.D. Leach et al. A discrete, stochastic model and correction method for bacterial source tracking. Environmental Science and Technology (2008)