medRxiv

An Electronic Health Record Compatible Model to Predict Personalized Treatment Effects from the Diabetes Prevention Program: A Cross-Evidence Synthesis Approach Using Clinical Trial and Real World Data

David M Kent, Jason Nelson, Anastassios Pittas, Francis Colangelo, Carolyn Koenig, David van Klaveren, Elizabeth Ciemins, John Cuddeback
doi: https://doi.org/10.1101/2021.01.06.21249334
David M Kent
1Predictive Analytics and Comparative Effectiveness Center, Tufts Medical Center, Boston, MA
  • For correspondence: dkent1@tuftsmedicalcenter.org
Jason Nelson
1Predictive Analytics and Comparative Effectiveness Center, Tufts Medical Center, Boston, MA
Anastassios Pittas
2Division of Endocrinology, Tufts Medical Center, Boston, MA
Francis Colangelo
3Allegheny Health Network, Pittsburgh, PA
Carolyn Koenig
4Mercy Health, St. Louis, MO
David van Klaveren
1Predictive Analytics and Comparative Effectiveness Center, Tufts Medical Center, Boston, MA
5Department of Public Health, Erasmus MC University Medical Center, Rotterdam, the Netherlands
Elizabeth Ciemins
6AMGA (American Medical Group Association), Alexandria, VA
John Cuddeback
6AMGA (American Medical Group Association), Alexandria, VA

Abstract

Background Intensive lifestyle modification and metformin pharmacotherapy each reduced the risk of developing diabetes in patients at high risk, but neither is widely used among the 88 million American adults with prediabetes.

Objective Develop an electronic health record (EHR)-based risk tool that provides point-of-care estimates of diabetes risk to support targeting interventions to patients most likely to benefit.

Design Cross-design synthesis: risk prediction model developed and validated in large observational database, treatment effect estimates from risk-based reanalysis of clinical trial data.

Setting Outpatient clinics in US.

Patients Risk model development cohort: 1.1 million patients with prediabetes from the OptumLabs Data Warehouse (OLDW); validation cohort: distinct sample of 1.1 million patients in OLDW. Randomized clinical trial cohort: 3081 people from the Diabetes Prevention Program (DPP) study.

Interventions Randomization in the DPP: 1) an intensive program of lifestyle modification; 2) standard lifestyle recommendations plus 850 mg metformin twice daily; or 3) standard lifestyle recommendations plus placebo twice daily.

Results Eleven variables reliably obtainable from the EHR were used to predict diabetes risk. This model validated well in the OLDW (c-statistic = 0.76; observed 3-year diabetes rate was 1.8% in the lowest-risk quarter and 19.6% in the highest-risk quarter). In the DPP, the hazard ratio (HR) for lifestyle modification was constant across all levels of risk (HR = 0.43; 95% CI: 0.35 – 0.53), while the HR for metformin was highly risk-dependent (HR = 1.1 [95% CI: 0.61 – 2.0] in the lowest-risk quarter vs. HR = 0.45 [95% CI: 0.35 – 0.59] in the highest-risk quarter). Fifty-three percent of the benefits of population-wide dissemination of the DPP lifestyle modification, and 76% of the benefits of population-wide metformin therapy, can be obtained by targeting the highest-risk quarter of patients.

Limitations Differences in variable definitions and in missingness across observational and trial settings may introduce estimation error in risk-based treatment effects.

Conclusion An EHR-compatible risk model might support targeted diabetes prevention to more efficiently realize the benefits of the DPP interventions.

Introduction

The Diabetes Prevention Program (DPP) Study showed that either an intensive program of lifestyle modification or pharmacotherapy with metformin substantially reduced the risk of developing type 2 diabetes in patients at high risk, compared to “usual care.”1 The findings have broad implications, as “prediabetes” affects approximately 88 million adults in the US.2

Strenuous calls to address the epidemic of diabetes with prevention3,4 have been counter-balanced by concerns about the over-medicalization of prediabetes.5 Almost two decades after the publication of the DPP Study, it remains unclear how best to implement these interventions in such an overwhelmingly large, and mostly undiagnosed, population. A 2015 study examining a national sample of over 17,000 working-age adults with prediabetes found that only 3.7% were receiving metformin.6 Similarly, widespread use of the intensive lifestyle intervention remains largely unrealized despite evidence that rigorous diet and physical activity promotion reduces diabetes risk in the community setting.7

Yet, prediabetes is itself a heterogeneous condition. We previously showed that even among patients enrolled in the DPP Study itself, the risk of developing diabetes within 3 years varies widely and is highly skewed.8 Some trial participants were estimated to have a 1–2% risk, others 90%. Unsurprisingly, the degree of benefit from metformin or from the lifestyle intervention was also distributed unevenly.

This prior proof-of-concept work had several limitations. Notably, the risk distribution within the DPP trial participants may differ from that of patients seen in routine practice, particularly since the American Diabetes Association (ADA) has subsequently broadened its definition of prediabetes to include a still more heterogeneous population.9 Further, the application of prediction methods to data routinely collected in the electronic health record (EHR) provides a promising means to overcome some of the major barriers to the use of risk models.10,11 For example, in addition to requiring manual ascertainment of variables, the previously reported DPP-based model required waist circumference and waist-to-hip ratio measurements that are difficult to ascertain in routine practice. Herein, we describe development of a clinical prediction model using a hybrid approach that makes use of routinely collected EHR data to predict the risk of diabetes onset and clinical trial data to estimate unbiased risk-based effects of preventive interventions.

Methods

Overview

We sought to develop and validate a diabetes risk prediction model using data elements readily available in the EHR for dissemination across healthcare systems as an EHR-embedded tool, to facilitate ease of use. The tool provides clinicians and their patients with an individualized risk of developing diabetes and the estimated benefit of applying a DPP treatment strategy—either an intensive lifestyle program or pharmacotherapy with metformin (the combination of both was not tested in the DPP Study).

Data sources and Participants

The model was developed and validated using EHR data from the OptumLabs Data Warehouse (OLDW). The OptumLabs EHR database is a geographically diverse sample of the US population with longitudinal clinical data on over 33 million lives with at least one clinic visit during the study period. Using a retrospective observational cohort design, we geographically stratified the database by US Census Region into a development cohort of 1.1 million patients (Northeast, South, and West) and a separate validation cohort of 1.1 million patients (Midwest).

Eligibility criteria included age between 25 and 75 upon an “index” office or clinic encounter (“index visit” defined by CPT/HCPCS codes, see Appendix Table 1) between January 1, 2012, and December 31, 2016, at which time they met lab-based criteria for the diagnosis of prediabetes. Prediabetes was defined by current American Diabetes Association (ADA) criteria, i.e., having no diagnosis of type 1 or type 2 diabetes on the problem list and one of the following within 12 months prior to the visit: hemoglobin A1c between 5.7 and 6.4% inclusive and/or fasting glucose (FG) between 100 and 125 mg/dL inclusive. Since labeling of fasting status may be incomplete, a glucose drawn at the same time as a lipid panel or triglycerides was considered as fasting. We did not use the 2-hour post glucose load criterion as it is rarely used in clinical practice for prediabetes. Patients were excluded if they had random (non-fasting) glucose greater than or equal to 200 mg/dL on two occasions within a 3-month period prior to the index visit. Women with documented pregnancy within 24 months of the index visit were also excluded. To ascertain development of diabetes, patients also had to have some clinical activity 3 years after the index visit. Eligibility criteria are detailed in Appendix Table 1.
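The lab-based part of this definition can be illustrated with a short sketch. It covers only the A1c and fasting glucose ranges stated above; the function and parameter names are hypothetical, and the age, problem-list, random glucose, pregnancy, and follow-up criteria are deliberately left out.

```python
from typing import Optional


def meets_prediabetes_lab_criteria(
    a1c_percent: Optional[float],
    fasting_glucose_mg_dl: Optional[float],
) -> bool:
    """Check the lab ranges quoted in the text (names are illustrative).

    A1c 5.7-6.4% inclusive and/or fasting glucose 100-125 mg/dL inclusive.
    A value of None means the lab was not available in the look-back window.
    """
    a1c_in_range = a1c_percent is not None and 5.7 <= a1c_percent <= 6.4
    fg_in_range = (
        fasting_glucose_mg_dl is not None
        and 100 <= fasting_glucose_mg_dl <= 125
    )
    return a1c_in_range or fg_in_range
```

A real cohort query would also need the 12-month look-back window and the rule that treats a glucose drawn with a lipid panel as fasting; neither is modeled here.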

Table 1. Cohort Characteristics

The DPP dataset was used to estimate treatment effect for metformin or the intensive lifestyle modification program. The design, rationale, outcomes, and loss to follow-up of the DPP have been described in detail elsewhere1,12. Briefly, inclusion criteria included a body mass index (BMI) of 24 or higher (22 or higher in Asians) and a fasting plasma glucose concentration of 95 to 125 mg/dL inclusive (impaired fasting glucose) and a concentration of 140 to 199 mg/dL inclusive two hours after a 75 g oral glucose load (impaired glucose tolerance). We note these criteria differ from the ADA’s current diagnostic criteria for prediabetes we used for the OLDW model; the ADA definition imposes no BMI requirement.13 The DPP participants were randomized to: 1) standard lifestyle recommendations plus 850 mg of metformin twice daily; 2) an intensive program of lifestyle modification that included 16 lessons with a case manager and set goals of at least a 7 percent weight loss and at least 150 minutes of physical activity per week; or 3) standard lifestyle recommendations plus placebo twice daily. After a median follow-up period of 2.8 (range 1.8–4.6) years, progression to diabetes was reduced by 58% (95% confidence interval, 47% to 66%) in the lifestyle modification arm and 31% (17% to 43%) in the metformin arm, both compared with the placebo arm1. The NIDDK Data Repository, from which we obtained data, includes 3081 of the 3234 DPP participants (95% of full population), as some local institutional review boards declined to participate in data distribution.

Outcome

For the OLDW cohort, the time-to-event outcome was defined as the time to the first patient encounter after the index visit with documented evidence of type 2 diabetes by any of the following criteria:14 diagnosis codes ICD-9 250.x0 or 250.x2 or ICD-10 E11.xx; pharmacotherapy or procedure for type 2 diabetes (as detailed in Appendix Table 1); A1c greater than 6.4%; fasting glucose (or presumed fasting, as noted above) greater than 125 mg/dL; 2-hour OGTT post-load glucose greater than 199 mg/dL. Lab-based criteria required confirmation by an additional lab in the diabetes range or by another method (i.e., diagnosis or medication). Follow-up time for patients who did not meet the outcome definition was censored at the earlier of the last observed encounter or the end of the study period.

Candidate predictors

A priori risk model predictors were identified by a systematic review conducted by Collins et al.15 We selected the following 11 independent variables that were included in at least 3 prior diabetes risk models and were judged to be easily and reliably obtainable in EHR data: age, gender, race, smoking status, BMI, presence or absence of a diagnosis of hypertension, systolic blood pressure, HDL cholesterol, triglycerides, fasting glucose, hemoglobin A1c (HgbA1c). Four variables included in 3 prior models were not considered based on the difficulty of ascertaining them in EHR data: physical activity, waist circumference, waist-to-hip ratio, and family history of diabetes.

Missing data

Missing data is a common limitation when working with EHR data.16 While multiple imputation may improve estimates of parameter effects under a missing-at-random assumption, it does not provide a practical means to cope with missingness in actual patients for whom a prediction needs to be made. Thus, we used missing indicator variables to capture the predictive effects of missingness under the assumption that future and prior missingness are similarly informative. For each predictor, an additional dichotomous variable indicated the presence of missing values. For continuous variables (e.g., BMI, HgbA1c), the missing value of the original variable was replaced by a fixed constant (the median) prior to model estimation, and the missing indicator variable appropriately adjusted for the “missing variable effect.” For categorical variables (e.g., race, smoking status), an additional level was added to define the missing category.
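The missing-indicator encoding described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names and the "missing" category label are assumptions, not the authors' implementation.

```python
from statistics import median
from typing import List, Optional, Tuple


def encode_continuous(
    values: List[Optional[float]],
) -> Tuple[List[float], List[int], float]:
    """Fill missing values with the observed median and flag them.

    Returns (filled values, missing indicators, fill constant).
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)  # fixed constant used in place of missing values
    filled = [fill if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator, fill


def encode_categorical(
    values: List[Optional[str]], missing_level: str = "missing"
) -> List[str]:
    """Add an extra category level that stands for a missing value."""
    return [missing_level if v is None else v for v in values]
```

Because the fill constant is returned, the value learned in the development data can be stored and reused when a prediction is made for a new patient with a missing measurement, instead of being recomputed.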

Model development

We used multivariable Cox proportional hazards regression to estimate the predicted probability of developing type 2 diabetes. We included two a priori interactions, race*BMI and race*HgbA1c, based upon clinical judgment and the literature.17,18 Model performance was assessed for discrimination and calibration. A bootstrap resampling procedure with 500 samples was used to internally validate the model, estimate optimism-corrected discrimination, and assess calibration.

Model validation

Using the equation derived in the development cohort, we calculated the predicted probability of developing type 2 diabetes for patients in the validation cohort. Model performance upon external validation was assessed for discrimination, using Harrell’s measure of concordance for censored response variables, and for calibration.19
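For readers unfamiliar with the concordance measure, a minimal sketch of Harrell's c for right-censored data follows. It is a quadratic-time illustration that ignores tied event times, not the routine used in the study, and the function and argument names are hypothetical.

```python
from typing import Sequence


def harrell_c(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's concordance index (simplified; tied times are skipped).

    A pair is comparable when the patient with the shorter follow-up
    had the event. It is concordant when that patient also has the
    higher predicted risk; tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of patients by time to diabetes onset.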

Estimating risk-specific treatment effects

To estimate the risk-based treatment effect for metformin pharmacotherapy or the DPP lifestyle modification, we performed a risk-based heterogeneity of treatment effect analysis on the DPP.20 The applicability of the OLDW model to the DPP data was anticipated to be limited by: differences between predictor variable definitions and measurement within a trial context vs. EHR data, differences in the pattern of missingness between these contexts (i.e., there was essentially no data missingness in the DPP), differences in patient enrollment in the two settings and differences in outcome definition and ascertainment.21 Thus, we refit the OLDW model to the DPP, using the same variables and interaction terms. Consistent with methodological recommendations22,23, all 3 DPP arms were used, since research has shown that overfitting to a control arm can induce spurious heterogeneity of treatment effects.24-26 The treatment effect was then estimated by incorporating this linear predictor into a Cox proportional hazards model with the following terms: treatment (metformin or DPP lifestyle modification), the linear predictor of risk from the refitted model, and (potentially) an interaction between these to account for important changes in relative risk reduction across different levels of baseline risk. Based on a previous analysis,8 we anticipated a risk-by-treatment interaction with metformin pharmacotherapy and a consistent relative effect with the DPP lifestyle modification, but we examined interactions for both treatment arms. We also performed a sensitivity analysis, examining the risk-by-treatment interactions, stratifying the DPP by the OLDW model without any refitting, and examining the distribution of predicted effects using this model.
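The model form described above can be written out as a sketch. The notation is ours rather than the authors': T is an indicator for the active treatment versus placebo, LP is the linear predictor of risk from the refitted model, h_0(t) is the baseline hazard, and the β terms are coefficients to be estimated.

```latex
\[
h(t \mid T, LP) = h_0(t)\,
  \exp\!\left(\beta_T\,T + \beta_{LP}\,LP + \beta_{T \times LP}\,T \cdot LP\right)
\]
\[
\mathrm{HR}(LP) = \frac{h(t \mid T = 1, LP)}{h(t \mid T = 0, LP)}
  = \exp\!\left(\beta_T + \beta_{T \times LP}\,LP\right)
\]
```

When the interaction coefficient is zero, the hazard ratio is the same at every level of baseline risk, which is the pattern anticipated for lifestyle modification. A nonzero interaction coefficient lets the hazard ratio change with baseline risk, which is the pattern anticipated for metformin.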

Incorporation of Decision Support in Electronic Health Record

In order to facilitate use in clinical decision making, and based on patient and provider focus groups and interviews, we implemented the model in two different ways: 1) a hard-coded calculation in an Allscripts EHR; 2) a cloud-hosted SMART on FHIR27 app that can be incorporated into any EHR, leveraging interoperability standards recently promulgated by the US Office of the National Coordinator for Health Information Technology (ONC).

Role of the Funding Source

This study was funded by the Patient-Centered Outcomes Research Institute (PCORI), which had no role in the conduct of this research or the decision to publish the results.

IRB Approval

This study was reviewed and approved by the Tufts Health Sciences Institutional Review Board prior to accessing the deidentified data from the DPP and OLDW datasets.

Results

Figure 1 shows the development of the derivation and validation OLDW datasets. Approximately 1.1 million people with prediabetes from the Northeast, South and West were included in the derivation cohort, and a similar number from the Midwest were included in the validation cohort. Characteristics of these cohorts are shown in Table 1.

Figure 1. CONSORT Diagram for OLDW derivation and validation cohort

Model development and validation: Risk stratification

The coefficients for each of the variables and interaction terms included in the model are shown in Table 2. The optimism-corrected c-statistic on the derivation sample was 0.73. When the model was tested on the validation cohort, the c-statistic was slightly higher, 0.76. Calibration on the validation cohort was very good (Figure 2). Among the 268,959 patients in the lowest-risk quartile, the predicted diabetes rate was 3.1% (95% confidence interval, 3.0 to 3.2%), while the observed rate was 1.8% (1.7 to 1.9%); among the 268,958 patients in the highest-risk quartile, the predicted diabetes rate was 19.2% (18.6 to 19.9%), while the observed rate was 19.6% (19.4 to 19.8%).

Table 2. Final model for incident diabetes
Figure 2. Calibration Curves

The figure on the left depicts the observed versus the predicted 3-year rate of developing diabetes in the 1.1 million patients in the derivation cohort (the Northeast, South and West regions) divided into equal-sized tenths. The figure on the right depicts the observed versus the predicted 3-year rate of developing diabetes in the 1.1 million patients in the validation cohort (the Midwest).

Calculation of relative treatment effects in the DPP Study

Prior work demonstrated a consistent relative treatment effect across risk groups with the DPP lifestyle modification and an increasing relative effect with progressively higher risk for metformin pharmacotherapy.8 Using the OLDW model refit to the DPP data (Appendix Table 2; c-statistic 0.719), we confirmed the absence of a treatment-by-risk interaction for lifestyle modification (p for interaction = 0.68); thus, we applied a constant relative risk reduction in the prediction model (HR = 0.43; 95% CI: 0.35 – 0.53) to estimate the diabetes outcome with lifestyle modification. We also confirmed the presence of a treatment-by-risk interaction with metformin pharmacotherapy (p for interaction = 0.003; using the continuous risk on the logit scale): low-risk patients had outcomes with metformin that were similar to usual care (in the lowest-risk quarter, observed HR = 1.1; 95% CI: 0.61 to 2.0), and high-risk patients had outcomes with metformin that were similar to the DPP lifestyle modification (in the highest-risk quarter, observed HR = 0.45; 95% CI: 0.35 to 0.59). Figure 3 shows observed and predicted benefits across quartiles for the DPP, for both lifestyle and metformin. A look-up table showing the relative risk reduction with metformin for each level of risk is shown in Appendix Table 3, truncated at a low value of 0% (no harm or benefit) and a high value of 60%.
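One way to see how a hazard ratio translates into an individual's predicted risk on treatment is the standard proportional hazards relation, under which survival on treatment equals survival under usual care raised to the power of the HR. The sketch below applies that relation; it is an assumption about how the conversion could be done, not a description of the deployed tool, and the function names are hypothetical.

```python
def treated_risk(baseline_risk: float, hazard_ratio: float) -> float:
    """Risk on treatment under proportional hazards.

    Uses S_treated(t) = S_baseline(t) ** HR, so
    risk_treated = 1 - (1 - baseline_risk) ** HR.
    """
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be a probability")
    if hazard_ratio <= 0.0:
        raise ValueError("hazard_ratio must be positive")
    return 1.0 - (1.0 - baseline_risk) ** hazard_ratio


def absolute_risk_reduction(baseline_risk: float, hazard_ratio: float) -> float:
    """Difference between usual-care risk and risk on treatment."""
    return baseline_risk - treated_risk(baseline_risk, hazard_ratio)
```

For example, `treated_risk(0.09, 0.43)` is about 0.040. With a constant HR, the absolute risk reduction grows with baseline risk, which is why a constant relative effect still yields larger absolute benefits in higher-risk patients.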

Figure 3. Observed and predicted treatment effects in the DPP Study Across Risk Groups

Black dot and bar is observed treatment effect. Red dot and bar is predicted treatment effect.

Figure 3 depicts the observed treatment effects (black dots) in patients in the DPP Study when patients are stratified into quarters based on their predicted risk for the DPP lifestyle modification intervention (left) and for metformin (right). Predicted effects across risk groups are shown in red. The top set of graphs displays relative effects and shows a consistency of effects across risk groups for lifestyle modification but heterogeneous treatment effects for metformin (p=0.003). The bottom graphs show effects on the absolute risk difference scale, which shows increasing benefits for higher risk patients for both interventions.

Distribution of risks and benefits in OLDW

The overall average 3-year predicted risk of developing diabetes for patients in the validation OLDW cohort was 9.0%, 3.9%, and 6.0% with usual care, the DPP lifestyle modification, and metformin, respectively. Predictions for the median patient in each quartile using the final model are shown in Table 2. For lifestyle modification, 53% of the total preventable cases of diabetes could be prevented by treating the 25% of patients at highest risk; 76% by treating the 50% at highest risk; and 91% by treating the 75% at highest risk. For metformin therapy, 73% of the total preventable cases could be prevented by treating the 25% of patients at highest risk; 93% by treating the 50% at highest risk; and 100% by treating the 75% at highest risk.
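The share of preventable cases captured by treating only the highest-risk patients can be computed from per-patient predictions along the following lines. This is a sketch under the assumption that each patient has a predicted risk with and without treatment; the function name and inputs are hypothetical.

```python
from typing import Sequence


def share_of_benefit_captured(
    baseline_risks: Sequence[float],
    treated_risks: Sequence[float],
    fraction_treated: float,
) -> float:
    """Fraction of total expected prevented cases obtained by treating
    only the given fraction of patients with the highest baseline risk."""
    if len(baseline_risks) != len(treated_risks):
        raise ValueError("inputs must have the same length")
    if not 0.0 <= fraction_treated <= 1.0:
        raise ValueError("fraction_treated must be between 0 and 1")
    # Sort patients from highest to lowest baseline risk, keeping each
    # patient's expected benefit (absolute risk reduction) alongside.
    pairs = sorted(
        zip(baseline_risks, treated_risks), key=lambda p: p[0], reverse=True
    )
    benefits = [baseline - treated for baseline, treated in pairs]
    total = sum(benefits)
    if total <= 0.0:
        raise ValueError("no preventable cases in this population")
    n_treated = round(fraction_treated * len(benefits))
    return sum(benefits[:n_treated]) / total
```

Calling the function with `fraction_treated` set to 0.25, 0.50, and 0.75 on a cohort's predictions would give the kind of percentages reported above.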

Sensitivity analysis

Direct application of the OLDW model (not refit) on the DPP showed moderately diminished discrimination (c-statistic = 0.68). There was no risk-by-treatment interaction with lifestyle (p = 0.69). The risk-by-treatment interaction with metformin was qualitatively similar to that with the refit model (p = 0.08), and the distribution of predicted benefits with this model was also similar. For lifestyle modification, 53% of the total cases of preventable diabetes could be prevented by treating the 25% of patients at highest risk; 76% by treating the 50% at highest risk. For metformin therapy, 65% of the total cases of preventable diabetes could be prevented by treating the 25% of patients at highest risk; 86% by treating the 50% at highest risk.

Implementation of the final model

Figure 4 shows the user interface of the SMART app in an EHR. Predictions are generated automatically based on the data available and retrieved from the patient’s record, using appropriate indicators in the model for missingness where necessary.

Figure 4. User interface for decision support tool

Discussion

We present an EHR-compatible model to predict diabetes onset, using 11 variables routinely collected in clinical practice. A major strength of the risk model is that it was derived on the OLDW, which reflects people with prediabetes defined by the most commonly used ADA criteria, from heterogeneous EHRs and more than 30 US healthcare systems. The risk model derived in 3 US Census regions performed very well in a geographically distinct cohort. Compatible risk-specific estimates of treatment effect were then obtained directly from the DPP. By prioritizing care based on the risk of diabetes, this “hybrid” model might help optimize the efficiency of diabetes prevention: Treating just the highest-risk half of people with prediabetes would capture 77% of the benefit of population-wide lifestyle modification or 93% of the benefit of population-wide metformin pharmacotherapy. This is important because lifestyle programs are resource-intensive and require a high level of commitment from the patient. Pharmacotherapy is not without adverse effects, and over-treatment should be avoided, especially in low-risk patients who do not appear to benefit.

The issue of how to address prediabetes has grown in importance as broader diabetes screening has been recommended and promoted.13,28 For every patient with diabetes identified, screening identifies 6 patients with prediabetes; health systems are thus confronted with a growing number of patients who have prediabetes, without the capacity to treat everybody, while reserving limited resources for improving cardiometabolic control in patients with diabetes. While the ADA has lowered the A1c and FG thresholds to define prediabetes,9,29 some have argued that medicalizing prediabetes and defining an ever-growing proportion of the population as diseased is of dubious value.5 Most patients who are classified as prediabetic do not develop diabetes even in a decade, and risks of developing end organ damage are low for those developing diabetes later in life.30 Risk stratification offers an approach that promises to focus resources specifically on those who are likely to benefit. While our prior research results provided proof-of-concept that risk stratification could help providers and health systems prioritize these patients,8 the present EHR-compatible model is designed to be used at point of care, and it has been incorporated into the EHRs at several locations in the US.

A longstanding concern regarding limitations of randomized clinical trial results is that they might not be applicable when there is non-random selection into the trial and when treatment effects are heterogeneous31. Here, for example, we found that the “real world” at-risk population was at substantially lower overall risk than patients included in the DPP, and that treatment effects were risk-dependent. The lower overall risk in the OLDW cohort is the result of multiple factors, including: 1) different inclusion criteria for the DPP (including a high BMI and elevated 2-hour glucose after a 75-gram glucose load); 2) differences in the distribution of risk variables; 3) different outcome ascertainment, which is substantially more rigorous in the trial setting. Cross-design synthesis has been proposed as a means of addressing the potential problems of external validity of trial evidence by combining the strengths of both designs—observational designs to capture the full range of patients and randomized trials for unbiased treatment effects32,33.

While several different methods for cross-design synthesis have been proposed34,35, all approaches depend on the ability to adjust results based on patient characteristics across designs. A seldom-discussed barrier is that variable definitions and ascertainment can differ considerably between clinical trial data and routinely collected observational data. Our approach was designed to address these barriers in a pragmatic way, by estimating risk-specific treatment effects in the clinical trial using the same set of variables as used in the observational risk model. This approach was driven in part by our novel aim, to predict effects in patients in clinical care, based on individual patient characteristics, rather than estimating average treatment effects in broad target populations.

A related issue that has received limited attention is how to deploy clinical prediction models in an EHR. There is a proliferation of clinical prediction models; use of routinely collected EHR data to automatically generate individual patient predictions is an appealing approach to disseminate these into the clinic. However, most published clinical prediction models are developed on research cohorts or clinical trials. Predictor variables collected in a trial are not consistently and rigorously captured in the EHR. Recent work has highlighted that heterogeneity in predictor measurement across different settings can substantially degrade model performance.21,36 Finally, use of trial or registry data cannot yield a model robust to missing values in the EHR database used for clinical prediction, since the pattern of missingness present across research and EHR environments is expected to differ. The usual approaches addressing potential bias arising from missingness (e.g., multiple imputation) are not designed to cope with missingness in variables used to generate predictions. These issues guided our decision to derive separate models in the EHR and trial setting, using a common set of variables that were well ascertained in both settings.

Limitations

The methods we used for “cross-walking” between the two very different types of data (trial and EHR real world data) potentially introduce estimation error. Ideally, individualized treatment effects would be estimated on databases that combine the advantages of these different data sources: unbiased effect estimates through randomization; meticulous outcome ascertainment; consistency of predictors across derivation and implementation populations, and large, heterogeneous populations. Improving the quality of data collection in routine care and integrating randomized trials into routine care37-39 may narrow the gap between trial and “real-world” data. Despite these limitations, we obtained qualitatively consistent risk-stratified results in the DPP regardless of which risk model was used: consistency of relative treatment effects of lifestyle modification across all levels of risk and heterogeneous relative treatment effects with metformin, with much stronger relative effects in higher-risk patients.

Conclusion

While the number of people in the US who have prediabetes and qualify for diabetes prevention programs could potentially overwhelm health care systems, these patients have substantial variation in their risk of developing diabetes and in their likelihood of benefiting from prevention therapies. Incorporation of a tool into the EHR to support automated risk stratification of patients in routine clinical care—by predicting individualized benefits—can support shared decision-making and prioritize those patients who are most likely to benefit, where capacity might be limited.

Data Availability

The data that support the findings of this study are available from OptumLabs but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of OptumLabs.

Funding/Acknowledgements

This work was supported by a Patient-Centered Outcomes Research Institute (PCORI) contract (DI-1604-35234). David Kent is the guarantor for this manuscript.

Bibliography

  1. Knowler WC, Barrett-Connor E, Fowler SE, et al. Reduction in the incidence of type 2 diabetes with lifestyle intervention or metformin. The New England journal of medicine. 2002;346(6):393–403.
  2. Centers for Disease Control and Prevention (CDC). Prediabetes - Your Chance to Prevent Type 2 Diabetes. https://www.cdc.gov/diabetes/basics/prediabetes.html. Published 2020. Accessed 16 Oct 2020.
  3. Chen L, Magliano DJ, Zimmet PZ. The worldwide epidemiology of type 2 diabetes mellitus--present and future perspectives. Nat Rev Endocrinol. 2011;8(4):228–236.
  4. Herman WH, Zimmet P. Type 2 Diabetes: An Epidemic Requiring Global Attention and Urgent Action. Diabetes care. 2012;35(5):943–944.
  5. Yudkin JS, Montori VM. The epidemic of pre-diabetes: the medicine and the politics. BMJ (Clinical research ed). 2014;349:g4485.
  6. Moin T, Li J, Duru OK, et al. Metformin Prescription for Insured Adults With Prediabetes From 2010 to 2012: A Retrospective Cohort Study. Annals of internal medicine. 2015;162(8):542–548.
  7. Balk EM, Earley A, Raman D, Avendano EA, Pittas AG, Remington PL. Combined Diet and Physical Activity Promotion Programs to Prevent Type 2 Diabetes Among Persons at Increased Risk: A Systematic Review for the Community Preventive Services Task Force. Ann Intern Med. 2015;163(6):437–451.
  8. Sussman JB, Kent DM, Nelson JP, Hayward RA. Improving diabetes prevention with benefit based tailored treatment: risk based reanalysis of Diabetes Prevention Program. BMJ (Clinical research ed). 2015;350:h454.
  9. American Diabetes Association (ADA). Diagnosis and classification of diabetes mellitus. Diabetes care. 2010;33:s62–69.
  10. Watson J, Hutyra CA, Clancy SM, et al. Overcoming barriers to the adoption and implementation of predictive modeling and machine learning in clinical care: what can we learn from US academic medical centers? JAMIA Open. 2020;3(2):167–172.
  11. Wallace E, Johansen ME. Clinical Prediction Rules: Challenges, Barriers, and Promise. Annals of family medicine. 2018;16(5):390–392.
  12. Diabetes Prevention Program Research Group. Design and methods for a clinical trial in the prevention of type 2 diabetes. Diabetes Care. 1999;22(4):623–634.
  13. American Diabetes Association (ADA). Standards of Medical Care in diabetes--2014. Diabetes care. 2014;37 Suppl 1:S14–80.
  14. National Committee for Quality Assurance (NCQA). National Committee for Quality Assurance (NCQA) Healthcare Effectiveness Data and Information Set (HEDIS) Comprehensive Diabetes Care. Washington, D.C. 2015.
  15. 15.↵
    Collins GS, Mallett S, Omar O, Yu LM. Developing Risk Prediction Models for Type 2 Diabetes: A Systematic Review of Methodology and Reporting. BMC medicine. 2011;9(103).
  16. 16.↵
    Wells BJ, Chagin KM, Nowacki AS, Kattan MW. Strategies for Handling Missing Data in Electronic Health Record Derived Data. In: EGEMS (Wash DC). Vol 1. Washington, D.C.: EGEMS; 2013:1035.
  17. 17.↵
    Beck RW, Riddlesworth TD, Ruedy K, et al. Continuous Glucose Monitoring Versus Usual Care in Patients With Type 2 Diabetes Receiving Multiple Daily Insulin Injections: A Randomized Trial Ann Intern Med. 2017;167(6):365–374.
    OpenUrl
  18. 18.↵
    Zhu Y, Sidell MA, Arterburn D, et al. Racial/Ethnic Disparities in the Prevalence of Diabetes and Prediabetes by BMI: Patient Outcomes Research To Advance Learning (PORTAL) Multisite Cohort of Adults in the U.S. Diabetes care. 2019;42(12):2211–2219.
    OpenUrlAbstract/FREE Full Text
  19. 19.↵
    Harrell FE. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. Springer-Verlag New York; 2001.
  20. 20.↵
    Kent DM, Steyerberg EW, van Klaveren D. Personalized evidence based medicine: predictive approaches to heterogeneous treatment effects. BMJ (Clinical research ed). 2018;363:k4245.
    OpenUrlAbstract/FREE Full Text
  21. 21.↵
    Luijken K, Groenwold RHH, Van Calster B, Steyerberg EW, van Smeden M. Impact of Predictor Measurement Heterogeneity Across Settings on the Performance of Prediction Models: A Measurement Error Perspective. Statistics in medicine. 2019;38(18):3444–3459.
    OpenUrl
  22. 22.↵
    Kent DM, Paulus JK, van Klaveren D, et al. The Predictive Approaches to Treatment effect Heterogeneity (PATH) Statement. Ann Intern Med. 2020;172(1):35–45.
    OpenUrl
  23. 23.↵
    Kent DM, van Klaveren D, Paulus JK, et al. The Predictive Approaches to Treatment effect Heterogeneity (PATH) Statement: Explanation and Elaboration. Ann Intern Med. 2020;172(1):W1–W25.
    OpenUrl
  24. 24.↵
    Abadie A, Chingos M, West M. Endogenous Stratification in Randomized Experiments. Review of Economics and Statistics. 2018;100(4):567–580.
    OpenUrl
  25. 25.
    Burke JF, Hayward RA, Nelson JP, Kent DM. Using internally developed risk models to assess heterogeneity in treatment effects in clinical trials. Circulation Cardiovascular quality and outcomes. 2014;7(1):163–169.
    OpenUrlAbstract/FREE Full Text
  26. 26.↵
    van Klaveren D, Balan TA, Steyerberg EW, Kent DM. Models with interactions overestimated heterogeneity of treatment effects and were prone to treatment mistargeting. Journal of clinical epidemiology. 2019;114:72–83.
    OpenUrl
  27. 27.↵
    Mandel JC, Kreda DA, Mandl KD, Kohane IS, Ramoni RB. SMART on FHIR: a standards-based, interoperable apps platform for electronic health records. Journal of the American Medical Informatics Association: JAMIA. 2016;23(5):899–908.
    OpenUrlCrossRefPubMed
  28. 28.↵
    Siu AL. Screening for Abnormal Blood Glucose and Type 2 Diabetes Mellitus: U.S. Preventive Services Task Force Recommendation Statement. Annals of internal medicine. 2015;163(11):861–868.
    OpenUrlCrossRefPubMed
  29. 29.↵
    Genuth S, Alberti KG, Bennett P, et al. Follow-up report on the diagnosis of diabetes mellitus. Diabetes care. 2003;26:3160–3170.
    OpenUrlFREE Full Text
  30. 30.↵
    Vijan S, Sussman JB, Yudkin JS, Hayward RA. Effect of patients’ risks and preferences on health gains with plasma glucose level lowering in type 2 diabetes mellitus. JAMA internal medicine. 2014;174(8):1227–1234.
    OpenUrl
  31. 31.↵
    Longford NT. Selection Bias and Treatment Heterogeneity in Clinical Trials. Statistics in Medicine. 1999;18(12):1467–1474.
    OpenUrlCrossRefPubMedWeb of Science
  32. 32.↵
    Droitcour J, Silberman G, Chelimsky E. Cross-design Synthesis: A New Form of Meta-analysis for Combining Results from Randomized Clinical Trials and Medical-practice Databases. International Jounral of Technology Assessment in Health Care. 1993;9(3):440–449.
    OpenUrl
  33. 33.↵
    Kaizar EE. Estimating treatment effect via simple cross design synthesis. Statistics in Medicine. 2011;30(25):2986–3009.
    OpenUrlCrossRefPubMed
  34. 34.↵
    Cole SR, Stuart EA. Generalizing Evidence From Randomized Clinical Trials to Target Populations. American journal of epidemiology. 2010;172(1):107–115.
    OpenUrlCrossRefPubMedWeb of Science
  35. 35.↵
    Varadhan R, Henderson NC, Weiss CO. Cross-design synthesis for extending the applicability of trial evidence when treatment effect is heterogeneous: Part I. Methodology. Communications in Statistics: Case Studies, Data Analysis and Applications. 2017;2(3-4):112–126.
    OpenUrl
  36. 36.↵
    Luijken K, Wynants L, van Smeden M, et al. Changing Predictor Measurement Procedures Affected the Performance of Prediction Models in Clinical Examples. Journal of clinical epidemiology. 2020;119:7–18.
    OpenUrl
  37. 37.↵
    Vickers AJ, Scardino PT. The clinically-integrated randomized trial: proposed novel method for conducting large trials at low cost. In: Trials. Vol 10.2009:14.
  38. 38.
    Simon KC, Tideman S, Hillman L, et al. Design and Implementation of Pragmatic Clinical Trials Using the Electronic Medical Record and an Adaptive Design. JAMIA open. 2018;1(1):99–106.
    OpenUrl
  39. 39.↵
    van Staa TP, Dyson L, McCann G, et al. The Opportunities and Challenges of Pragmatic Point-Of-Care Randomised Trials Using Routinely Collected Electronic Records: Evaluations of Two Exemplar Trials. Health technology assessment (Winchester, England). 2014;18(43):1–146.
    OpenUrl
Posted January 08, 2021.

Subject Area

  • Endocrinology (including Diabetes Mellitus and Metabolic Disease)