Abstract
Aims Population datasets are increasingly used to study type 1 or type 2 diabetes and to inform clinical practice. However, correctly classifying diabetes type in insulin-treated individuals in population datasets is challenging. Many different approaches have been proposed, ranging from simple age or BMI cut-offs to complex algorithms, and the optimal approach is unclear. We aimed to compare the performance of approaches for classifying insulin-treated diabetes for research studies, evaluated against two independent biological definitions of diabetes type.
Method We compared the accuracy of thirteen reported approaches for classifying insulin-treated diabetes into type 1 and type 2 diabetes in two population cohorts with diabetes: UK Biobank (UKBB, n=26,399) and DARE (n=1,296). Overall accuracy and predictive values for classifying type 1 and type 2 diabetes were assessed using: 1) a type 1 diabetes genetic risk score and genetic stratification method (UKBB); 2) C-peptide measured at >3 years' diabetes duration (DARE).
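The evaluation described above can be illustrated with a minimal sketch. It assumes each participant has both a classified type and a per-individual reference type (closer to the C-peptide comparison in DARE than to the genetic stratification used in UKBB, which is not reproduced here). The function name and the label coding (1 = type 1, 2 = type 2) are illustrative assumptions, not taken from the study.

```python
from typing import Dict, Optional, Sequence


def classification_metrics(
    predicted: Sequence[int], reference: Sequence[int]
) -> Dict[str, Optional[float]]:
    """Overall accuracy and predictive values for a type 1 / type 2 classifier.

    Assumed coding (hypothetical): 1 = type 1 diabetes, 2 = type 2 diabetes.
    """
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("predicted and reference must be non-empty and equal in length")

    pairs = list(zip(predicted, reference))
    correct = sum(1 for p, r in pairs if p == r)

    # Reference labels of everyone the approach classified as type 1 / type 2.
    ref_when_pred_t1 = [r for p, r in pairs if p == 1]
    ref_when_pred_t2 = [r for p, r in pairs if p == 2]

    def proportion(labels: list, target: int) -> Optional[float]:
        # Predictive value is undefined if nobody was assigned to that class.
        return labels.count(target) / len(labels) if labels else None

    return {
        "accuracy": correct / len(pairs),
        "predictive_value_type1": proportion(ref_when_pred_t1, 1),
        "predictive_value_type2": proportion(ref_when_pred_t2, 2),
    }


if __name__ == "__main__":
    # Toy data for illustration only, not study data.
    print(classification_metrics([1, 1, 2, 2, 1], [1, 2, 2, 2, 1]))
```

The predictive value for each type is the proportion of participants assigned to that type who also have that type under the reference definition.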
Results Accuracy of approaches ranged from 71% to 88% in UKBB and from 68% to 88% in DARE. All approaches were improved by combining them with a requirement for early insulin treatment (<1 year from diagnosis). When classifying all participants, combining early insulin requirement with a type 1 diabetes probability model incorporating continuous clinical features (diagnosis age and BMI only) consistently achieved high accuracy (UKBB 87%, DARE 85%). Self-reported diabetes type alone had high accuracy (UKBB 87%, DARE 88%) but was available in just 15% of UKBB participants. For identifying type 1 diabetes with minimal misclassification, models with high thresholds or young age at diagnosis (<20 years) had the highest performance. An online tool developed from all UKBB findings allows the optimum approach of those tested to be selected based on variable availability and the research aim.
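As a rough sketch of how a combined rule of this kind might be applied in a dataset, the function below requires both early insulin treatment (<1 year from diagnosis, as stated above) and a model probability at or above a chosen threshold before assigning type 1 diabetes. The model itself is not reproduced: the probability is taken as a precomputed input, and the function name, parameter names, and the default threshold of 0.5 are hypothetical choices for illustration, not values reported in the study.

```python
def classify_insulin_treated(
    model_probability_type1: float,
    years_from_diagnosis_to_insulin: float,
    probability_threshold: float = 0.5,  # hypothetical default, not from the study
    early_insulin_years: float = 1.0,    # "<1 year from diagnosis" per the abstract
) -> str:
    """Combine early insulin requirement with a type 1 probability threshold.

    model_probability_type1 is assumed to come from a separately fitted
    clinical-features model (diagnosis age and BMI); it is not computed here.
    """
    if not 0.0 <= model_probability_type1 <= 1.0:
        raise ValueError("model_probability_type1 must be between 0 and 1")
    if years_from_diagnosis_to_insulin < 0:
        raise ValueError("years_from_diagnosis_to_insulin must be non-negative")

    early_insulin = years_from_diagnosis_to_insulin < early_insulin_years
    if early_insulin and model_probability_type1 >= probability_threshold:
        return "type 1"
    return "type 2"


if __name__ == "__main__":
    # Illustrative inputs only.
    print(classify_insulin_treated(0.9, 0.2))   # early insulin, high probability
    print(classify_insulin_treated(0.9, 5.0))   # insulin started late
    print(classify_insulin_treated(0.6, 0.2, probability_threshold=0.9))
```

Raising `probability_threshold` corresponds to the high-threshold use case, where the aim is to identify type 1 diabetes with minimal misclassification at the cost of labelling fewer participants as type 1.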
Conclusion Self-reported diagnosis and models combining continuous features with early insulin requirement are the most accurate methods of classifying insulin-treated diabetes in research datasets without measured classification biomarkers.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The Diabetes Alliance for Research in England (DARE) study was funded by the Wellcome Trust and supported by the Exeter NIHR Clinical Research Facility.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
1. Cohort name: Diabetes Alliance for Research in England (formerly Exeter Research Alliance for Diabetes)
2. Non-abbreviated, full name of Ethics Committee / Institutional Review Board (IRB) that assessed the ethics for the DARE cohort: Devon & Torbay Research Ethics Committee. REC Ref: 2002/7/118
3. Decision made by ethics oversight body: Approved
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
UK Biobank data are available through a procedure described at http://www.ukBiobank.ac.uk/using-the-resource/. DARE data are available through application to the Peninsula Research Bank https://exetercrfnihr.org/about/exeter-10000-prb/