medRxiv

Improved Type 2 Diabetes Risk Stratification in the Qatar Biobank Cohort by Ensemble Learning Classifier Incorporating Multi-Trait, Population-Specific, Polygenic Risk Scores

Ikhlak Ahmed, Mubarak Ziab, Shahrad Taheri, Odette Chagoury, Sura A. Hussain, Jyothi Lakshmi, Ajaz A. Bhat, Khalid A. Fakhro, Ammira S. Al-Shabeeb Akil
doi: https://doi.org/10.1101/2023.06.23.23291830
Ikhlak Ahmed
1Laboratory of Precision Medicine of Diabetes Prevention, Genetic and Metabolic Diseases Program, Sidra Medicine, P.O. Box 26999, Doha, Qatar
Mubarak Ziab
1Laboratory of Precision Medicine of Diabetes Prevention, Genetic and Metabolic Diseases Program, Sidra Medicine, P.O. Box 26999, Doha, Qatar
Shahrad Taheri
2Qatar Metabolic Institute, Hamad Medical Corporation, P.O. Box 3050, Doha, Qatar
Odette Chagoury
2Qatar Metabolic Institute, Hamad Medical Corporation, P.O. Box 3050, Doha, Qatar
Sura A. Hussain
1Laboratory of Precision Medicine of Diabetes Prevention, Genetic and Metabolic Diseases Program, Sidra Medicine, P.O. Box 26999, Doha, Qatar
Jyothi Lakshmi
1Laboratory of Precision Medicine of Diabetes Prevention, Genetic and Metabolic Diseases Program, Sidra Medicine, P.O. Box 26999, Doha, Qatar
Ajaz A. Bhat
1Laboratory of Precision Medicine of Diabetes Prevention, Genetic and Metabolic Diseases Program, Sidra Medicine, P.O. Box 26999, Doha, Qatar
Khalid A. Fakhro
1Laboratory of Precision Medicine of Diabetes Prevention, Genetic and Metabolic Diseases Program, Sidra Medicine, P.O. Box 26999, Doha, Qatar
3Laboratory of Genomic Medicine, Genetic and Metabolic Diseases Program, Sidra Medicine, P.O. Box 26999, Doha, Qatar
4College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, P.O. Box 34110, Doha, Qatar
5Department of Genetic Medicine, Weill Cornell Medical College, Doha, P.O. Box 24144, Doha, Qatar
Ammira S. Al-Shabeeb Akil
1Laboratory of Precision Medicine of Diabetes Prevention, Genetic and Metabolic Diseases Program, Sidra Medicine, P.O. Box 26999, Doha, Qatar
3Laboratory of Genomic Medicine, Genetic and Metabolic Diseases Program, Sidra Medicine, P.O. Box 26999, Doha, Qatar
  • For correspondence: aakil{at}sidra.org

ABSTRACT

Background Type 2 Diabetes (T2D) is a pervasive chronic disease influenced by a complex interplay of environmental and genetic factors. Genetic information can improve T2D risk prediction, and polygenic risk scores (PRS) offer a promising tool for assessing individual genetic risk. This study compares multi-trait and single-trait PRS models and shows how incorporating multi-trait PRS into risk prediction models can improve the accuracy of T2D risk assessment.

Methods We conducted genome-wide association studies (GWAS) on 12 distinct T2D-related traits within a cohort of 14,278 individuals, all sequenced under the Qatar Genome Programme (QGP). The analysis identified several novel genetic variants associated with T2D, which served as the basis for constructing multiple weighted PRS models. We then applied machine learning (ML) techniques to assess the cumulative risk from these predictors.
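
As an illustration of what a weighted PRS is, the sketch below computes one score per trait as the sum of effect-size-weighted allele dosages and then standardizes each trait's scores so they can be used together as model features. It is a minimal sketch, not the authors' pipeline: the trait names, variant IDs, weights, and dosages are hypothetical placeholders, and real weights would come from each trait's GWAS summary statistics.

```python
from typing import Dict, List

Dosages = Dict[str, float]   # variant ID -> effect-allele count (0, 1 or 2)
Weights = Dict[str, float]   # variant ID -> GWAS effect size (beta)


def weighted_prs(dosages: Dosages, weights: Weights) -> float:
    """Weighted PRS: sum of beta * dosage over variants present in both maps."""
    return sum(beta * dosages[v] for v, beta in weights.items() if v in dosages)


def standardize(scores: List[float]) -> List[float]:
    """Z-score the raw scores so PRS from different traits share one scale."""
    mean = sum(scores) / len(scores)
    sd = (sum((s - mean) ** 2 for s in scores) / len(scores)) ** 0.5
    return [(s - mean) / sd if sd > 0 else 0.0 for s in scores]


# Hypothetical per-trait weights (assumption, not taken from the study).
trait_weights: Dict[str, Weights] = {
    "trait_A": {"var1": 0.12, "var2": -0.05, "var3": 0.30},
    "trait_B": {"var2": 0.20, "var4": 0.10},
}

# Hypothetical genotype dosages for three individuals.
cohort: Dict[str, Dosages] = {
    "person_1": {"var1": 2, "var2": 0, "var3": 1, "var4": 1},
    "person_2": {"var1": 0, "var2": 2, "var3": 0, "var4": 0},
    "person_3": {"var1": 1, "var2": 1, "var3": 2, "var4": 2},
}

# One raw PRS per (trait, person); each person's scores across traits
# form the multi-trait feature vector passed to a downstream classifier.
raw_prs = {
    trait: {pid: weighted_prs(d, w) for pid, d in cohort.items()}
    for trait, w in trait_weights.items()
}
z_prs = {
    trait: dict(zip(scores, standardize(list(scores.values()))))
    for trait, scores in raw_prs.items()
}
```

Standardizing within each trait is one common choice for putting scores on a comparable scale before model fitting; the abstract does not state which scaling, if any, was used.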

Results We identified genetic variants associated with T2D risk and built ML models that integrate PRS predictors for a comprehensive risk evaluation. The top-performing ML model achieved an accuracy of 0.8549, an area under the ROC curve (AUC) of 0.92, an area under the precision-recall curve (AUC-PR) of 0.8522, and an F1 score of 0.757, reflecting its strong capacity to differentiate cases from controls. We are currently working on acquiring independent T2D cohorts to validate the efficacy of our final model.
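
For readers unfamiliar with the four reported metrics, the following sketch shows how accuracy, F1, AUC, and AUC-PR can be computed from true labels and predicted probabilities. It is a plain-Python illustration on made-up data, not the authors' evaluation code: the 0.5 decision threshold is an assumption, AUC is computed as the probability that a case outranks a control, and AUC-PR is approximated by average precision (which assumes no tied scores).

```python
from typing import List, Tuple


def accuracy_f1(y_true: List[int], y_score: List[float],
                threshold: float = 0.5) -> Tuple[float, float]:
    """Accuracy and F1 after thresholding predicted probabilities."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = len(y_true) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return (tp + tn) / len(y_true), f1


def roc_auc(y_true: List[int], y_score: List[float]) -> float:
    """AUC as the fraction of case/control pairs ranked correctly (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: List[int], y_score: List[float]) -> float:
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    tp, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank   # precision at each recalled case
    return total / sum(y_true)


# Made-up labels (1 = case, 0 = control) and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

acc, f1 = accuracy_f1(y_true, y_score)
auc = roc_auc(y_true, y_score)
auc_pr = average_precision(y_true, y_score)
```

Accuracy and F1 depend on the chosen threshold, whereas AUC and AUC-PR summarize ranking quality across all thresholds, which is why the four numbers can differ noticeably for the same model.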

Conclusion Our research underscores the potential of PRS models in identifying individuals within the population who are at elevated risk of developing T2D and its associated complications. The use of multi-trait PRS and ML models for risk prediction could inform early interventions, potentially identifying T2D patients who stand to benefit most based on their individual genetic risk profile. This combined approach signifies a stride forward in the field of precision medicine, potentially enhancing T2D risk prediction, prevention, and management.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This research was funded by the Path Towards Precision Medicine (PPM) of Qatar National Research Fund (QNRF), a subsidiary of Qatar Research Development and Innovation Council (QRDI), and by the Sidra Medicine Research Division, under grant numbers PPM 03-0311-190017 and SDR1000043, respectively. Author A.S.A. is the project lead principal investigator, budget holder, and recipient of the research funding support from both organizations.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

QBB Cohort Study QF-QBB-RES-ACC-0075. QBB IRB Approval number: Full Board-2017-QF-QBB-RES-ACC-0075-0023. Deidentified research conducted under the project E-2020-QF-QBB-RES-ACC-0150-0143. QBB IRB Approval number: E-2017-QF-QBB-RES-ACC-0026-000. Sidra IRB MOPH Assurance: IRB-A-Sidra-2019-0020. Sidra IRB MOPH Registration: IRB-Sidra-2020-009. Sidra IRB DHHS Assurance: FWA00022378. Sidra IRB DHHS Registration: IRB00009930. Sidra Medicine, IRB protocol # 1660756, May 30, 2024. Approval category: Expedited.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  • * This work is supported by the PPM award (PPM 03-0311-190017) from QNRF to Dr. Ammira Al-Shabeeb Akil.

  • No changes except the removal of one author, at his request, due to his involvement in another, conflicting project.

Data Availability

Data described in the manuscript, including all relevant raw data, will be freely available to any scientist wishing to use them for non-commercial purposes without breaching participant confidentiality. Requests should be sent directly to the corresponding author.

  • ABBREVIATIONS

    T2D: Type 2 Diabetes
    PRS: polygenic risk scores
    GWAS: genome-wide association studies
    QGP: Qatar Genome Programme
    ML: machine learning
    AUC: Area under the ROC Curve
    SNPs: single nucleotide polymorphisms
    MENA: Middle East and North Africa
    SVM: Support Vector Machine
    QBB: Qatar Biobank cohort
    HMC: Hamad Medical Corporation
    WGS: Whole genome sequencing
    HDL: high-density lipoprotein cholesterol
    LDL: low-density lipoprotein cholesterol
    TSH: Thyroid stimulating hormone
    TGL: Triglycerides
    INT: Inverse Normal Transformation
    HWE: Hardy-Weinberg equilibrium
    LMM: linear mixed-effect model
    FTO: Fat mass and obesity associated
    MC4R: Melanocortin 4 Receptor
    TMEM18: Transmembrane Protein 18
    WHR: Waist-Hip Ratio
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
    Posted September 05, 2023.
    medRxiv 2023.06.23.23291830; doi: https://doi.org/10.1101/2023.06.23.23291830

    Subject Area

    • Genetic and Genomic Medicine