medRxiv

Advancing Cardiovascular Disease Diagnosis with an Interpretable and Responsible AI Framework

Kazi Sakib Hasan, Irfan Sadi Dhrubo
doi: https://doi.org/10.1101/2025.06.17.25329798
Kazi Sakib Hasan
1School of Data and Sciences, BRAC University, Kha 224, Bir Uttam Rafiqul Islam Ave, Dhaka 1212, Bangladesh
  • For correspondence: simanto.alt{at}gmail.com kazi.sakib.hasan{at}g.bracu.ac.bd
Irfan Sadi Dhrubo
2School of Data and Sciences, BRAC University, Kha 224, Bir Uttam Rafiqul Islam Ave, Dhaka 1212, Bangladesh

Abstract

Cardiovascular disease (CVD) remains a leading global health threat, responsible for one in five deaths worldwide. Early detection is critical to mitigate morbidity and mortality, yet traditional diagnostic methods often rely on reactive clinical assessments, missing opportunities for preventive intervention. In this study, a machine learning (ML) ecosystem is developed to enhance CVD diagnosis through two key approaches: (1) an early warning system using non-clinical, self-reported features for accessible risk stratification, and (2) specialized diagnostic models integrating clinical and non-clinical data. The framework leverages advanced ML techniques, including tabular neural networks (TabNet, TabPFN) and ensemble methods (XGBoost, Random Forest), validated on multi-regional datasets. Shapley Additive Explanations (SHAP) analysis identified ECG-related features as dominant predictors of CVD risk, with ST-segment slope (+0.93) and ST depression (+0.63) exhibiting the strongest effects. Counterfactual explanations from the non-clinical model further revealed actionable preventive measures: reducing exercise-induced angina and chest pain severity, alongside increasing exercise heart rate, could shift predictions from diseased to healthy, highlighting the model’s utility for lifestyle interventions. To address ethical and clinical trustworthiness, interpretability tools (SHAP, counterfactuals), fairness mitigation (FairLearn), and uncertainty quantification (Bayesian Neural Networks) are incorporated. Causal inference identified key predictors and their Average Treatment Effects (ATEs) such as exercise-induced angina (ATE: 0.36) and ST slope (ATE: 0.33), informing a hybrid ensemble model that achieved 89% accuracy while reducing dimensionality. The system aligns with FDA Good ML Practices and EU Trustworthy AI guidelines, offering a scalable solution for early detection and equitable diagnosis.
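The SHAP attributions mentioned in the abstract rest on Shapley values. The block below is a minimal sketch of how an exact Shapley value can be computed for one prediction by enumerating feature coalitions. It is not the authors' pipeline: the toy linear scoring function, its weights, the three-feature input, and the all-zero baseline are hypothetical and chosen only for illustration. Exact enumeration is practical only for a handful of features, and libraries such as SHAP rely on faster approximations or model-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance via coalition enumeration.

    Features outside a coalition are replaced by their baseline value,
    which is one common (assumed) way to "remove" a feature.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Build an input that uses x for coalition members, baseline otherwise.
        z = [x[j] if j in coalition else baseline[j] for j in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size (excluding feature i).
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model: a linear risk score over three made-up features.
TOY_WEIGHTS = [2.0, -1.0, 0.5]


def toy_score(z):
    return sum(w * v for w, v in zip(TOY_WEIGHTS, z))


x = [1.0, 3.0, 4.0]          # hypothetical instance to explain
baseline = [0.0, 0.0, 0.0]   # hypothetical reference input
phi = shapley_values(toy_score, x, baseline)

for name, contribution in zip(["feature_a", "feature_b", "feature_c"], phi):
    print(f"{name}: {contribution:+.3f}")
```

For a linear model with this kind of baseline, each attribution reduces to the weight times the feature's offset from the baseline, and the attributions sum to the difference between the prediction for `x` and the prediction for the baseline. That makes the sketch easy to check by hand.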

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study did not receive any funding.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The datasets are available in: https://archive.ics.uci.edu/dataset/45/heart+disease

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  • irfan.sadi.dhrubo{at}g.bracu.ac.bd

  • The manuscript has undergone significant structural and narrative changes between the oldest version and the new version. First, the title has been simplified: the old title listed specific components such as Robust ML Ecosystem and Causal Inference, whereas the new title is more concise, focusing on an Interpretable and Responsible AI Framework. Second, the new version adds two diagrams that were missing from the old version. Figure 1 illustrates the complete methodology workflow, distinguishing between the Clinical and Non-Clinical streams, and Figure 2 visualizes the data acquisition and merging process across the five sources. Third, the most substantial content change is the treatment of synthetic data. The old version included a section titled "Tabular Models Drawback On Synthetic Data" (Section 4.4.1), which detailed how models failed when trained on Gaussian Copula-augmented data. The new version removes this irrelevant section entirely and instead frames synthetic data generation (using SMOTE and CTGAN) as a successful tool for class balancing and bias mitigation, rather than focusing on the limitations of high-volume augmentation. Fourth, the presentation of the experimental design has been restructured: results are now organized into Experiment A (Generalization Test) and Experiment B (Final Modeling). This distinction is summarized in the new Table 2, which compares the performance of models trained on Cleveland data with that of models trained on the multi-region dataset; this table was absent from the old version. Finally, the text on fairness has been refined. While the old version discussed fairness metrics, the new version integrates these findings more cohesively into the results section, linking them directly to the bias mitigation strategies used in the final model development.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted December 19, 2025.
Subject Area

  • Health Informatics