Abstract
Cardiovascular disease (CVD) remains a leading global health threat, responsible for one in five deaths worldwide. Early detection is critical to mitigate morbidity and mortality, yet traditional diagnostic methods often rely on reactive clinical assessments, missing opportunities for preventive intervention. In this study, a machine learning (ML) ecosystem is developed to enhance CVD diagnosis through two key approaches: (1) an early warning system using non-clinical, self-reported features for accessible risk stratification, and (2) specialized diagnostic models integrating clinical and non-clinical data. The framework leverages advanced ML techniques, including tabular neural networks (TabNet, TabPFN) and ensemble methods (XGBoost, Random Forest), validated on multi-regional datasets. SHapley Additive exPlanations (SHAP) analysis identified ECG-related features as the dominant predictors of CVD risk, with ST-segment slope (+0.93) and ST depression (+0.63) exhibiting the strongest effects. Counterfactual explanations from the non-clinical model further revealed actionable preventive measures: reducing exercise-induced angina and chest pain severity, alongside increasing exercise heart rate, could shift predictions from diseased to healthy, highlighting the model’s utility for lifestyle interventions. To address ethical and clinical trustworthiness, interpretability tools (SHAP, counterfactuals), fairness mitigation (Fairlearn), and uncertainty quantification (Bayesian Neural Networks) are incorporated. Causal inference identified key predictors and their Average Treatment Effects (ATEs), such as exercise-induced angina (ATE: 0.36) and ST slope (ATE: 0.33), informing a hybrid ensemble model that achieved 89% accuracy while reducing dimensionality. The system aligns with the FDA Good Machine Learning Practice principles and the EU guidelines for Trustworthy AI, offering a scalable solution for early detection and equitable diagnosis.
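For readers unfamiliar with the ATE figures quoted above, the following is a minimal illustrative sketch of the simplest possible estimator, a difference in mean outcomes between exposed and unexposed groups. It is not the authors' method: the abstract does not state which causal estimator was used, the function name `naive_ate` is hypothetical, the toy data are invented for illustration, and a plain difference in means ignores confounding that a real causal analysis would need to adjust for.

```python
from statistics import mean


def naive_ate(treated_flags, outcomes):
    """Difference in mean outcome between treated (1) and untreated (0) rows.

    Assumption: no confounding adjustment is made, so this equals the ATE
    only under randomized (or as-if randomized) exposure.
    """
    treated = [y for t, y in zip(treated_flags, outcomes) if t == 1]
    control = [y for t, y in zip(treated_flags, outcomes) if t == 0]
    if not treated or not control:
        raise ValueError("need at least one treated and one control row")
    return mean(treated) - mean(control)


# Hypothetical toy data (NOT from the study):
# exposure = exercise-induced angina present, outcome = CVD diagnosis.
exercise_angina = [1, 1, 1, 1, 0, 0, 0, 0]
has_cvd = [1, 1, 1, 0, 1, 0, 0, 0]

ate = naive_ate(exercise_angina, has_cvd)
print(f"naive ATE estimate: {ate:.2f}")
```

On this reading, an ATE of 0.36 for exercise-induced angina would correspond to a 36 percentage point higher probability of a CVD diagnosis attributable to that exposure, under whatever identification assumptions the full paper adopts.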
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The datasets are available at: https://archive.ics.uci.edu/dataset/45/heart+disease
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
irfan.sadi.dhrubo{at}g.bracu.ac.bd
The manuscript has undergone significant structural and narrative changes between the previous version and the new version. First, the title has been simplified. The old title listed specific components such as "Robust ML Ecosystem" and "Causal Inference", whereas the new title is more concise, focusing on an "Interpretable and Responsible AI Framework". Visually, the new paper adds two diagrams that were missing from the old version. Figure 1 now illustrates the complete methodology workflow, distinguishing between the Clinical and Non-Clinical streams. Figure 2 visualizes the data acquisition and merging process across the five sources. The most substantial content change is the treatment of synthetic data. The old version included a section titled "Tabular Models Drawback On Synthetic Data" (Section 4.4.1), which detailed how models failed when trained on Gaussian Copula-augmented data. The new version removes this section entirely. Instead, it frames synthetic data generation (using SMOTE and CTGAN) as a successful tool for class balancing and bias mitigation, rather than focusing on the limitations of high-volume augmentation. The presentation of the experimental design has also been restructured. The new paper organizes results into Experiment A (Generalization Test) and Experiment B (Final Modeling). This distinction is summarized in the new Table 2, which compares the performance of models trained on Cleveland data with that of models trained on the multi-region dataset. This table was absent from the old version. Finally, the text on fairness has been refined. While the old version discussed fairness metrics, the new version integrates these findings more cohesively into the results section, linking them directly to the bias mitigation strategies employed in the final model development.





