Abstract
Objectives To understand between-hospital variation in thrombolysis use among patients in England and Wales who arrive at hospital within 4 hours of stroke onset.
Design Machine learning was applied to the Sentinel Stroke National Audit Programme (SSNAP) data set, to learn which patients in each hospital would likely receive thrombolysis.
Setting All hospitals (n=132) providing emergency stroke care in England and Wales. Thrombolysis use in patients arriving within 4 hours of known or estimated stroke onset ranged from 7% to 49% between hospitals.
Participants 88,928 stroke patients recorded in the national stroke audit who arrived at hospital within 4 hours of stroke onset, from 2016 to 2018.
Intervention Extreme Gradient Boosting (XGBoost) machine learning models, coupled with a SHAP model for explainability.
Main Outcome Measures Shapley (SHAP) values, providing estimates of how patient features, and hospital identity, influence the odds of receiving thrombolysis.
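As a minimal sketch of how a SHAP value can be read as an effect on odds, the snippet below assumes that SHAP values are expressed in log odds (the usual scale for a gradient-boosted binary classifier; this abstract does not state the scale). Under that assumption, SHAP contributions add on the log-odds scale, so a difference between two SHAP values corresponds to a multiplicative change in odds. The function names and the numbers are hypothetical and are not taken from the study.

```python
import math


def odds_fold_change(shap_low: float, shap_high: float) -> float:
    """Fold change in odds between two SHAP values given in log odds.

    Assumption: SHAP values are on the log-odds scale, so a difference
    in SHAP corresponds to a ratio of odds.
    """
    return math.exp(shap_high - shap_low)


def probability_from_log_odds(base_log_odds: float, shap_values: list) -> float:
    """Predicted probability from a base value plus additive SHAP values.

    Assumption: the model output in log odds is the base value plus the
    sum of the per-feature SHAP values.
    """
    total = base_log_odds + sum(shap_values)
    return 1.0 / (1.0 + math.exp(-total))


# Hypothetical example: a feature whose SHAP value ranges from -1.1 to +1.1
# log odds changes the odds by a factor of exp(2.2), roughly 9-fold.
fold = odds_fold_change(-1.1, 1.1)
print(f"Fold change in odds: {fold:.1f}")

# Hypothetical patient: base log odds of -1.0 and three feature contributions.
prob = probability_from_log_odds(-1.0, [0.8, -0.3, 0.5])
print(f"Predicted probability: {prob:.2f}")
```

Reading SHAP values this way is what allows a feature's influence to be summarised as an n-fold change in the odds of receiving thrombolysis.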
Results The XGBoost/SHAP model revealed that the odds of receiving thrombolysis fell 9-fold over the first 120 minutes of arrival-to-scan time, varied 30-fold with stroke severity, fell 3-fold with an estimated rather than precise stroke onset time, fell 6-fold with increasing pre-stroke disability, fell 4-fold with onset during sleep, fell 5-fold with use of anticoagulants, fell 2-fold between 80 and 110 years of age, fell 3-fold between 120 and 240 minutes of onset-to-arrival time, and varied 13-fold between hospitals. The hospital attended explained 56% of the variance in between-hospital thrombolysis use; adding other hospital processes raised this to 74%. The patient population alone explained 36%, and the combined information from patient population and hospital processes explained 95%. Patient SHAP values expose how suitable a patient is considered for thrombolysis. Hospital SHAP values expose the threshold at which patients are likely to receive thrombolysis.
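To illustrate what "variance explained" means here, the sketch below computes the coefficient of determination (R squared) for a simple linear fit of each hospital's thrombolysis use against a single hospital-level predictor. This is an assumption made for illustration: the abstract does not describe how the reported percentages were calculated, and the function name and all data values are hypothetical.

```python
def r_squared(x: list, y: list) -> float:
    """R squared of an ordinary least squares line fitted to y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Least squares slope and intercept
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: one value per hospital.
hospital_shap = [-0.9, -0.4, -0.1, 0.2, 0.6, 1.0]   # hospital SHAP (log odds)
thrombolysis_use = [0.12, 0.22, 0.21, 0.33, 0.31, 0.45]  # proportion treated

print(f"Variance explained: {r_squared(hospital_shap, thrombolysis_use):.0%}")
```

An R squared of 0.56 from such a fit would correspond to the statement that the predictor explains 56% of the between-hospital variance.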
Conclusions Using explainable machine learning, we have identified that the majority of the between-hospital variation in thrombolysis use in England and Wales, for patients arriving with time to thrombolyse, may be explained by differences in in-hospital processes and differences in attitudes to judging suitability for thrombolysis.
Competing Interest Statement
The authors have declared no competing interest.
Clinical Protocols
https://github.com/samuel-book/samuel_shap_paper_1
Funding Statement
This report is independent research funded by the National Institute for Health Research Applied Research Collaboration South West Peninsula and by the National Institute for Health Research Health and Social Care Delivery Research (HSDR) Programme [NIHR134326]. The views expressed in this publication are those of the authors and not necessarily those of the National Institute for Health Research or the Department of Health and Social Care.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study used national clinical audit data, collected by the Sentinel Stroke National Audit Programme. The NHS Health Research Authority decision tool (http://www.hra-decisiontools.org.uk/research/) was used to confirm that ethical approval was not required to access the data. No identifiable patient or hospital information was provided in the data, and hospital names were anonymised. Governance of the data and access to the data was authorised by the Healthcare Quality Improvement Partnership (HQIP, reference HQIP303).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
Data are not available.