TY - JOUR
T1 - Prediction models with survival data: a comparison between machine learning and the Cox proportional hazards model
JF - medRxiv
DO - 10.1101/2022.03.29.22273112
SP - 2022.03.29.22273112
AU - Audinga-Dea Hazewinkel
AU - Hans Gelderblom
AU - Marta Fiocco
Y1 - 2022/01/01
UR - http://medrxiv.org/content/early/2022/04/02/2022.03.29.22273112.abstract
N2 - Recent years have seen increased interest in using machine learning (ML) methods for survival prediction, chiefly using big datasets with mixed datatypes and/or many predictors. Model comparisons have frequently been limited to performance measure evaluation, with the chosen measure often suboptimal for assessing survival predictive performance. We investigated ML model performance in an application to osteosarcoma data from the EURAMOS-1 clinical trial (NCT00134030). We compared the performance of survival neural networks (SNN), random survival forests (RSF) and the Cox proportional hazards model. Three performance measures suitable for assessing survival model predictive performance were considered: the C-index, and the time-dependent Brier and Kullback-Leibler scores. Comparisons were also made on predictor importance and patient-specific survival predictions. Additionally, the effect of ML model hyperparameters on performance was investigated. All three models had comparable performance as assessed by the C-index and the Brier and Kullback-Leibler scores, with the Cox model and SNN also comparable in terms of relative predictor importance and patient-specific survival predictions. RSFs tended to accord less importance to predictors with uneven class distributions and to predict clustered survival curves, the latter a result of tuning hyperparameters that influence forest shape through restrictions on terminal node size and tree depth.
SNNs were comparatively more sensitive to hyperparameter misspecification, with decreased regularization resulting in inconsistent predicted survival probabilities. We caution against using RSF for predicting patient-specific survival, as standard model tuning practices may result in aggregated predictions, which is not reflected in performance measure values, and recommend performing multiple reruns of SNNs to verify prediction consistency.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This work was supported by the Dutch Foundation KiKa (Stichting Kinderen Kankervrij), grant 163, through the project "Meta-analysis of individual patient data to investigate dose-intensity relation with survival outcome for osteosarcoma patients". AH was supported by the Integrative Epidemiology Unit, which receives funding from the UK Medical Research Council and the University of Bristol (MC_UU_00011/3).
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Permission to recruit patients to the protocol was provided by the appropriate national and local regulatory and local committees. Written informed consent was obtained from all trial participants or legal guardians before beginning protocol therapy. The protocol was approved by the Local Research Ethics Committee (LREC) in Gent (coordinating Ethics Committee for Euramos, Belgium), the Central Committee on Research Involving Human Subjects (CCMO) in the Netherlands, the LREC in Leiden (the Netherlands), and the Multi-Centre Research Ethics Committee (MREC) and LREC in the UK.
Link anonymized data were used for the purposes of this study, and the use of the data was in accordance with the consent taken.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes.
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes.
Access to the clinical trial data used to support the findings of this study is restricted. Access to the EURAMOS-1 trial data may be requested from the MRC Clinical Trials Unit (CTU, London; mrcctu.euramos{at}ucl.ac.uk), who will submit the data application to the Coordinating Data Center (CDC, London), with access subject to review by the Trial Management Group (TMG) and the Trial Steering Committee (TSC).
ER -