Abstract
Background: Accurately predicting short-term mortality is important for optimizing healthcare resource allocation, developing risk-reducing interventions, and improving end-of-life care. Moreover, short-term mortality risk reflects individual frailty and can serve as a digital aging marker. Previous studies have focused on specific, high-risk populations. Predicting all-cause mortality in an unselected population while incorporating both health and socioeconomic factors has direct public health relevance but requires careful fairness considerations.
Methods: We developed a deep learning model to predict 1-year mortality using nationwide longitudinal data from the Finnish population (N = 5.4 million), including >8,000 features and spanning up to 50 years. We used the area under the receiver operating characteristic curve (AUC) as a primary metric to assess model performance and fairness.
Findings: The model achieved an AUC of 0.944 with strong calibration, outperforming a baseline model that included only age and sex (AUC = 0.897). The model generalized well to different causes of death (AUC > 0.800 for 45 out of 50 causes), including COVID-19, which was not present in the training data. The model performed best in young females and worst in older males (AUC = 0.910 vs. AUC = 0.718). Extensive fairness analyses revealed that individuals belonging to multiple disadvantaged groups had the worst model performance, and that this was not explained by age and sex differences, reduced healthcare contact, or smaller training set sizes within these groups.
Conclusion: A deep learning model based on nationwide longitudinal multi-modal data accurately identified short-term mortality risk and holds potential for developing a population-wide in-silico aging marker. Unfairness in model predictions represents a major challenge to the equitable integration of these approaches in public health interventions.
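To make the evaluation idea concrete, the following is a minimal sketch of how AUC can be computed overall and within subgroups, which is the kind of comparison the fairness analysis above relies on. It is not the authors' pipeline: the function names `auc` and `auc_by_group`, the toy labels, scores, and group tags are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined unless both classes are present
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_by_group(labels, scores, groups):
    """Compute AUC separately within each subgroup."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in buckets.items()}


# Toy data (hypothetical): 1 = died within one year, 0 = survived.
labels = [1, 0, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.3, 0.1, 0.6, 0.7, 0.5]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = auc(labels, scores)
per_group = auc_by_group(labels, scores, groups)
print(overall, per_group)
```

A gap between subgroup AUCs signals that the model ranks risk less reliably in one group than in another. The pairwise formulation used here is quadratic in the number of cases, so a rank-based implementation would be preferable for population-scale data.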
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 101016775, from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant number 945733), and from an Academy of Finland fellowship (grant number 323116).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The FinRegistry project has received IRB approval from the National Institute of Health and Welfare (Kokous 7/2019). The FinRegistry project has received the following approvals for data access from the National Institute of Health and Welfare (THL/1776/6.02.00/2019 and subsequent amendments), Digital and Population Data Services Agency (DVV, VRK/5722/2019-2), Finnish Center for Pension (ETK/SUTI 22003) and Statistics Finland (TK-53-1451-19).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
Access to FinRegistry data can be obtained by submitting a data permit application for individual-level data to Findata, the Finnish social and health data permit authority (https://asiointi.findata.fi/). The application includes information on the purpose of data use; the requested data, including the variables, definitions for the target and control groups, and external datasets to be combined with FinRegistry data; the dates of the data needed; and a data utilization plan. Requests are evaluated on a case-by-case basis. Once approved, the data are sent to the secure computing environment Kapseli and can be accessed within the European Economic Area (EEA) and in countries with an adequacy decision from the European Commission.