Abstract
Background and aim Mortality risk stratification is vital for targeted intervention. This study aimed to build prediction models of all-cause mortality among elderly Chinese adults using both regression and machine learning methods, and to compare the performance of the machine learning models with that of the regression models in predicting mortality. It also aimed to rank the predictors of mortality within the different models and to compare the predictive value of different groups of predictors using the best-performing model.
Method We used data from the Healthy Ageing and Biomarkers Cohort Study (HABCS), a sub-study of the Chinese Longitudinal Healthy Longevity Survey (CLHLS). The baseline survey of HABCS was conducted in 2008; it covered domains similar to those investigated by CLHLS and shared its sampling strategy. Follow-up surveys were conducted every 2-3 years until 2018.
The analysis sample included 2,448 participants from HABCS. We used a total of 117 predictors to build the survival prediction models: 61 questionnaire, 41 biomarker and 15 genetic predictors. Four models were built: XG-Boost, random survival forest (RSF), Cox regression with all variables, and Cox-backward. We used the C-index and the integrated Brier score (the Brier score for the two-year mortality prediction model) to evaluate the performance of these models.
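For readers unfamiliar with the C-index, the following is a minimal illustrative sketch of Harrell's concordance index for right-censored data. It is not the code used in this study; the function name `concordance_index` and the toy inputs are assumptions made for illustration, and pairs with tied observed times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    times       -- observed follow-up time for each participant
    events      -- 1/True if death was observed, 0/False if censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the participant with the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b]:
            continue  # simplification: tied times are skipped
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical toy data, not drawn from HABCS.
    times = [2, 4, 6, 8]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.7, 0.5, 0.1]
    print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why small differences in C-index between models are read as small differences in discrimination.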
Results The XG-Boost and RSF models showed slightly better predictive performance than the Cox and Cox-backward models in predicting survival, based on the C-index and integrated Brier score. Age, activities of daily living and Mini-Mental State Examination score were identified as the top three predictors in the XG-Boost and RSF models. Biomarker and questionnaire predictors had similar predictive value, while genetic predictors had no additive predictive value when combined with questionnaire or biomarker predictors.
Conclusion This work shows that machine learning techniques can be a useful tool for mortality prediction, and that they slightly outperformed the regression models in predicting survival.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
No external funding was received.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Ethics approval for the HABCS study was obtained from the Research Ethics Committees of Peking University and Duke University. All participants or their legal representatives signed written consent forms in the baseline and follow-up surveys.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The data that support the findings of this study are available on request from the corresponding author, Xurui Jin. The data are not publicly available because they contain information that could compromise the privacy of research participants.