ABSTRACT
BACKGROUND Whether a patient benefits from psychotherapy is arguably the outcome of a complex process, and heterogeneous information extracted from process, genetic, demographic, and clinical data could contribute to predicting remission status after psychotherapy. This study applied supervised machine learning to such multi-modal baseline data to predict remission in patients with major depressive disorder (MDD) after completed psychotherapy.
METHODS Eight hundred ninety-four genotyped adult patients (65.5% women, age range 18-75 years) diagnosed with MDD and treated with guided Internet-based Cognitive Behaviour Therapy (ICBT) at the Internet Psychiatry Clinic in Stockholm between 2008 and 2016 were included. Predictor variables from multiple domains were available: demographic, clinical, process (e.g. time to complete online questionnaires), and genetic (polygenic risk scores for depression, education and more). The outcome was remission status post-ICBT (cut-off ≤10 on MADRS-S). Data were split into training (60%) and validation (40%) sets based on treatment start date. Predictor selection employed human domain knowledge followed by Recursive Feature Elimination. Model derivation was internally validated through repeated cross-validation resampling. The final random forest model was externally validated against (i) a null, (ii) a logit, (iii) an XGBoost, and (iv) a blended meta-ensemble model on the hold-out validation set. Model transparency was explored through partial dependence and Local Interpretable Model-agnostic Explanations (LIME) analysis.
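As a rough illustration of two steps described above, the following sketch labels remission with the MADRS-S cut-off and splits patients 60/40 by treatment start date. It is not the study's code: the record layout, the field names `start_date` and `post_madrs_s`, and the assumption that the earlier 60% of patients form the training set are all hypothetical.

```python
from datetime import date

# Remission is defined in the abstract as a post-treatment MADRS-S score <= 10.
MADRS_S_REMISSION_CUTOFF = 10


def is_remitted(post_madrs_s: int) -> bool:
    """Return True if the post-treatment MADRS-S score meets the remission cut-off."""
    return post_madrs_s <= MADRS_S_REMISSION_CUTOFF


def temporal_split(records, train_fraction=0.6):
    """Split records by treatment start date.

    Assumption: the earliest `train_fraction` of patients go to the training
    set and the remaining, later patients to the validation set.
    `records` is a list of dicts with a hypothetical "start_date" key.
    """
    ordered = sorted(records, key=lambda r: r["start_date"])
    n_train = round(len(ordered) * train_fraction)
    return ordered[:n_train], ordered[n_train:]


# Hypothetical toy records, for illustration only.
toy_records = [
    {"start_date": date(2016 - i, 3, 1), "post_madrs_s": 4 + 2 * i}
    for i in range(10)
]
train_set, validation_set = temporal_split(toy_records)
train_labels = [is_remitted(r["post_madrs_s"]) for r in train_set]
```

Splitting on start date rather than at random means the validation patients all began treatment after the training patients, which resembles how a model would be applied prospectively in a clinic.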
RESULTS Feature selection retained 45 predictors representing all four predictor types. With unseen validation data, the final random forest model proved reasonably accurate at classifying post-ICBT remission (Accuracy 0.656 [0.604, 0.705], P vs null model = 0.004; AUC 0.687 [0.631, 0.743]). It performed slightly better than the logit model, although the difference did not reach conventional statistical significance (bootstrap D = 1.730, P = 0.084), and no better than XGBoost (D = 0.463, P = 0.643). Transparency analysis showed model usage of all predictor types at both the group and individual patient level.
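For readers unfamiliar with how an accuracy estimate, its interval, and a comparison against a null model can be obtained, the sketch below computes an exact (Clopper-Pearson) binomial confidence interval and a one-sided exact binomial test. The abstract does not state which interval or test was used, so both choices are assumptions. The counts (235 correct of 358) and the null rate (0.58) are hypothetical values chosen for illustration and are not taken from the study.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability mass, computed in log space for numerical stability."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )
    return math.exp(log_pmf)


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); this tail is increasing in p."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def _solve_increasing(f, target, iterations=100):
    """Bisection for the p in (0, 1) at which an increasing function f equals target."""
    lo, hi = 1e-12, 1 - 1e-12
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for a proportion of k successes in n trials."""
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: upper_tail(k, n, p), alpha / 2)
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: upper_tail(k + 1, n, p), 1 - alpha / 2)
    return lower, upper


# Hypothetical inputs, for illustration only.
n_correct, n_total, null_rate = 235, 358, 0.58
accuracy = n_correct / n_total
ci_low, ci_high = exact_ci(n_correct, n_total)
# One-sided test: probability of at least this many correct under the null rate.
p_vs_null = upper_tail(n_correct, n_total, null_rate)
```

Here the null rate stands in for the accuracy of always predicting the majority class; a small `p_vs_null` would suggest the classifier does better than that baseline.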
CONCLUSION A new, multi-modal classifier for predicting MDD remission status after ICBT treatment in routine psychiatric care was derived and empirically validated. The multi-modal approach to predicting remission may inform tailored treatment, and deserves further investigation.
One-sentence Summary Predicting remission of depression in adults after psychotherapy
Competing Interest Statement
Conflict of interest: Professor Mataix-Cols reports receiving personal fees from UpToDate, Inc and Elsevier, both unrelated to the current work.
Funding Statement
JW and CR gratefully acknowledge funding from the Soderstrom-Konig Foundation (SLS-941192, JW), FORTE (2018-00221, CR), and the Swedish Research Council (2018-02487, CR). MB and VK gratefully acknowledge the Stockholm County Council (funding through the Swedish Medical Training and Research Agreement (ALF) (SLL20170708) and infrastructure via the Internet Psychiatry Clinic), the Erling-Persson Family Foundation, and the Swedish Research Council (2016-01961). MB is partially funded by the WASP (Wallenberg Autonomous Systems and Software Program).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The present study complies with the Declaration of Helsinki and was granted ethics approval by the Regional Ethics Board in Stockholm (dnr 2009/1089-31/2 & 2014/1897-31).
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The data include sensitive personal information and are not freely available by default. Data can be made available to external researchers upon reasonable request and following proper procedure, including ethical approval of the requesting study and adherence to the limits of patients' informed consent.