A Phase-2 RCT of a Voice-based AI Coach for Depression and Anxiety

Thomas Kannampallil; Olusola A. Ajilore; Joshua M. Smyth; Amruta Barve; Corina R. Ronneberg; Nan Lv; Vikas Kumar; Claudia Garcia; Gbenga Aborisade; Nancy E. Wittels; Zhengxin Tang; Bharathi Chinnakotla; Lan Xiao; Jun Ma

doi:10.64898/2025.12.22.25342792

Abstract

Background Clinical evidence regarding artificial intelligence (AI) mental health interventions remains limited. This phase 2 trial investigated the mechanisms and efficacy of a rule-based AI coach, Lumen, delivering problem-solving therapy (PST) via voice for adults with clinically significant symptoms of depression and/or anxiety.

Methods Participants were randomized to Lumen (n=100), human-coached PST (n=50), or waitlist control (n=50) for 18 weeks. PST was delivered by Lumen on Amazon’s Alexa platform via voice or by a human coach via videoconferencing in 4 weekly and then 4 biweekly sessions. Change in activation of the right dorsolateral prefrontal cortex (dlPFC) for cognitive control, measured using functional neuroimaging, was the primary mechanistic target measure. Patient-reported measures included changes in behavior associated with cognitive control (Social Problem-solving Index-Revised Short Form) and in clinical symptoms (Hospital Anxiety and Depression Scale). Statistical analyses used t-tests and ordinary least squares regression.
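As a minimal sketch of how the two named methods relate, the snippet below shows that an ordinary least squares fit of a change score on a 0/1 arm indicator yields a slope equal to the between-group difference in mean change, which is the quantity a two-sample t-test evaluates. The numbers are hypothetical and are not trial data, the helper name `ols_slope_intercept` is illustrative, and the abstract does not state which covariates (for example, baseline values) the trial's regression models included.

```python
from statistics import mean

def ols_slope_intercept(x, y):
    """One-predictor OLS fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical change-from-baseline scores (illustrative only, NOT trial data).
control_change = [0.5, -1.0, 0.0, 1.5, -0.5]
treated_change = [1.0, 2.5, 0.5, 3.0, 1.5]

# Arm indicator: 0 = control, 1 = treated.
group = [0] * len(control_change) + [1] * len(treated_change)
change = control_change + treated_change

slope, intercept = ols_slope_intercept(group, change)
mean_diff = mean(treated_change) - mean(control_change)

# The slope on the arm indicator is the between-group mean difference,
# and the intercept is the mean change in the control arm.
print(slope, mean_diff, intercept)
```

With a binary predictor and no covariates, the regression and the pooled-variance t-test give the same point estimate; regression becomes the more flexible choice once adjustment variables are added.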

Findings Participants had a mean age of 36.6 years (SD=11.9), and were 77% women, 25% Black, 29% Latino, and 21% Asian. At 18 weeks, change from baseline in right dlPFC activity did not differ significantly by treatment arm. Compared with waitlist control, Lumen-coached participants had significantly greater improvements from baseline to 18 weeks in overall problem-solving ability (between-group mean difference=1.04, 95% CI [0.23, 1.84]) and in symptoms of psychological distress due to depression and anxiety (between-group mean difference=−3.56, 95% CI [−5.69, −1.43]). Lumen- and human-coached PST did not differ significantly for any of these measures, and improved problem-solving ability mediated reductions in psychological distress for both modalities. One serious adverse event involving hospitalization, unrelated to the study, was detected.
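The reported intervals can be read back into approximate standard errors and test statistics. The sketch below assumes a symmetric, normal-approximation 95% confidence interval; the trial's intervals were presumably derived from t-tests or regression models, so the recovered values are approximations rather than the authors' figures. The function name `se_and_z_from_ci` is illustrative.

```python
from statistics import NormalDist

def se_and_z_from_ci(estimate, lower, upper, level=0.95):
    """Back out an approximate SE, z statistic, and two-sided p-value
    from a point estimate and its confidence interval (normal approximation)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p

# Values as reported in the Findings paragraph (Lumen vs. waitlist control).
problem_solving = se_and_z_from_ci(1.04, 0.23, 1.84)
distress = se_and_z_from_ci(-3.56, -5.69, -1.43)

for label, (se, z, p) in [("problem-solving", problem_solving),
                          ("psychological distress", distress)]:
    print(f"{label}: SE ~ {se:.2f}, z ~ {z:.2f}, two-sided p ~ {p:.4f}")
```

Both intervals exclude zero, which is consistent with the statement that the Lumen arm improved significantly more than waitlist control on these two measures.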

Interpretation A rule-based AI coach delivering PST via voice may improve problem-solving abilities and clinical symptoms among adults with clinically significant depression and anxiety. However, these findings are preliminary; further research is warranted to confirm clinical efficacy and to elucidate neural mechanisms.

Funding R33MH119237

Evidence before this study The evidence base for artificial intelligence (AI)-based conversational agents (or “chatbots”) for depression and anxiety remains limited. Recent systematic reviews and meta-analyses of randomized clinical trials have generally found these interventions to be modestly effective, although the overall certainty of evidence is low due to substantial heterogeneity and methodological limitations across studies. Moreover, most prior trials have evaluated text-based chatbot interactions, whereas voice-based AI approaches capable of supporting spoken exchanges resembling clinician-patient conversations during therapy remain understudied. For this study, we searched PubMed for randomized clinical trials evaluating voice-based AI chatbots for depression and anxiety for the period 2014 to May 13, 2026, without language restrictions, using the following search term: (voice OR vocal OR speak) AND (chatbot OR coach OR agent OR “artificial intelligence” OR “AI”) AND (depression OR depressive OR anxiety). This search returned 24 articles; after initial screening of abstracts, 3 relevant studies remained, including the phase 1 pilot trial for this study.
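For readers who want to rerun a comparable search programmatically, the sketch below builds a request URL for the NCBI E-utilities `esearch` endpoint using the quoted search term and date range. It is an assumption that a scripted query is equivalent to the authors' search; they may have used the PubMed web interface, the quoted term contains no article-type filter, and result counts will change as PubMed is updated. The `retmax` value and the helper name `build_esearch_url` are illustrative. No network request is made.

```python
from urllib.parse import urlencode, urlparse, parse_qs

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Search term as quoted in the text (straight quotes used for the query).
SEARCH_TERM = (
    '(voice OR vocal OR speak) AND '
    '(chatbot OR coach OR agent OR "artificial intelligence" OR "AI") AND '
    '(depression OR depressive OR anxiety)'
)

def build_esearch_url(term, mindate, maxdate, retmax=100):
    """Return an esearch URL limited to a publication-date range."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",   # filter on publication date
        "mindate": mindate,   # mindate and maxdate must be given together
        "maxdate": maxdate,
        "retmax": retmax,     # illustrative cap on returned IDs
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"

url = build_esearch_url(SEARCH_TERM, "2014", "2026/05/13")

# Round-trip check: the encoded term decodes back to the original string.
decoded = parse_qs(urlparse(url).query)
print(url)
```

Screening the returned abstracts, as described above, remains a manual step that this sketch does not attempt.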

Added value of this study To our knowledge, this is the first 3-arm randomized clinical trial evaluating a rule-based AI coach (“Lumen”) delivering problem-solving therapy (PST), a brief cognition-focused psychotherapy, using a commercial voice-based conversational platform. A total of 200 participants with untreated, clinically significant depression and/or anxiety were randomized to Lumen-coached PST, human-coached PST, or waitlist control. Although Lumen did not significantly alter the primary neural target related to cognitive control, secondary measures of patient-reported problem-solving ability and psychological distress improved significantly over 18 weeks compared with waitlist control. Exploratory noninferiority tests showed that treatment effects of Lumen- and human-coached PST did not differ significantly on these outcomes. Improved problem-solving mediated reduced psychological distress for both modalities, supporting the theory-based mechanism of PST.

Implications of the available evidence Consistent with emerging evidence supporting the therapeutic potential of AI interventions for depression and anxiety, the present findings suggest that voice-based, AI-driven PST may benefit adults with clinically significant depression and/or anxiety who are not receiving care. Further adequately powered trials are needed to confirm clinical noninferiority to human-coached PST, determine the durability of treatment effects, and clarify the neural and behavioral mechanisms underlying treatment response.

Competing Interest Statement

Dr. Jun Ma serves as an editor for an Oxford University Press journal, outside of this work. Dr. Olusola A. Ajilore is the co-founder of Keywise AI and serves on the advisory boards of Blueprint Health and Embodied Labs. Dr. Thomas Kannampallil serves as an editor for an Elsevier journal and as an unpaid member of the research advisory group for Abridge Inc, outside of this work. The other authors report no conflicts of interest.

Clinical Trial

NCT05603923

Funding Statement

This work was supported by the National Institute of Mental Health (NIMH) [grant number R33MH119237].

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

IRB of University of Illinois Chicago gave ethical approval for this work.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

Tables were reorganized to emphasize important mechanistic targets and symptom outcomes; other outcomes were moved to the supplement. The discussion section was strengthened.

Data Availability

Data used in the preparation of this manuscript will be submitted to the National Institute of Mental Health (NIMH) Data Archive (NDA). NDA is a collaborative informatics system created by the National Institutes of Health to provide a national resource to support and accelerate research in mental health. Those wishing to use this data can contact the corresponding author for the dataset identifier and make a request to the NIMH (visit https://nda.nih.gov/). This manuscript reflects the views of the authors and may not reflect the opinions or views of the NIH.