ABSTRACT
Background Computerized cognitive training (CCT) is a broad category of drill-and-practice interventions that aim to maintain cognitive performance in older adults. Despite a supportive evidence base for general efficacy, it is unclear what types of CCT are most likely to be beneficial and what intervention design factors are essential for clinical implementation.
Methods We searched MEDLINE, Embase, and PsycINFO to August 2019 for randomized controlled trials (RCTs) of any type of CCT in cognitively healthy older adults. Risk of bias within studies was assessed using the Cochrane Risk of Bias 2 tool. The primary outcome was change in overall cognitive performance between CCT and control groups. Secondary outcomes were individual cognitive domains. A series of meta-regressions was performed to estimate associations between key design factors and overall efficacy using robust variance estimation models. Network meta-analysis was used to compare the main approaches to CCT against passive or common active control conditions.
Results Ninety RCTs encompassing 7219 participants across 117 comparisons were included. The overall cognitive effect size across all trials was small (g=0.18, 95% CI 0.14 to 0.23) with considerable heterogeneity (τ²=0.074, 95% prediction interval −0.36 to 0.73), and was robust to small-study effects and risk of bias. Effect sizes for individual cognitive domains were small, heterogeneous and statistically significant apart from fluid intelligence and visual processing. Meta-regressions revealed significantly larger effect sizes in trials using supervised training or training up to three times per week. Multidomain training was the most efficacious CCT approach against any type of control, with greater benefits in a subset of supervised training studies.
Conclusions The efficacy of CCT varies substantially across designs, independent of the type of control. Multidomain supervised CCT appears to be the most efficacious approach, and should be developed to accommodate individual needs and remote delivery settings. Future research should focus on identifying the intervention components and regimens that could attenuate aging-related cognitive decline.
INTRODUCTION
While cognitive decline is a highly common aspect of normal aging, interventions that can support cognitive function in older adults may have far-reaching health and societal implications, including delaying or preventing insidious progress towards mild cognitive impairment and dementia.1 In an evidence review commissioned by the National Institute on Aging, the National Academy of Medicine2 defined cognitive training as one of the three highest priority areas for prevention research, along with physical activity and blood pressure management. The World Health Organization guidelines for prevention of cognitive decline and dementia3 were similarly supportive of cognitive training, albeit based on low-quality evidence. Yet these conclusions were drawn from an array of cognitive training interventions compared to various control conditions, leaving no guidance on how the recommendations might be implemented.
Computerized cognitive training (CCT) is a highly common cognitive training approach, based on repeated and controlled practice on exercises that target specific cognitive processes. CCT can be adapted to individual needs, is inherently safe and can be delivered inexpensively at scale in various clinical and community settings. About a dozen meta-analyses have investigated the efficacy of CCT in healthy older adults, generally reporting benefits for overall cognition.4 The largest to date, encompassing 51 randomized controlled trials (RCTs),5 found that efficacy could be moderated by delivery settings and session frequency but did not find differences across types of CCT and control conditions. Regardless, legitimate concerns regarding the overall quality of evidence and variability of methods in the field, as well as misleading marketing practices by the “brain training” industry, have driven skepticism towards CCT.6 Lack of clarity regarding which CCT approaches might be beneficial is therefore a clear impediment to translating the recommendations into practice. Thus, we aimed to update and extend the findings of our previous systematic review of the field,4 with a particular focus on comparing the main CCT methods to the most common control conditions.
METHODS
We followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) statement7,8 and prospectively registered the protocol with PROSPERO (CRD42018114891). Eligibility criteria and search strategy follow our previous systematic review of the same topic.5
Eligibility criteria
We considered randomized trials comparing change from baseline to post-training in one or more cognitive measures between CCT and control conditions in cognitively healthy older adults. CCT was defined as ≥4 h of practice on standardized computerized tasks or video games with a clear cognitive rationale, administered on personal computers, mobile devices, or gaming consoles. Eligible controls included wait lists, alternative cognitive activities (e.g., psychoeducation) or sham conditions (e.g., low-level practice). Combinations of CCT with other interventions (e.g., physical exercise) were included if controls received the same adjacent intervention. When combined interventions were compared to passive control, trials were included if CCT comprised at least 50% of intervention time. Outcome measures that closely resembled one or more of the trained tasks were excluded.
Information sources and study selection
We searched MEDLINE, Embase, and PsycINFO using the search terms “cognitive training” OR “brain training” OR “memory training” OR “attention training” OR “reasoning training” OR “computerized training” OR “computer training” OR “video game” OR “computer game”. No search or language limits were applied. The first search covered the period from inception to July 2014.5 Search updates were run in November 2015, February 2018 and August 2019. In each update, two or more independent reviewers performed abstract screening and assessment of full-text articles against the inclusion criteria. A senior reviewer [AL, HMG or GP] was responsible for consolidation of eligibility assessments and resolution of disagreements among reviewers. The final set of included studies was reviewed and approved by AL.
Data items and coding
Since CCT studies typically report multiple outcome measures, all eligible measures were collected. Efficacy data were collected as mean and standard deviation (SD) for each group at each time point, or assessed using measures of change (e.g., pre-post mean and SD of change within groups). We contacted authors when reports provided insufficient data to calculate an effect size or when data for certain outcome measures were not reported. In multi-arm studies, all eligible arms were included (for a list of included arms from each study, see Table 1). Definitions of contrasts for the NMA occasionally differed from the pairwise meta-analyses, especially in multi-arm trials to reflect all available comparisons. Coding CCT and active control conditions into specific types was done based on the content of the intervention. Coding of outcome measures into specific cognitive domains was done based on the Cattell-Horn-Carroll-Miyake framework.9
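As an illustration of how group-level means and SDs of this kind translate into a standardized effect size, the following is a minimal sketch of Hedges' g (the metric named in the statistical analysis) with the usual small-sample correction. The function name and the input values are hypothetical, and the sketch assumes each group is summarized by a mean change score, its SD and its sample size; it is not the exact extraction routine used in this review.

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, variance of g, 95% CI) for a training vs control contrast."""
    df = n_t + n_c - 2
    # Pooled standard deviation across the two groups
    sd_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / sd_pooled
    # Large-sample variance of the uncorrected standardized mean difference
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    # Small-sample correction factor J
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    var_g = j**2 * var_d
    half_width = 1.96 * math.sqrt(var_g)
    return g, var_g, (g - half_width, g + half_width)

# Hypothetical change scores: CCT improves by 2.0 points, control by 1.0 (SD 4.0, n = 30 each)
g, var_g, ci = hedges_g(2.0, 4.0, 30, 1.0, 4.0, 30)
print(round(g, 3), round(var_g, 4), tuple(round(x, 2) for x in ci))
```

A positive g here means the CCT group changed more than the control group; outcomes where lower scores are better (e.g., reaction times) would need their sign flipped before pooling.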
Risk of bias within studies
We used the 2019 Cochrane Risk of Bias 2 tool10 (RoB2) to assess risk of bias across five domains (randomization process, deviations from intended interventions, missing outcome data, measurement of the outcome, selection of the reported result). In addition, each study received an overall RoB assessment of high, low or some concerns. In contrast to the original RoB2 macros, studies that did not report assessor blinding or intention-to-treat results were coded as high risk of bias regardless of other RoB2 items.
Statistical analysis
Analyses were conducted using the R packages robumeta,11 clubSandwich12 and netmeta.13 The primary outcome was overall cognition, defined as a composite of all eligible outcomes reported in each trial.5 Secondary outcomes were individual cognitive domains. Between-group differences in each outcome measure were converted to standardized mean differences and calculated as Hedges’ g with 95% CI. Pairwise analyses were performed using robust variance estimation14 (RVE) with robumeta, based on a correlational dependence model with r=0.8. Heterogeneity across studies was quantified using τ² and expressed as a proportion of overall observed variance using the I² statistic.15 Prediction intervals were calculated to assess the dispersion of true effects across studies.16 RVE meta-regressions based on prespecified categorical moderators were performed using robumeta and formally tested for between-group differences based on the F-statistic using clubSandwich. Small-study effect for the primary outcome was investigated by visually inspecting funnel plots of effect size vs standard error and formally tested using Egger’s test as a meta-regression in RVE.17,18 Second, random-effects network meta-analysis of the primary and secondary outcomes was performed within a frequentist framework using netmeta. Network geometry was summarized in a network graph and league tables were created to display the relative effect sizes of all available comparisons. Ranking of treatments was estimated using P-scores, representing the extent of certainty that an intervention is more effective than another intervention.19 Higher P-scores represent a higher likelihood that a given intervention is more effective. To examine the transitivity assumption, we created a table summarizing potential effect modifiers (design characteristics and risk of bias) to explore whether these were similarly distributed across the different comparisons.
Sensitivity analyses of the primary outcome were conducted for subsets of supervised and home-based training studies.
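The P-score ranking described above can be sketched as follows. A P-score for a treatment is the average, over all competing treatments, of the one-sided normal probability that it is the better of the pair. The function name, the treatment labels A to C, and all numbers are hypothetical; the sketch assumes higher effect sizes are desirable and that the standard error of each pairwise difference is already available from the network model. It illustrates the idea and is not the netmeta implementation.

```python
from statistics import NormalDist

def p_scores(effects, se_diff):
    """Compute P-scores from network estimates.

    effects: dict mapping treatment -> estimated effect vs a common reference
             (higher values assumed to be better).
    se_diff: dict mapping frozenset({a, b}) -> standard error of the
             estimated difference between treatments a and b.
    """
    phi = NormalDist().cdf
    scores = {}
    for a in effects:
        # Probability that treatment a is better than each competitor b
        probs = [
            phi((effects[a] - effects[b]) / se_diff[frozenset((a, b))])
            for b in effects
            if b != a
        ]
        scores[a] = sum(probs) / len(probs)
    return scores

# Hypothetical three-treatment network
effects = {"A": 0.30, "B": 0.15, "C": 0.00}
se_diff = {
    frozenset(("A", "B")): 0.08,
    frozenset(("A", "C")): 0.06,
    frozenset(("B", "C")): 0.07,
}
for name, score in sorted(p_scores(effects, se_diff).items(), key=lambda kv: -kv[1]):
    print(name, round(score, 2))
```

Because each pair contributes two probabilities that sum to one, the mean P-score across treatments is always 0.5, which is a convenient sanity check on any implementation.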
RESULTS
After accounting for duplicates within and across searches, we screened 14,361 unique titles, of which 762 full-text articles were assessed against the inclusion criteria, resulting in 90 eligible RCTs encompassing 7219 participants (Figure 1). Fourteen RCTs were identified from manual searches and four potentially eligible RCTs were excluded because the reports did not provide sufficient data for analysis and authors did not provide data following our requests.
Characteristics of included studies
Key study characteristics are reported in Table 1. The 90 RCTs included 117 eligible CCT and control arms. The most common type of CCT was multidomain training (n=36 RCTs), followed by working memory training (n=21) and attention/dual task CCT (n=10). Fifty-nine trials (66%) compared CCT to at least one active control condition, of which 11 included an additional passive control group, and 7 trials included more than one CCT arm (Figure 2). Overall risk of bias was assessed as low in 21 trials, high in 36, and 29 had some concerns (Table 1).
Primary outcome: Overall cognition
The pooled overall effect size across all 90 RCTs (1211 effect sizes, median 10 effect sizes per study) was small (g=0.18, 95% CI 0.14 to 0.23) with considerable heterogeneity (τ²=0.074, I²=58%). The 95% prediction interval indicated high variability in overall effect sizes across settings (−0.36 to 0.73). There was no evidence for small-study effect (β=0.35, one-tailed p=0.117, Figure 3). There was no evidence for difference across levels of risk of bias (F(2, 64.8)=0.391, p=0.678).
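To make the relationship between the pooled estimate, its confidence interval and the prediction interval concrete, the sketch below approximates a 95% prediction interval from the summary values reported above. It assumes the standard error can be back-calculated from a normal-based confidence interval and uses a normal quantile rather than the t-based quantile and small-sample corrections an RVE model would apply, so it is an approximation and not a reproduction of the published calculation. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def prediction_interval(g, ci_low, ci_high, tau2, level=0.95):
    """Approximate prediction interval for the true effect in a new setting."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Back-calculate the standard error of the pooled estimate from its CI
    se = (ci_high - ci_low) / (2 * z)
    # Combine between-study variance with uncertainty in the pooled mean
    half_width = z * math.sqrt(tau2 + se**2)
    return g - half_width, g + half_width

low, high = prediction_interval(g=0.18, ci_low=0.14, ci_high=0.23, tau2=0.074)
print(round(low, 2), round(high, 2))
```

With these rounded inputs the sketch gives roughly −0.36 to 0.72, close to the reported interval; small differences are expected given the normal approximation and rounding of the inputs. The interval is dominated by τ² rather than by the standard error of the mean, which is why it is so much wider than the confidence interval.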
Meta-regressions
Results of meta-regressions for the primary outcome are provided in Table 2. The pooled effect size was significantly larger for supervised vs home-based training (F(1, 68.5)=5.8, p=0.019) and for training 1-3 times per week vs more frequent regimens (F(1, 71.2)=4.9, p=0.029). Session length, treatment duration and total hours of training were not associated with overall cognitive effect size. Compared to studies that used supervised training, home-based training studies tended to provide more frequent (t(74.8)=8.82, p<0.001) and shorter sessions (t(87.5)=−3.67, p<0.001), as well as more hours of training (t(55.4)=2.19, p=0.032). Multiple meta-regressions did not find interactions between delivery mode and any dosing factor.
Secondary outcomes: Individual cognitive domains
Meta-analyses of individual cognitive domains are provided in Table 3. Effect sizes across the six domains were generally small and heterogeneous. There was no evidence for benefit on fluid intelligence, and the pooled estimate for visual processing did not reach statistical significance.
Network meta-analysis: Primary outcome
The 90 RCTs provided 131 pairwise comparisons across 13 CCT or control conditions, resulting in a well-connected network structure (Figure 2). Direct evidence was available for 32 comparisons, most notably multidomain vs no contact (20 RCTs), multidomain vs CS/Education (19 RCTs) and working memory training vs sham (14 RCTs). There was evidence for inconsistency for four comparisons; the direct effect size was larger than the indirect estimate for multidomain vs no contact and speed vs casual computer games, and smaller for speed vs no contact (Table 4).
Across all trials, multidomain training ranked highest for efficacy on overall cognition, with small and statistically significant effect sizes over and above passive control (g=0.21, 95% CI 0.12 to 0.30) and all active control conditions apart from physical exercise (Table 4). Processing speed training was ranked second with similar but slightly smaller estimates. Working memory training was better than all control conditions apart from cognitive stimulation. There was no evidence for cognitive benefit of any active control condition over and above no contact control.
When separating supervised and home-based training studies, only multidomain training was found to be more efficacious than passive control (g=0.30, 95% CI 0.18 to 0.41) and CS/Education (g=0.25, 95% CI 0.14 to 0.36). There was no evidence of benefit for any home-based condition. Finally, an RVE analysis of the 21 RCTs that used supervised multidomain CCT revealed a similar estimate (g=0.30, 95% CI 0.20 to 0.40), with about half the heterogeneity of the full model (τ²=0.037, I²=36%). Of these, 19 provided training up to 3 times per week, resulting in nearly identical estimates (g=0.30, 95% CI 0.19 to 0.41, τ²=0.038, I²=38%).
Secondary outcomes
Network meta-analysis rankings for individual domains are presented in Figure 4. The CCT types that ranked highest and showed statistically significant benefits were multidomain (g=0.24, 95% CI 0.10 to 0.38) and working memory training (g=0.22, 95% CI 0.06 to 0.38) for executive functions, speed (g=0.36, 95% CI 0.08 to 0.65) and multidomain (g=0.26, 95% CI 0.05 to 0.46) for long-term memory and retrieval, speed (g=0.61, 95% CI 0.38 to 0.83) and multidomain (g=0.36, 95% CI 0.18 to 0.54) for processing speed, and attention/dual task (g=0.46, 95% CI 0.21 to 0.72) and multidomain (g=0.19, 95% CI 0.05 to 0.32) for general short-term memory. Analyses of fluid intelligence and visual processing did not identify statistically significant benefits for any CCT type.
DISCUSSION
This multivariate and network meta-analysis of 90 RCTs has confirmed the efficacy of CCT and narrowed down the conditions in which CCT can result in cognitive benefits in healthy older adults. Our results suggest that multidomain CCT is the most sensible approach to improving global and domain-specific cognitive performance in this population, and that these effects can be further augmented by implementing supervised settings across up to three weekly sessions. However, trials that used supervised CCT were also more likely to provide less frequent sessions, and therefore it was not possible to examine whether these two factors are independent effect modifiers.
Furthermore, comparisons of common active control conditions (general cognitive stimulation, sham CCT and computer games) do not point to a benefit over and above no-contact (passive) control, suggesting that these are ineffectual not only as stand-alone interventions, but also as a means to control for non-specific (‘placebo’) effects in CCT trials apart from those associated with repeated testing.
The effect size estimates for the primary outcome as well as lack of evidence for small-study effect or association with risk of bias are consistent with our previous meta-analysis,5 and so is the role of supervision and session frequency as effect modifiers. The current meta-analysis included 40 additional RCTs that meet the same stringent eligibility criteria and used more efficient methods to handle dependent effect sizes in meta-regressions. Thus, these findings are likely to be robust and substantially increase the certainty in the RCT evidence for the general efficacy of CCT.
Given the wealth of RCTs, reasonable certainty in the evidence and lack of evidence to support the efficacy of active or passive control conditions, clinical equipoise assumptions in CCT trials are becoming increasingly difficult to justify. That is, the new knowledge gained from clinical trials comparing CCT programs to inert control may not necessarily be substantial enough to justify withholding a potentially effective intervention from older participants.20 At the same time, the opportunity cost of testing relatively basic efficacy hypotheses (‘does CCT work?’) instead of addressing clinical implementation challenges in the field is increasing. Specifically, testing the effectiveness of CCT as a means to prevent cognitive decline at scale requires research that focuses on maintaining engagement over time, providing remote supervision akin to that of center-based training and personalizing CCT, among other priorities. These would require larger and longer studies than those typically conducted in the field but would allow investigations of novel approaches compared to existing CCT programs.
Our network meta-analysis highlights the importance of multidomain training as the CCT approach most likely to be beneficial. Several meta-analyses of single domain training have shown that training gains tend to be most pronounced within the domains targeted by the program.21,22 Lack of robust evidence for gains in untrained domains is often cited as a limitation of CCT but in fact this is simply the reality of most interventions targeting a specific physiological or mental process. From a clinical implementation perspective, targeted CCT may be clinically relevant for rehabilitation of specific cognitive deficits, such as a recent FDA-approved attention training program for children with attention-deficit/hyperactivity disorder.23 However, given that age-related cognitive decline affects multiple domains and is typically measured using global cognitive batteries, clinicians and researchers should expect greater generalizability from multidomain CCT. Several methods for adapting training content to individual cognitive profiles have been developed and are currently undergoing clinical trials, especially within multicomponent dementia prevention trials.24 What methods can increase the efficacy of CCT and, importantly, whether these can slow down cognitive decline remain to be investigated in future trials and with more specific meta-analysis methods.
Limitations
To the best of our knowledge, this is the largest systematic review of CCT in older adults to date, and the first to perform a network meta-analysis. Since trials tend to report multiple outcome measures, we used RVE analyses to account for non-independence of effect sizes within studies. Whereas this is an efficient approach that allowed us to detect more heterogeneity and increase the power of meta-regression analyses, a limitation of RVE is that it can model dependence due to nesting of multiple outcome measures (correlational model) or groups within studies (hierarchical model), but not both. We used a correlational model as nearly all studies reported multiple outcome measures while only 23 reported more than two eligible arms. Consequently, estimations of weights and heterogeneity did not account for possible differences between subgroups, which may have affected the efficiency of the model specifications.14 A method for combining the two working models has been very recently proposed25 and may counter this problem in future meta-analyses.
Accounting for dependency in a network meta-analysis is even more challenging,26 and we are not aware of viable solutions for generating arm-level composites at this stage. To limit the effect of this problem on our estimates we combined effect sizes within arms into a single estimate using the same correlation-based method27 used in our previous univariate meta-analysis of CCT.5 Despite applying a large coefficient (r=0.8, as in the RVE model) for estimated correlated variance within arms, there was no detectable heterogeneity in the full network, limiting the precision of our estimates. Nevertheless, imprecision of the main analysis was not substantially greater than the RVE estimates, and arguably less prone to bias compared to selection of outcome measures.
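The arm-level aggregation described above can be illustrated with a common correlation-based composite, in which the combined effect is the mean of the individual effect sizes and its variance accounts for an assumed correlation r between outcomes. Whether this matches the cited method term for term is an assumption; the function name and input values are hypothetical.

```python
import math

def composite_effect(effects, variances, r=0.8):
    """Combine correlated effect sizes from one arm into a single estimate.

    effects:   effect sizes (e.g., Hedges' g) for each outcome in the arm
    variances: sampling variance of each effect size
    r:         assumed correlation between any two outcomes
    """
    m = len(effects)
    mean_effect = sum(effects) / m
    # Sum of variances plus all pairwise covariance terms r * sqrt(v_i * v_j)
    total = sum(variances)
    for i in range(m):
        for j in range(m):
            if i != j:
                total += r * math.sqrt(variances[i] * variances[j])
    return mean_effect, total / m**2

# Hypothetical arm with two outcomes
g, v = composite_effect([0.10, 0.30], [0.04, 0.04], r=0.8)
print(round(g, 2), round(v, 3))
```

A larger assumed r yields a larger composite variance: at r=1 averaging gives no gain in precision over a single outcome, whereas at r=0 the variance is divided by the number of outcomes. Choosing a high value such as 0.8 is therefore the more conservative option.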
Finally, we did not include data beyond post-training assessments for two main reasons. First, including such data would introduce selection bias, as the majority of studies implemented relatively short training periods and did not report long-term outcomes, with considerable variation in the number and length of follow-ups. Second, while some residual gains can still be apparent several months and perhaps up to a year after a course of CCT, effects are expected to wane over time without further training.28,29 There is therefore limited value in testing hypotheses related to effect maintenance as the null is the most likely outcome, especially as effects are measured further away from training cessation. Given that the ultimate goal of CCT is to delay cognitive decline, implementing and optimizing long-term booster schedules is key to maintaining CCT effects and should be prioritized over more conventional trial designs that focus on modelling the gradual waning of cognitive gains.
CONCLUSIONS
CCT is efficacious for overall and domain-specific cognitive performance in healthy older adults, but effects vary across key intervention design factors. Greater efficacy should be expected from multidomain CCT, applied up to 3 times per week, and provided in supervised settings. Future trials should avoid inert control conditions whenever possible and focus on optimizing training protocols, specifically in home settings. Research synthesis efforts can move away from investigating mere efficacy and focus on detecting more specific intervention components and individual predictors of training response.
Data Availability
All underlying data are available from the corresponding author upon request.