Imagination exercises improve language in younger but not in older children with autism suggesting a strong critical period

Based on a few cases of childhood traumatic aphasia and hemispherectomy, Lenneberg suggested a critical period for first language acquisition. Lenneberg's ideas are still debated today and have never been tested in a large group of children. Autism Spectrum Disorder (ASD) provides such an opportunity, as language acquisition is a common problem in ASD. Here, we report data from a three-year-long observational trial of 8,766 children with ASD. We initiated this trial in order to test two hypotheses: 1) voluntary imagination is central to complex language and 2) there exists a strong critical period for voluntary imagination acquisition closing shortly after the age of 5. Accordingly, we developed imagination exercises (verbal and nonverbal), organized them into an application called Mental Imagery Therapy for Autism (MITA), and provided this MITA app gratis to children ages 2 to 12. MITA-experienced children (N=3,540) were matched to the 'treatment-as-usual' participants (TaU, N=5,226). Both younger (2-5 years of age) and older children (5-12 YOA) in the MITA and TaU groups improved their language over time, but on an annualized basis, younger MITA children improved their language 3-fold faster than the TaU group. There was no difference between MITA and TaU in the older children group. These findings support Lenneberg's critical period hypothesis and indicate that acquisition of voluntary imagination is essential for full language acquisition. Crucially, our results imply that the underlying plasticity dramatically diminishes after the age of five and therefore even greater therapeutic intervention should target the very first years of a child's life. Clinical Trial Registration ID #NCT02708290.

than children with more severe ASD. There was no difference in improvement between females vs. males in any subscale. Children from non-English-speaking countries (primarily Romance-speaking countries) improved more than children from English-speaking countries in all four subscales.
In this report we apply the same framework to study the MITA voluntary imagination intervention in children ages 2 to 12 years. The data collected over three and a half years show greater language improvement in MITA-experienced children compared to matched 'treatment as usual' controls. Crucially, this difference is observed only in children of 2 to 5 years of age implying that the underlying plasticity dramatically diminishes after the age of five and suggesting a strong critical period for voluntary imagination acquisition.

Results
We first sought to replicate our previous results 26 using the new and significantly larger databases. The analysis of groups within the TaU database confirmed the results reported in 2018. There was no difference between females vs. males in any subscale. Younger children improved more than the older children in the Language, Sociability, and Cognitive awareness subscales (Tables S1, S2). Children with milder ASD improved more than children with more severe ASD in the Language subscale (Tables S5, S6). Children from non-English-speaking countries (primarily Romance-speaking countries) improved more than children from English-speaking countries in all four subscales (Tables S9, S10). The analysis of groups within the MITA database was consistent with analysis of groups within the TaU database. There was no difference between females vs. males in any subscale. Younger children improved more than the older children in the Language subscale (Tables S3, S4). Children with milder ASD improved more than children with more severe ASD in the Language subscale (Tables S7, S8). We did not have enough participants from non-English-speaking countries in the MITA database for statistical analysis.

medRxiv preprint doi: https://doi.org/10.1101/2019.12.10.19014159. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Having demonstrated continuity with respect to group differences within each database, we applied the same statistical framework to study the difference between the MITA group and the TaU group. Both younger (2-5 years of age) and older (5-12 years of age) children in the MITA and TaU groups improved their language over time, but on an annualized basis, younger MITA children improved their language 3-fold faster than the TaU group (Figure 1A). There was no difference between MITA and TaU in the older children group (Figure 1B).

Discussion
In this report, we described data from an observational trial of tablet-based imagination exercises (Mental Imagery Therapy for Autism, or MITA 22) that included 3,540 children with ASD who worked with MITA for a median duration of 520 (IQR: 384-706) days and 5,226 treatment-as-usual (TaU) children. This is the longest-running and the largest study of a caregiver-administered early intervention tool for young children with ASD. Both younger (2-5 years of age) and older children (5-12 YOA) in the MITA and TaU groups improved their language over time, but on an annualized basis, younger MITA children improved their language 3-fold faster than the TaU group, Figure 1A. There was no difference between MITA and TaU in the older children group, Figure 1B.
Greater language improvement in the younger but not older MITA children compared to TaU supports Lenneberg's critical period hypothesis 28 independent of the exercises' exact mode of action. It is possible that imagination exercises directly trained neural networks essential for language 8 . It is also possible that MITA group caregivers were simply more motivated in administering language therapy in general. Additionally, MITA caregivers could learn language therapy techniques, such as "put the cup {on/under/behind} the table," from MITA and then extend those techniques to everyday activities, multiplying the effect of the exercises many-fold. The exact mechanism of MITA action cannot be identified in an observational trial. Whatever the mechanism, it worked only up to the age of five, consistent with a strong critical period.
Arguably, this is the most important conclusion of this study.
One of the reasons for the high number of low-functioning ASD individuals is a deep misunderstanding of the critical period. It is not uncommon for parents to brush off their child's language delay until elementary school, at which time, according to the data presented in this manuscript, it may be too late. While clinicians are usually aware of the critical period and normally recommend early intervention at the time of diagnosis, they are often reluctant to emphasize the urgent nature of the problem to the parents due to a complete lack of longitudinal studies comparing a language intervention targeting vulnerable children of various ages 29 . To the best of our knowledge, this is the first study administering the same language intervention to a large group of ASD children over three years. A better understanding of the strong critical period will result in greater effort toward language therapy in very young children and eventually in many more high-functioning productive lives.
Our second hypothesis was that voluntary imagination exercises can improve language ability. In fact, many techniques used by speech-language pathologists (SLPs) and Applied Behavior Analysis (ABA) therapists happen to aim at improving voluntary imagination. SLPs commonly refer to these techniques as "combining adjectives, location/orientation, color, and size with nouns," "following directions with increasing complexity," and "building the multiple features/clauses in the sentence" 30 . In ABA jargon, these techniques are known as "visual-visual and auditory-visual conditional discrimination" [31][32][33][34] , "development of multi-cue responsivity" 12 , and "reduction of stimulus overselectivity" 13 .

The Language subscale of ATEC primarily assesses expressive language that depends on voluntary imagination, but can be influenced by many other factors as well. It may take several years before an initially nonverbal or minimally verbal child expresses his/her voluntary imagination skills through language. We conclude that the direct or indirect effect of imagination exercises on language cannot be excluded and in fact could be the most parsimonious explanation for the observed results.

Limitations
The observational design of this study cannot definitively prove causality since unknown confounders may influence the study results. The gold standard for testing a novel clinical intervention is a randomized controlled trial (RCT). Prior to conducting the MITA study, we ran the proposal for a therapist-administered RCT of a voluntary imagination intervention through many potential funders and collaborators. The proposal failed to gain traction. We also considered a caregiver-administered RCT, but decided against it due to the high attrition rate.
The only published RCT of caregiver-administered tablet-based therapy for young children with ASD reported an overwhelming drop in app use after just 3 months despite biweekly telephone calls to encourage app use 35 : during the first 3-month period, participants exercised for a total median time of 1,593 minutes (just under the recommended target of 20 min/day or 1,800 min/3-month period); during the second 3-month period, participants exercised for a total median time of 23 minutes (98.6% drop in app use). In effect, most participants did not receive any intervention after the first 3-month period and therefore were lost for the RCT 35 . As the minimal length of a voluntary imagination intervention RCT is likely to exceed two years 36,37 , participant dropout becomes the major issue. This high attrition rate introduces multiple selection biases that degrade the ability of an RCT to demonstrate causality and essentially make it no better than an observational trial. In this study, we used propensity score analysis 38 to identify comparable individuals based on age and all four evaluation subscales. For each participant in the MITA group, a match was found by choosing the control observation with the closest propensity score. Propensity score matching does not completely address selection bias, but it significantly improves group similarity.
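The matching step described above can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated (the estimation model is not described in this section) and that matching is done with replacement, which the text does not specify. The function name, participant IDs, and scores are hypothetical.

```python
from bisect import bisect_left

def match_nearest(treated, controls):
    """Pair each treated ID with the control whose propensity score is closest.

    treated, controls: dicts mapping participant ID -> propensity score.
    Assumption: matching is with replacement, so one control may be reused.
    """
    # Sort controls by score so the nearest one can be found by binary search.
    ordered = sorted(controls.items(), key=lambda kv: kv[1])
    scores = [score for _, score in ordered]
    matches = {}
    for pid, score in treated.items():
        i = bisect_left(scores, score)
        # The nearest control is one of the two neighbours of the insertion point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(scores)]
        best = min(candidates, key=lambda j: abs(scores[j] - score))
        matches[pid] = ordered[best][0]
    return matches

# Hypothetical scores for two MITA participants and four TaU participants.
mita = {"m1": 0.31, "m2": 0.72}
tau = {"t1": 0.10, "t2": 0.35, "t3": 0.70, "t4": 0.90}
pairs = match_nearest(mita, tau)
```

Here `m1` is paired with `t2` and `m2` with `t3`, the controls with the smallest absolute score difference. Matching without replacement or with a caliper would require a different procedure.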
Another disadvantage of low-cost geographically diverse observational trials is their reliance on parent-reported outcome measures. There is an understanding in the psychological community that parents cannot be trusted with an evaluation of their own children. In fact, parents often yield to wishful thinking and overestimate their children's abilities on a single assessment 39 .
However, the pattern of changes can be generated by measuring the score dynamics over multiple assessments. When a single parent completes the same evaluation every three months over multiple years, changes in the score become meaningful. In this trial we used a

Conclusions
Five major conclusions follow from this study. First, the plasticity essential for acquisition of first language may reduce significantly after the age of five years, consistent with Lenneberg's critical period hypothesis 28 . Second, voluntary imagination is an essential component of full language and imagination exercises may be an indispensable component of language therapy. Third, the minimal necessary duration of a clinical trial investigating the effect of imagination exercises could exceed two years. Fourth, some caregivers are capable of administering tablet-based exercises to their children consistently over many years. Fifth, parent-administered parent-reported multiyear observational trials can be an attractive low-cost model for studying novel language, behavioral, and dietary interventions. The significant improvement of language observed in the current trial brings hope to many families and inspires us to continue developing imagination exercises and to translate MITA into multiple languages. The major strength of this study is the large number of long-term participants. The most obvious limitation is that its observational design cannot definitively prove causality since not all confounders can be adjusted for appropriately.

MITA exercises
MITA includes both verbal and nonverbal exercises aiming to develop voluntary imagination ability 3 . MITA verbal activities use higher forms of language, such as noun-adjective combinations, spatial prepositions, recursion, and syntax 4 to train voluntary imagination: e.g., a child can be instructed to put the large red dog behind the orange chair, Figure 2A; or identify the wet animal after the lion was showered by the monkey; or take animals home following an explanation that the lion lives above the monkey and under the cow, Figure 2B. In every activity a child listens to a short story, then works within an immersive interface to generate an answer; correct answers are rewarded. To avoid routinization, all instructions are generated dynamically from individual words. Collectively, verbal activities have over 10 million different instructions, so most instructions will never be heard twice by a child. MITA nonverbal activities aim to provide the same voluntary imagination training visually through implicit instructions 5 . For example, a child can be presented with two separate images of a train and a window pattern, and a choice of complete trains. The task is to find the correct complete train. The child is encouraged to avoid trial-and-error and integrate separate train parts mentally, thus training voluntary imagination, Figure 3A. Different games use various tasks and visual patterns to keep a child engaged, Figure 3B. Most puzzles are assembled dynamically from multiple pieces so that they never repeat themselves. Collectively, MITA activities are designed to last for approximately 10 years.
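To illustrate how assembling instructions from individual words quickly yields a very large instruction space, here is a minimal sketch. The word lists and the sentence template are hypothetical and deliberately tiny; they are not the MITA vocabulary or its generation logic, which are not given in the text.

```python
import random
from math import prod

# Hypothetical word lists, one per slot of the template below.
VOCAB = {
    "size": ["large", "small"],
    "color": ["red", "orange", "blue"],
    "animal": ["dog", "lion", "monkey"],
    "preposition": ["behind", "under", "on"],
    "object_color": ["orange", "green"],
    "object": ["chair", "table"],
}

TEMPLATE = "Put the {size} {color} {animal} {preposition} the {object_color} {object}."

def instruction_count(vocab):
    """Number of distinct instructions: the product of the slot sizes."""
    return prod(len(words) for words in vocab.values())

def random_instruction(vocab, rng):
    """Fill every slot of the template with an independently chosen word."""
    choice = {slot: rng.choice(words) for slot, words in vocab.items()}
    return TEMPLATE.format(**choice)

total = instruction_count(VOCAB)  # 2 * 3 * 3 * 3 * 2 * 2 = 216
example = random_instruction(VOCAB, random.Random(0))
```

Even this toy vocabulary of 15 words gives 216 instructions. Because the count is a product of slot sizes, adding words or slots grows it multiplicatively, which is how a modest vocabulary can reach millions of distinct instructions.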

MITA group
The MITA app was made available gratis at all major app stores in February 2016. Once the app was downloaded, the caregiver was asked to register and to provide demographic details, including the child's diagnosis and age. Caregivers consented to anonymized data analysis and completed the Autism Treatment Evaluation Checklist (ATEC) 27 . The first evaluation was administered approximately one month after the first use of MITA and once 100 puzzles had been completed. The subsequent evaluations were administered at three-month intervals. Parents were asked to complete evaluations independently of a child's actual use of MITA.
From this pool of potential study participants, we selected participants based on the following criteria:
1) Consistency: Participants must have filled out at least three ATEC evaluations and the interval between the first and the last evaluation was six months or longer.
2) Diagnosis: The subject must have self-reported their diagnosis as ASD.
3) Maximum age: Participants older than twelve years of age were excluded from this study.
4) Minimum age: Participants who completed their first evaluation before the age of two years were excluded from this study.
5) Minimal ATEC severity: Participants with initial ATEC scores of less than 20 were excluded.
6) Language: Participants who indicated their primary language was not English were excluded from the study.
After excluding participants that did not meet these criteria, there were 3,540 total participants, Table 3.
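The six inclusion criteria can be expressed as a single filter. The sketch below is illustrative only: the record fields are hypothetical, and it assumes that the maximum-age criterion is applied to the age at the first evaluation, which the text does not state explicitly.

```python
from dataclasses import dataclass

@dataclass
class Participant:
    n_evaluations: int      # number of completed ATEC evaluations
    span_months: float      # time between the first and the last evaluation
    diagnosis: str          # diagnosis reported at registration
    age_first_eval: float   # age in years at the first evaluation
    initial_atec: int       # total ATEC score at the first evaluation
    primary_language: str

def eligible(p):
    """Return True if the participant meets all six MITA-group criteria."""
    return (
        p.n_evaluations >= 3 and p.span_months >= 6   # 1) Consistency
        and p.diagnosis == "ASD"                      # 2) Diagnosis
        and p.age_first_eval <= 12                    # 3) Maximum age (assumed at first evaluation)
        and p.age_first_eval >= 2                     # 4) Minimum age
        and p.initial_atec >= 20                      # 5) Minimal ATEC severity
        and p.primary_language == "English"           # 6) Language
    )

included = Participant(5, 14.0, "ASD", 3.5, 62, "English")
too_mild = Participant(5, 14.0, "ASD", 3.5, 15, "English")
```

With these hypothetical records, `eligible(included)` is true and `eligible(too_mild)` is false because the initial ATEC score is below 20.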

Control group
Independently from MITA, ATEC responses were collected by the Autism Institute from participants voluntarily completing online ATEC evaluations from 2013 to 2019. Little is known about their treatment, but it is unlikely that many of them used MITA. Accordingly, these participants served as a 'treatment as usual' control. Participant selection was described in detail in Ref. 26 . In short, participants were selected based on the following criteria:
1) Completeness: Participants who did not provide a date of birth (DOB) were excluded. As participants' DOB were utilized to determine age, the availability of DOB was necessary.
2) Consistency: Participants had to have completed at least three questionnaires and the interval between the first and the last evaluation was one year or longer.
3) Maximum age: Participants older than twelve years of age were excluded from this study.
As diagnosis was not part of the ATEC questionnaire, some neurotypical participants could be present in the database. To limit the contribution from neurotypical children, we excluded participants that may have represented the neurotypical population by using the Minimum age and the Minimal ATEC severity criteria.

4) Minimum age: Participants who completed their first evaluation before the age of 2 were excluded from this study, as the diagnosis of ASD in this age group is uncertain and the parents of some of these young cases may have completed the ATEC because they wanted to check whether their neurotypical child had signs of autism.
5) Minimal ATEC severity: Participants with initial ATEC scores of less than 20 were excluded.
6) Language: Participants who indicated their primary language was not English were excluded from the study.
After excluding participants that did not meet these criteria, there were 5,226 total participants.

items and scores range from 0 to 75 points. The scores from each subscale are combined in order to calculate a Total Score, which ranges from 0 to 179 points. A lower score indicates lower severity of ASD symptoms and a higher score correlates with more severe symptoms of ASD.
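The way the Total Score is combined from the subscales can be sketched as follows. Only the 75-point maximum and the 0-179 total appear in the text above. The sketch assumes the 75-point maximum belongs to the Health subscale. The 28-, 40-, and 36-point maxima for the other three subscales are taken from the published ATEC form and should be read as assumptions here.

```python
# Subscale maxima: 75 is stated in the text (assumed here to be Health);
# the other three values are assumptions taken from the published ATEC form.
ATEC_MAX = {
    "Language": 28,
    "Sociability": 40,
    "Cognitive awareness": 36,
    "Health": 75,
}

def total_score(subscales):
    """Sum the four subscale scores after checking that each is in range."""
    if set(subscales) != set(ATEC_MAX):
        raise ValueError("expected exactly the four ATEC subscales")
    for name, value in subscales.items():
        if not 0 <= value <= ATEC_MAX[name]:
            raise ValueError(f"{name} score {value} is out of range")
    return sum(subscales.values())

# The assumed maxima are consistent with the stated 0-179 range of the Total Score.
max_total = sum(ATEC_MAX.values())

# Hypothetical assessment; a lower total indicates milder symptoms.
score = total_score(
    {"Language": 20, "Sociability": 18, "Cognitive awareness": 16, "Health": 30}
)
```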

Statistical analysis
The framework for evaluation of ATEC score changes over time was explained in detail in Ref.
interval into 3-month periods. All evaluations were mapped into 3-month-long bins with the first evaluation placed in the first bin. When more than one evaluation was completed within a bin, their results were averaged to calculate a single number representing this 3-month interval. It was then hypothesized that there was a three-way interaction between an age group, Visit, and treatment. Statistically, this hypothesis was modeled by applying the Linear Model with repeated measures, where a three-way interaction term was introduced to test the hypothesis. Least squares means (LS Means) and LS Means differences were calculated for all ATEC subscales (Language, Sociability, Cognitive awareness, and Health) at all visits. Participants in the MITA group were matched to those in TaU group using propensity score analysis 38 based on age and all four ATEC subscales at baseline.
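The binning step can be sketched as below. This is an illustration under stated assumptions rather than the analysis code: evaluation times are given as days since the participant's first evaluation, and a 3-month bin is taken to be a quarter of a 365.25-day year. The function and constant names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

BIN_DAYS = 365.25 / 4  # assumed length of one 3-month bin, in days

def bin_evaluations(evaluations):
    """Map (days_since_first_evaluation, score) pairs to per-visit mean scores.

    The first evaluation (day 0) falls into visit 1. Scores that land in the
    same bin are averaged into a single value for that visit.
    """
    binned = defaultdict(list)
    for day, score in evaluations:
        visit = int(day // BIN_DAYS) + 1
        binned[visit].append(score)
    return {visit: mean(scores) for visit, scores in sorted(binned.items())}

# Hypothetical participant: two evaluations in the first bin, then one per bin.
visits = bin_evaluations([(0, 40), (30, 36), (100, 30), (200, 28)])
```

For this hypothetical participant the two evaluations in the first three months are averaged to 38 for visit 1, and the later evaluations become visits 2 and 3. The per-visit values are what a repeated-measures model with an age group by Visit by treatment interaction would then take as input.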

Informed Consent
Caregivers have consented to anonymized data analysis and publication of the results.

Compliance with Ethical Standards
Using the Department of Health and Human Services regulations found at 45 CFR 46.101(b)(1), it was determined that this research project is exempt from IRB oversight.

Data Availability
De-identified raw data from this manuscript are available from the corresponding author upon reasonable request.

Code availability statement
Code is available from the corresponding author upon reasonable request.