CARE: a novel wearable-derived feature linking circadian amplitude to human cognitive functions

Circadian rhythms play a critical role in regulating physiological and behavioral processes, with amplitude being a key parameter for their characterization. However, accurately quantifying circadian amplitude in natural settings remains a challenge, as traditional melatonin methods require lab settings and are often costly and time-consuming. Wearable devices are a promising alternative as they can collect consecutive 24-h data for multiple days. The most commonly used measure of circadian amplitude from wearable device data, relative amplitude, is subject to the masking effect of behaviors and fails to leverage the rich information in high-dimensional data, as it only uses the sum of activity counts in time windows of pre-specified lengths. Therefore, in this study, we first proposed a pipeline to derive a novel feature to characterize circadian amplitude, named circadian activity rhythm energy (CARE), which addresses the above-mentioned challenges by decomposing raw accelerometer time series data. We then validated CARE by assessing its correlation with melatonin amplitude (Pearson's r = 0.46, P = 0.007) in a dataset of 33 healthy participants. Second, we investigated its association with cognitive functions in two datasets: an adolescent dataset (Chinese SCHEDULE-A, n = 1,703) and an adult dataset (the UK Biobank dataset, n = 92,202). CARE was significantly associated with the Global Executive Composite (β = 30.86, P = 0.016) in adolescents, and with reasoning ability (OR = 0.01, P < 0.001), short-term memory (OR = 3.42, P < 0.001), and prospective memory (OR = 11.47, P < 0.001) in adults. Finally, we explored the causal relationship using Mendelian randomization analysis in the adult dataset. We identified one genetic locus with 126 SNPs associated with CARE using a genome-wide association study (GWAS), of which 109 variants were used as instrumental variables to conduct causal analysis.
The results suggested that CARE had a significant causal effect on reasoning ability (β = -59.91, P < 0.0001), short-term memory (β = 7.94, P < 0.0001), and prospective memory (β = 16.85, P < 0.0001). The findings suggested that CARE is an effective wearable-based metric of circadian amplitude with a strong genetic basis and clinical significance, and its adoption can facilitate future circadian studies and potential interventions to improve circadian rhythms and cognitive functions.


Figure 1. Workflow chart of the study.
CARE: circadian activity rhythm energy. *We also selected the significant loci associated with relative amplitude. However, only three single nucleotide polymorphisms (SNPs) were found, which was too few to conduct further causal analysis.

Proposing a pipeline to calculate a novel feature called "CARE": preprocessing the invalid and missing raw activity data; decomposing the activity data using singular spectrum analysis; extracting its sub-signals with a period of ~24 hours to remove the behavioral masking; calculating a novel feature "CARE" to quantify the circadian amplitude.

Step one, to validate the novel wearable-derived feature, CARE, with melatonin amplitude in the melatonin dataset: checking the extreme values of the saliva melatonin raw data; illustrating the melatonin secretion curve by linear interpolation and curve fitting using a bimodal skewed baseline cosine function; calculating melatonin amplitude as the difference between the maximum and minimum values of the melatonin secretion curve; determining the correlation matrix among the calculated CARE, relative amplitude, and the melatonin amplitude.

Step two, to determine the correlation of CARE with multiple cognitive functions in the adolescent and adult datasets: checking the extreme values of all analyzed variables in the adolescent and adult datasets; testing the distribution of the cognitive function variables in the two datasets; determining the correlation of CARE with multiple cognitive functions using regression models in the two datasets; determining the correlation of the relative amplitude with multiple cognitive functions using regression models in the two datasets.

Step three, to identify the causal relationship between CARE and cognitive functions in the adult dataset: identifying the significant loci associated with CARE using a genome-wide association study in the UK Biobank*; estimating the selected SNP heritability using linkage disequilibrium score regression; inferring the causal relationship between CARE and cognitive functions using Mendelian randomization analysis; examining the biological and functional mechanisms underlying CARE by performing tissue enrichment analysis.

medRxiv preprint doi: https://doi.org/10.1101/2023.04.06.23288232; this version posted April 6, 2023. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

... estimation of the sub-signals from accelerometer data (details shown in Figure 2). The accelerometer data can be considered as a composition of sub-signals with different frequencies.

In order to characterize circadian rhythms with behavioral masking effects removed, we decomposed the activity count data and extracted the sub-signals with a period of ~24 hours to represent the endogenous circadian oscillation. Among signal decomposition approaches such as Fourier and wavelet analyses, we chose singular spectrum analysis (SSA, see a detailed description in Figure 2c) in our study for its data-adaptive property [34,35] ... where $t_0$ is the starting time point of $x(t)$ and $T$ is the time span of $x(t)$. The units of $x(t)$ and energy are (activity counts)/minute and (activity counts)²/minute², respectively.
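The decomposition described above can be illustrated with a minimal sketch. It assumes a basic SSA (trajectory matrix, singular value decomposition, anti-diagonal averaging) and takes the energy share of the 24-hour signal to be the share of squared singular values carried by the second and third components, following the grouping described for Figure 2. The function names `ssa` and `care_sketch`, the hourly synthetic series, and the window length of 48 samples are illustrative assumptions, not the study's implementation.

```python
import numpy as np

def ssa(x, window):
    """Basic SSA: return singular values and one reconstructed sub-signal per component."""
    x = np.asarray(x, dtype=float)
    k = x.size - window + 1
    # Trajectory matrix (window x k): column i is the lagged vector x[i:i+window].
    traj = np.column_stack([x[i:i + window] for i in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    subsignals = []
    for i in range(s.size):
        elementary = s[i] * np.outer(u[:, i], vt[i])
        # Flip the rows so that anti-diagonals become ordinary diagonals,
        # then average each one to map the matrix back to a time series.
        flipped = elementary[::-1]
        subsignals.append([flipped.diagonal(j).mean() for j in range(-window + 1, k)])
    return s, np.array(subsignals)

def care_sketch(x, window):
    """Assumed CARE-like ratio: energy of components 2 and 3 over total energy."""
    s, subsignals = ssa(x, window)
    energy = s ** 2                      # squared singular values sum to the squared Frobenius norm
    ratio = energy[1:3].sum() / energy.sum()
    signal_24h = subsignals[1] + subsignals[2]
    return ratio, signal_24h

# Synthetic hourly example: constant base level plus a clean 24-hour oscillation.
t = np.arange(7 * 24)
x = 10 + 5 * np.sin(2 * np.pi * t / 24)
ratio, signal_24h = care_sketch(x, window=48)
```

For this noise-free series the ratio comes out close to 12.5/112.5 ≈ 0.11, the mean-square of the sinusoid divided by the total mean-square. Real accelerometer data would need preprocessing and a check that the second and third components really carry the ~24-hour period.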

In summary, CARE is calculated using the following equation:

$$\mathrm{CARE} = \frac{\lVert X_{24\mathrm{h}} \rVert_F^2}{\lVert X \rVert_F^2}$$

where $\lVert X_{24\mathrm{h}} \rVert_F^2$ is the squared Frobenius norm corresponding to the 24-hour signal, and $\lVert X \rVert_F^2$ is the squared Frobenius norm of the raw activity signal. It should be noted that CARE is expressed as a ratio ranging from 0 to 1 and is unitless. Furthermore, to effectively capture the dominant temporal scales and variability in the accelerometer data, and to obtain a reliable estimate of the autocovariance matrix feature, we required a minimum input data length of three days to calculate the CARE metric.

Figure 2. (a) Actogram showing 7 days of wrist accelerometer activity of a single participant. (b) A visual representation of the raw activity signal and the singular spectrum analysis (SSA) decomposed signals, including the base signal (i.e., the first sub-signal, indicating the non-periodic trend with the largest energy among all sub-signals), the 24-hour signal (i.e., the sum of the second and third sub-signals), and the behavioral noise signal (i.e., the sum of the sub-signals with periods < 24 hours). The raw activity signal is obtained by adding these three signals together. (c) Graphical display of the SSA algorithm. Specifically, the activity time series $X_N$ of length N can be decomposed using SSA as follows. First, we chose an appropriate window length L such that $2 \le L \le N/2$. Then, $X_N$ was transformed into a trajectory matrix T with K lagged vectors, where K = N - L + 1. The trajectory matrix T was decomposed by singular value decomposition. By grouping the eigentriples and averaging the elements of the reconstructed trajectory matrix along its anti-diagonals, we obtained the filtered time series.

Validate CARE with melatonin amplitude
We validated the novel feature CARE by examining its association with the melatonin amplitude in a dataset of 33 healthy participants aged 23-61 years, where accelerometer activity and melatonin were simultaneously collected (Table 1).
Among the 33 participants, mean CARE was 0.26 (SD = 0.05; range 0.14-0.35), mean relative amplitude was 0.86 (SD = 0.07; range 0.67-0.96), and mean melatonin amplitude was 11.44 pg/ml (SD = 6.81; range 1.02-28.27). The correlation analysis revealed that the melatonin amplitude was significantly associated only with CARE (Pearson's r = 0.46, P = 0.007; Figure 3a). On the other hand, we observed no significant association between melatonin amplitude and relative amplitude (Pearson's r = 0.24, P = 0.19; Figure 3) or relative energy of behavioral noise signals (Pearson's r = 0.07, P = 0.68; Figure 3). Moreover, we found no significant association of age or sex with melatonin amplitude (P > 0.05; Supplementary Table 1). We also found that CARE accounted for 21.16% of the total variance of melatonin amplitude, whereas age and sex accounted for only 4.3% and 0.03% of the variance, respectively (Supplementary Table 1).
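As a quick arithmetic check, the 21.16% of variance attributed to CARE equals the square of the reported Pearson correlation (0.46² = 0.2116). The sketch below assumes that the figure comes from a single-predictor linear relationship, which is an interpretation rather than something stated in the text; `pearson_r` and `variance_explained` are hypothetical helper names.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

def variance_explained(r):
    """Share of variance explained by one predictor in a simple linear fit (r squared)."""
    return r ** 2

reported_r = 0.46                       # value reported for CARE vs melatonin amplitude
share = variance_explained(reported_r)  # assumption: single-predictor reading of the 21.16%
```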

Figure 3. Scatter plots of CARE, relative amplitude, and relative energy of behavioral noise against melatonin amplitude in the melatonin study. The black lines indicate the linear regression fits, and the shaded areas represent the confidence intervals of the fitted mean values. Correlation coefficients annotated with an asterisk (*) are significant at the P < 0.05 level, and those with two asterisks (**) are significant at the P < 0.01 level.

... participants and accelerometer data are presented in Table 1, and other analyzed variables are described in Supplementary Tables 2 and 3. Mean CARE values were 0.10 (0.04) and 0.13 (0.04) for adolescents and adults, respectively. And ... Table 2). However, no significant association was ...

... the participant started wearing the accelerometer. Linear regression for processing/reaction speed, ordinal logistic regression for reasoning ability and short-term memory scores, and logistic regression for prospective memory were employed. The significance level was set at P < 0.013.

Causal effects of CARE on cognitive functions
To determine the causal relationship between CARE and cognitive functions, we conducted a GWAS of CARE followed by MR analysis in the adult sample, which contains filtered genetic data for 85,361 people.

Figure 4. Manhattan and QQ plots for CARE in the genome-wide association study in the adult sample (n = 85,361).
(a) The Manhattan plot shows the association test results (-log10 P value on the y-axis against physical autosomal location on the x-axis). The red line represents the genome-wide significance threshold (P < 5 × 10⁻⁸). Heritability estimates were calculated using the LDSC tool. (b) The QQ plot identifies a slight inflation (λGC = 1.10) in the test statistic.
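The genomic inflation factor λGC in panel (b) is conventionally the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). The sketch below assumes two-sided P values from 1-df tests and uses only the Python standard library. The function names are hypothetical, and the study's own GWAS software may compute λGC differently.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

def chi_square_1df_from_p(p):
    """Convert a two-sided P value (0 < p <= 1) to the matching 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)   # lower-tail quantile; its square is the chi-square value
    return z * z

def genomic_inflation(p_values):
    """Median observed chi-square divided by the null median (chi-square at p = 0.5)."""
    observed = median(chi_square_1df_from_p(p) for p in p_values)
    expected = chi_square_1df_from_p(0.5)
    return observed / expected

# Evenly spaced P values mimic a perfectly calibrated null distribution.
n = 1001
null_p_values = [(i + 0.5) / n for i in range(n)]
lambda_gc = genomic_inflation(null_p_values)
```

A value near 1 indicates well-calibrated test statistics. Values above 1, such as the reported 1.10, indicate inflation that may reflect polygenicity or residual confounding.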

2) Causal relationship between CARE and cognitive phenotypes
To investigate whether CARE has a causal effect on cognitive outcomes, we performed MR analyses on the significantly correlated pairs, i.e., CARE with reasoning ability, short-term memory, and prospective memory (Table 3). Of the 126 CARE-related SNPs, a total of 109 variants were used as instrumental variables to conduct the MR analyses. Using the weighted median method, we found that the genetically ...

3) Single-tissue and cross-tissue transcriptome-wide association analysis
Single-tissue enrichment analysis identified 120 unique genes associated with CARE (P < 2.9 × 10⁻⁶) in 44 GTEx tissues (Supplementary Data 2). Among them, APEH was associated with CARE in 18 ... These correlation results also emphasized shared genetic architecture between the central nervous system, the metabolic system, and the circadian system.

Figure 4 panel annotations: (a) Heritability = 11.4 (0.7)%; (b) λGC = 1.10.

In the present study, we proposed a pipeline to derive a novel wearable-based feature (i.e., CARE) to characterize circadian amplitude and applied it to identify the association between circadian amplitude and cognitive functions in two large-scale datasets with different age groups. We found a significant association between CARE and melatonin amplitude (a reliable measure of circadian amplitude), whereas relative amplitude (a commonly used measure of activity amplitude) showed no such association. The finding that CARE was associated with a multitude of cognitive outcomes in adolescents and adults demonstrated that CARE could be a clinically meaningful circadian feature derived from objective accelerometer records.

Furthermore, we identified one genetic locus with 126 SNPs associated with CARE, and provided the first direct evidence of causal relations between CARE and reasoning and memory abilities in an adult sample.

Our study found a moderate correlation between the new feature CARE and melatonin amplitude in the general population under natural settings. The use of CARE can effectively eliminate the influence of behavioral noise on the assessment of circadian amplitude from accelerometer data, as supported by the lack of significant association between behavioral noise signals and melatonin amplitude.

Notably, we calculated CARE values using accelerometer activity data of at least 3 days, making it a summary statistic across multiple days for each individual. This computation approach enhances the stability of CARE values and reduces the intra-subject variability. Nonetheless, it is important to note that CARE values may be affected by differences between accelerometer devices, which could potentially affect their comparability. Furthermore, the range of CARE values observed in the melatonin dataset is relatively narrow, which is likely due to the limited range of observed melatonin amplitude, as the maximum melatonin amplitude can be up to 84 pg/ml in healthy adults [56]. Therefore, it would be beneficial to investigate whether a broader range of melatonin amplitude leads to a wider range of CARE values in future research.

The finding that age and sex were not significantly associated with melatonin amplitude may contradict previous research showing that melatonin levels typically decrease with age starting from middle age and that women tend to have higher melatonin amplitude than men [57][58][59]. This might be due to the limited number of middle-aged and elderly participants (age > 40) and the limited sample size in our dataset. Future research should recruit larger samples, especially of elderly adults, to further examine age and sex differences.

Our study demonstrated that CARE and relative amplitude may assess different aspects of circadian rhythmicity. Although both are derived from accelerometer data, CARE was designed to reflect the strength of the core circadian clock located in the SCN, whereas relative amplitude is more of a metric to quantify the highest/lowest disruption of rest-activity rhythms. The identification of a sufficient number of CARE-associated variants enabled us to perform causal inference through MR analysis, which was not possible with relative amplitude. In addition, the SNP heritability of CARE accounted for a higher proportion of population variance than that of relative amplitude in the adult sample (UK Biobank cohort). These results further supported CARE as a clinically meaningful feature. Future research is still warranted to systematically compare and assess the relative contribution and relevance of CARE and relative amplitude, particularly in the context of health and disease.
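For contrast with CARE, relative amplitude is conventionally computed from the most active 10 consecutive hours (M10) and the least active 5 consecutive hours (L5) of an average 24-hour profile, as (M10 - L5) / (M10 + L5). The sketch below assumes an hourly average profile and circular windows. Whether the study used exactly this windowing is an assumption, and the function names are hypothetical.

```python
def rolling_means(profile, width):
    """Mean of every circular window of `width` consecutive entries in a 24-hour profile."""
    n = len(profile)
    return [
        sum(profile[(start + offset) % n] for offset in range(width)) / width
        for start in range(n)
    ]

def relative_amplitude(hourly_profile):
    """Conventional relative amplitude: (M10 - L5) / (M10 + L5).

    Undefined (division by zero) when the profile is all zeros.
    """
    m10 = max(rolling_means(hourly_profile, 10))  # most active 10 consecutive hours
    l5 = min(rolling_means(hourly_profile, 5))    # least active 5 consecutive hours
    return (m10 - l5) / (m10 + l5)

# Idealized profile: 8 hours of complete rest followed by 16 hours of constant activity.
ideal_profile = [0] * 8 + [100] * 16
ra_ideal = relative_amplitude(ideal_profile)
ra_flat = relative_amplitude([50] * 24)
```

Because only two windowed averages enter the formula, any daytime behavior that shifts M10 changes the value directly. This is the masking concern that motivates a decomposition-based measure.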

... adult groups. In adults, we observed a correlation between CARE and the problem-solving component of cognitive functions, specifically the reasoning and memory domains. However, in adolescents, this association was not present. This discrepancy may indicate the existence of a compensatory mechanism in adolescents that counterbalances the impact of circadian rhythm impairments on problem-solving ability. More research is needed to confirm this hypothesis.

In the present study, we demonstrated a causal effect of CARE on reasoning and memory abilities in adults, and we also provided evidence for the shared genetic architecture of circadian rhythms with neurological function [9]. The genome-wide significant locus was found to be associated with common ...

In conclusion, the new feature CARE, which we derived from accelerometer data, is closely related to the melatonin amplitude in natural settings and also serves as a meaningful metric for a wide range of cognitive functions in general adolescents and adults. Future studies with large sample sizes and different protocols, such as forced desynchrony settings and shift work, should be conducted to validate the new feature CARE and to confirm its causal effects on other health-related outcomes.

... reasoning ability, short-term and prospective memory were obtained from GWAS analysis after excluding all individuals used in the GWAS analysis of CARE (n = 360,885). In the MR analyses, the weighted median approach [69] was used as the main analysis due to its robustness to pleiotropy [70]. ... UTMOST is 2.9 × 10⁻⁶ for 17,290 genes.
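The weighted median approach named above can be sketched as follows. This is a minimal illustration that assumes per-SNP ratio estimates (outcome effect divided by exposure effect) with first-order inverse-variance weights and linear interpolation at the 50% point of cumulative weight. The function names and toy numbers are hypothetical, and the study's actual MR software and settings are not reproduced here.

```python
def weighted_median(values, weights):
    """Weighted median with linear interpolation at 50% of the cumulative weight."""
    if len(values) == 1:
        return values[0]
    order = sorted(range(len(values)), key=lambda i: values[i])
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    total = sum(w)
    # Cumulative weight evaluated at the midpoint of each estimate's own weight.
    cumulative, running = [], 0.0
    for wi in w:
        cumulative.append((running + 0.5 * wi) / total)
        running += wi
    below = max(i for i, c in enumerate(cumulative) if c < 0.5)
    fraction = (0.5 - cumulative[below]) / (cumulative[below + 1] - cumulative[below])
    return v[below] + (v[below + 1] - v[below]) * fraction

def mr_weighted_median(beta_exposure, beta_outcome, se_outcome):
    """Weighted median of per-SNP ratio estimates with inverse-variance weights."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    return weighted_median(ratios, weights)

# Toy data: three SNPs imply a causal effect of 2, one pleiotropic SNP implies 10.
bx = [0.1, 0.2, 0.1, 0.2]
by = [0.2, 0.4, 0.2, 2.0]
se = [0.01, 0.01, 0.01, 0.01]
estimate = mr_weighted_median(bx, by, se)
```

In the toy data the outlying SNP carries less than half of the total weight, so the estimate stays at the value implied by the other three. This is the robustness property the approach is chosen for.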

The melatonin and adolescent datasets used in the current study are not publicly available but are available from the corresponding author (Fan Jiang) on reasonable request. The UK Biobank data that support the findings of this study are available from the UK Biobank project, but restrictions apply to the availability of these data, which were used under license for the current study (application number: 57947), and so are not publicly available.