Longitudinal dynamics of mutant huntingtin and neurofilament light in Huntington's disease: the prospective HD-CSF study

The longitudinal dynamics of the most promising biofluid biomarker candidates for Huntington's disease (HD) - mutant huntingtin (mHTT) and neurofilament light (NfL) - are incompletely defined, but could help us understand the natural history of the disease and how these biomarkers might help in therapeutic development and the clinic. In an 80-participant cohort over 24 months, mHTT in cerebrospinal fluid (CSF), and NfL in CSF and blood, had distinct longitudinal trajectories in HD mutation carriers compared with controls. Baseline analyte values predicted clinical disease status, subsequent clinical progression and brain atrophy better than did the rate of change in analytes. Overall, NfL was a stronger monitoring and prognostic biomarker for HD than mHTT. Nonetheless, mHTT possesses prognostic value and is a valuable pharmacodynamic marker for huntingtin-lowering trials.


Introduction
Despite knowledge of its monogenetic cause, no treatments have been shown to slow neurodegeneration in Huntington's disease (HD)1,2. However, multiple approaches aimed at lowering production of the causative mutant huntingtin protein (mHTT) are in human clinical trials3-5. The ultimate goal, treating mutation carriers early to prevent disease onset, will require prevention trials in premanifest HD mutation carriers (preHD).
Successful target engagement by the first targeted huntingtin-lowering therapeutic tested in HD patients, the antisense oligonucleotide tominersen (formerly IONIS-HTTRx/RG6042), was demonstrated by dose-dependent mHTT reduction in cerebrospinal fluid (CSF) in a phase 1/2 trial3, quantified by ultra-sensitive immunoassay6. A reliable CSF to brain mHTT relationship has been established in animal studies7,8. This notable success led to the first phase 3 trial of such a drug, whose primary outcomes are the Total Functional Capacity (TFC) score of the Unified Huntington's Disease Rating Scale (UHDRS) in the USA, and a composite UHDRS (cUHDRS) measure combining motor, functional, and cognitive scores in the EU9.
Such clinical rating scales quantify overt clinical manifestations, but are less sensitive for detecting deterioration, or therapeutic benefit, in preHD10-13, making their use as outcomes in prevention trials problematic. Though clinically relevant, they are also far removed from the core disease mechanism: neuronal injury by the HTT gene product. Quantifying biochemical manifestations of neurodegeneration can inform our understanding of pathobiology and the development and testing of novel therapies.
To this end, we recently showed that CSF levels of mHTT (the toxic pathogenic protein) and neurofilament light (NfL; an axonal protein indicative of neuronal injury) are among the earliest detectable changes in HD, and are strongly associated cross-sectionally with baseline measures of clinical severity and brain volume14. In the longitudinal Track-HD cohort, we showed that blood NfL level independently predicts subsequent onset, clinical progression, and brain atrophy in HD over three years15.

The HD-CSF Cohort
Seventy-four (92.5%) of the eighty baseline participants returned for the 24-month follow-up assessments. Three (4%) of the seventy-four opted out of the follow-up lumbar puncture but agreed to blood and phenotypic data collection (Figure 1). A more detailed version of the study flow is provided in Supplementary Fig. 1.
Figure 1 Baseline visit (n=80) was performed 24 months (± 3 months) before the follow-up visit (n=74). Optional repeat sampling visits occurred 6-8 weeks after baseline. A more detailed version including all study assessments is provided in Supplementary Fig. 1.
Full cohort characteristics are presented in Supplementary Table 1. Disease groups were well matched for gender and differed as expected in HD clinical, cognitive and imaging measures.
Age differed significantly between groups due to the control group (50.68 years ± 11.0) being matched to all HD mutation carriers, and manifest HD (56.02 years ± 9.36) being more advanced in their disease course than preHD (42.38 years ± 11.04), as previously reported14.
medRxiv preprint (which was not peer-reviewed). doi: https://doi.org/10.1101/2020.03.31.20045260. It is made available under a CC-BY 4.0 International license. The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
To explore the 'genetic dose-response relationship' between the causative gene mutation and each biofluid measure we queried our models for the interaction between CAG and age in HD.
Figure 2 The longitudinal dynamics of mHTT and NfL over 24 months.

Prognostic value for overall HD progression of baseline analyte versus its rate of change
We assessed the clinical associations of each analyte using the cUHDRS, a composite score derived from large natural history cohorts and combining motor, functional and cognitive symptoms to reflect overall HD clinical severity across clinically important domains18. All three analytes had significant associations with cUHDRS cross-sectionally at both baseline and follow-up (Extended Data Fig. 1). To assess the prognostic value of each analyte for HD progression, we first examined whether their baseline values predicted subsequent change in cUHDRS. Significant associations with subsequent cUHDRS change were found for all three (CSF mHTT r=-0.31, 95%CI -0.57 to -0.03, p=0.026; CSF NfL r=-0.38, 95%CI -0.52 to -0.18, p<0.0001; plasma NfL r=-0.47, 95%CI -0.63 to -0.25, p<0.0001; Figure 3a-c). After adjustment for age and CAG, only the association with baseline plasma NfL remained significant (CSF mHTT r=-0.11, 95%CI -0.48 to 0.18, p=0.513; CSF NfL r=-0.21, 95%CI -0.48 to 0.00, p=0.098; plasma NfL r=-0.33, 95%CI -0.58 to -0.08, p=0.011).
In our analysis of fast and slow progressors (≥ or < 1.2 decline in cUHDRS respectively), baseline CSF mHTT, CSF NfL and plasma NfL were all significantly higher in faster progressors (mean differences: CSF mHTT 19
We next used random forest analysis to compare the three biofluid analytes head-to-head, and to illustrate their relative importance ranking (including their baseline value and annualised rate of change) in prediction of HD clinical progression, alongside other established predictors (age, CAG and disease burden score [DBS]). Using change in cUHDRS as a continuous outcome, baseline values ranked as stronger predictors than annualised rates of change (Figure 3m). Similar results were obtained for prediction of fast versus slow progression (Supplementary Fig. 5a).
Figure 3 Longitudinal associations of mHTT and NfL with disease progression quantified by cUHDRS.
In contrast, the rate of change in each analyte had weaker associations with progression in every measure apart from change in TMS (mHTT r=0.46, p<0.0001; CSF NfL r=0.32, p=0.012). These associations remained after adjustment for age and CAG (r=0.43, p=0.001; r=0.18, p=0.032 respectively).
Using receiver operating characteristic (ROC) curve analysis, we compared the discriminatory ability of each analyte's baseline value and rate of change to distinguish between different clinical states: controls versus HD mutation carriers, and premanifest versus manifest HD. For all three analytes, rate of change had poor ability to distinguish in either comparison; areas under the curves (AUCs) were approximately 0.5 (i.e., no better than chance). Baseline concentrations had excellent discriminatory ability, with AUCs greater than 0.8 (Figure 4b). In each condition, and for each analyte, the AUC for the baseline measurement was significantly greater than that for its rate of change (Figure 4b,c).
To compare the relative prognostic ability of the biofluid and imaging biomarkers within a single model, we repeated the random forest analysis, including only the baseline values for biofluid analytes and imaging biomarkers. Using change in cUHDRS as a continuous outcome, biofluid analytes ranked as stronger predictors than imaging biomarkers (Figure 4d). Similar results were obtained for prediction of fast and slow progression (Supplementary Fig. 4b).

Simulating clinical trials with biofluid biomarker surrogate endpoints
The data thus far suggest that these analytes indicate current clinical state and have prognostic value for clinical decline. We used longitudinal data from the HD-CSF cohort to run computationally simulated clinical trials using CSF mHTT, CSF NfL and plasma NfL as possible surrogates for clinical progression. These simulations assume that the intervention-induced change in the analyte emulates the change expected in clinical state by an intervention. Figure 5 depicts the relationships between statistical power, sample size and trial duration for such trials, using a nominal 20% drug effect per year on the biomarker trajectory. For longitudinal change in NfL, in CSF or plasma, fewer than 100 participants per arm are needed to show an effect over 9 months. More than 10,000 participants per arm would be required to achieve 80% power for a similar trial using the lowering of CSF mHTT as a surrogate outcome over 24 months. Note that this calculation is based on mHTT release and does not apply to any trial in which the intervention reduces its production directly and lowers the protein below
Figure 5 (partial caption): ... Figure 2). The main effect in each simulation repetition was calculated as the inter-arm mean difference in the mean change from baseline, using generalised linear models adjusted for CAG. Statistical power was calculated as the proportion of trial simulations with a p-value < 0.05 for the main effect. CSF, cerebrospinal fluid; mHTT, mutant huntingtin; NfL, neurofilament light.
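The power calculation described above (power as the proportion of simulated trials with p < 0.05) can be illustrated with a minimal sketch. It is not the authors' simulation: it omits the CAG-adjusted generalised linear model, uses a normal approximation to a two-sample test, and every parameter value (`slope`, `sd_change`, the example arguments) is a hypothetical placeholder rather than an estimate from the HD-CSF cohort.

```python
import random
from statistics import NormalDist, mean, variance

def simulate_power(n_per_arm, years, slope, sd_change, effect=0.20,
                   n_sims=1000, alpha=0.05, seed=0):
    """Estimate power as the share of simulated trials with p < alpha.

    slope     : hypothetical mean annual change in the analyte, placebo arm
    sd_change : hypothetical between-participant SD of change from baseline
    effect    : fractional slowing of the trajectory in the treated arm
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    hits = 0
    for _ in range(n_sims):
        placebo = [rng.gauss(slope * years, sd_change) for _ in range(n_per_arm)]
        treated = [rng.gauss(slope * (1 - effect) * years, sd_change)
                   for _ in range(n_per_arm)]
        # Main effect: inter-arm difference in mean change from baseline.
        diff = mean(placebo) - mean(treated)
        se = ((variance(placebo) + variance(treated)) / n_per_arm) ** 0.5
        # Two-sided p-value from a normal approximation (an assumption here).
        p = 2 * (1 - std_normal.cdf(abs(diff / se)))
        hits += p < alpha
    return hits / n_sims

# Hypothetical inputs, for illustration only.
power = simulate_power(n_per_arm=100, years=0.75, slope=0.1, sd_change=0.1)
```

Repeating the call over a grid of `n_per_arm` and `years` values would give the kind of power surface that Figure 5 is described as showing.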

Technical validation of cross-sectional baseline results across assays
CSF mHTT, CSF NfL and plasma NfL were re-measured in baseline samples using the same methods used for the follow-up samples, in order to perform the longitudinal analysis.
Comparison of the re-measured values with those previously published at baseline showed that batch, assay or storage effects did not affect our samples (Supplementary text; Supplementary Fig. 7). We used the re-measured data to replicate our previously published cross-sectional findings14.
This included each analyte's inter-group differences (Extended Data Table 3).

Replication of cross-sectional results in follow-up data
We used the samples and data from the 24-month follow-up to examine whether our cross-sectional findings held true in the same cohort two years later (Extended Data Fig. 7-10). All results for NfL were similar to those previously published and to the re-measured baseline data. For CSF mHTT, we replicated the stronger associations with clinical measures and found further, stronger associations with all brain volumes, similar to those for NfL (Extended Data

Discussion
Here, we present the 24-month results of HD-CSF, a longitudinal study of biofluid biomarkers in HD mutation carriers and matched controls, with longitudinal clinical and MRI data. We have characterised and compared the longitudinal dynamics of mHTT and NfL in CSF in HD for the first time, defining the trajectories of these biofluid biomarkers and the inflection points at which they depart from healthy controls in the natural history of HD. While rates of change in the analytes had some prognostic value, a single measurement at baseline of each analyte exhibited stronger ability to predict subsequent clinical decline, brain atrophy and disease state. How our novel prognostic findings add to what was known about these biomarkers is summarised in context in Table 1. Using clinical trial simulations, we showed that NfL could be used as an outcome measure of neuronal protection and disease progression, to run trials of feasible duration. mHTT in CSF, and NfL in CSF and plasma, all rose detectably within participants over 2 years.
Over the whole course of the disease, mHTT rose linearly with age, whereas NfL rose in a more sigmoidal pattern. The disease-associated rise in NfL was more consistent, while mHTT was more variable within individuals. The NfL trajectory was distinct from that in healthy controls, with little overlap. This suggests that monitoring these biomarkers against an age-relevant reference range derived from the healthy population could be clinically meaningful. The dynamics of both CSF mHTT and NfL were also CAG-dependent, revealing longitudinally the genetic dose-response relationships that we demonstrated previously for plasma NfL in the TRACK-HD cohort15. Change-point analysis identified the approximate age and analyte concentration at which HD mutation carriers became detectably different from controls, for a given CAG repeat length. Defining these points of deflection from the trajectories in healthy controls may help us move towards models based on CAG repeat length that could be used to enrich or stratify clinical trial participants and could eventually be used to personalise treatment approaches19.
For the first time to our knowledge, we assessed the clinical prognostic potential of biofluid biomarkers against the cUHDRS, a composite measure derived from large cohort datasets to have high signal-to-noise ratio as a longitudinal measure of disease progression18. mHTT and NfL concentrations predicted change in cUHDRS, affirming their potential as biomarkers of HD progression. We show that CSF mHTT and NfL each possess prognostic value for subsequent clinical decline and brain atrophy, as seen previously for plasma NfL in the TRACK-HD cohort.
For rate of change in analytes, there was an association with change in TMS but no other measures. That the rate of change in each analyte had lesser prognostic and discriminatory power than their baseline values may appear surprising. Longitudinal studies of NfL in other genetic neurodegenerative diseases, including dominantly inherited Alzheimer's disease20,21 (AD) and frontotemporal dementia22 (FTD), have revealed that the rate of change in NfL was a stronger predictor of disease progression, with accelerated rate of change in those who converted from presymptomatic to symptomatic. The HD-CSF study was not designed to assess predictors of conversion from premanifest to manifest HD, but the rate of change in plasma NfL did show significant prognostic value in our comparison of fast versus slow progressors. NfL is an axonal protein but is not specific to neuronal sub-populations or to a given disease pathology. It is likely that each disease exhibiting neuronal dysfunction will have a distinct longitudinal NfL profile. It is notable that HD has some of the highest levels of CSF NfL among neurological diseases studied to date, greater than levels in AD and FTD, which have more rapid clinical progression23. A baseline measure encompasses the totality of a disease's effects up to that point in a person's life; by comparison, in a slowly progressive disease, even a 2-year change value captures a relatively small further difference.
The relative importance rankings generated from unbiased random forest analyses suggest that a single measurement of plasma NfL has equivalent, if not superior, prognostic value to that of disease burden score, one of the strongest predictors of HD progression. Further, the baseline biofluid biomarkers were stronger predictors of clinical progression than were brain volumes. This supports the view that a single measurement of biofluid biomarkers is closely indicative of the dynamic pathological processes driving disease progression.
Our computational clinical trial simulations offer a novel means to plan future clinical trials that would use the lowering of these biomarkers as surrogate endpoints. They suggest that a trial using plasma NfL with as few as 100 participants per arm run over six months would have over 90% power to show 20% slowing of the expected longitudinal trajectory of the analyte and, under our assumptions, the slowing of clinical decline. Using CSF NfL, the same trial would need to be run over nine months to achieve the same power. This is striking when compared to the several hundred participants per arm required to achieve the same effect size over 24 months for both cUHDRS and TFC, the current clinical end points of the ongoing phase 3 huntingtin-lowering trial9. A caveat here is that we assume that a given effect size for lowering of NfL would be equivalent to the same effect size for improvement in clinical outcome. Until there is a clinically efficacious intervention for HD, this hypothesis cannot be tested. Because of a slower rate of increase and, more importantly, the intra- and inter-subject variability in the change over time, in our simulations much larger participant numbers would be needed to show a deflection in the trajectory of mHTT as a surrogate outcome (not a pharmacodynamic outcome) by slowing disease progression alone. However, given the existing associations of CSF mHTT as a prognostic and pharmacodynamic response marker, it is likely that this measure too will turn out to have some predictive utility in the drug trial context. NfL is less variable and appears to be a better marker of HD progression and prognosis than mHTT.
However, it is important to reiterate that mHTT retains its intrinsic value as a direct measure of the causative neurotoxin and as a means of assessing the on-target effects of huntingtin-lowering agents. To achieve this purpose, very small participant numbers are required, as shown by our previous cross-sectional power calculations and findings from the first human trial of such an agent3,14. Our novel finding that lower mHTT concentrations predict slower progression is potentially important in this respect. Importantly, we also show that long-term freezer storage of samples does not adversely impact quantification of mHTT and NfL.
In our previously published baseline analysis14, CSF mHTT concentration was not cross-sectionally associated with any brain volume measure. However, in our replication of cross-sectional analyses we found associations between CSF mHTT and caudate and grey-matter volume at baseline, and strong associations with all brain volumes with the samples from the 24-month collection. This was likely because the performance of the mHTT assay, which has been enhanced through its use in clinical trial programs, has improved so as to reduce variability in the latest analysis. Further, baseline CSF mHTT values also predicted subsequent brain atrophy, confirming that mHTT level in CSF per se has prognostic potential in HD, beyond its intrinsic appeal as a therapeutic target. It remains important to note that several participants' mHTT values were below the lower limit of quantification of the assay, an indication that more sensitive HTT assays will be needed, especially in the realm of preHD and prevention trials.
Despite being, to our knowledge, the largest longitudinal natural history cohort CSF collection with matched MRI data in HD, the sample number remains modest. We lack the granularity to
that given the range in HD clinical severity, the variability is larger in our cohort, and we used a conservative effect size, all of which drive larger sample size.
Efforts are well underway to address these issues: HDClarity, a multi-national CSF collection initiative for HD, has amassed over 600 CSF and plasma samples across the disease spectrum and is now accumulating longitudinal samples over repeated annual intervals (NCT02855476).
These insights into the longitudinal dynamics of mHTT and NfL shed light on the biology of HD in human mutation carriers and will be of immediate value in the design and conduct of disease-modifying clinical trials, especially as we enter the era of prevention trials where qualified surrogate endpoints will be fundamental24. Looking ahead, some centres are already incorporating blood NfL measurement into shared clinical decision-making in neurological disease19. Continued study may reveal a role for mHTT and NfL in guiding decision-making for individuals living with HD.

Methods
A preprint was submitted to medRxiv on 27th March 2020.

Sample collection and processing
Sample collection and processing were as previously described29. All collections were standardised for time of day after overnight fasting and processed within 30 minutes of collection using standardised equipment. Blood was collected within 10 minutes of CSF and processed to plasma.

Analyte Quantification
CSF and plasma NfL were quantified in duplicate using the Neurology 4-plex B assay on the Simoa® HD-1 Analyzer (Quanterix, USA), per manufacturer's instructions. A 4x dilution for blood samples was performed automatically by the HD-1 Analyzer, and CSF samples were manually diluted 100x in the sample diluent provided prior to loading onto the machine. The limit of detection (LoD) was 0.105 pg/mL and the lower limit of quantification (LLoQ) 0.500 pg/mL. NfL was over the LLoQ in all samples. The intra-assay coefficient of variation (CV) (calculated as the mean of the CVs for each sample's duplicate measurements) for CSF NfL and plasma NfL was 5.0% and 3.7% respectively. The inter-assay CVs (calculated as the mean of the CVs for analogous spiked positive controls provided by the manufacturer and used in each well plate) for CSF NfL and plasma NfL were 2.7% and 8.4% respectively. We previously quantified NfL in the same baseline samples using an ELISA (NF-Light®, UmanDiagnostics, Sweden) in CSF and a 1-plex Simoa® kit (NF-Light®, Quanterix) in plasma14. In both biofluids, agreement between assays was good (Supplementary Fig. 2).
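The intra-assay CV defined above (the mean of the per-sample CVs across replicate measurements) can be sketched as follows. The replicate readings are hypothetical, and the use of the sample standard deviation (n-1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def replicate_cv(replicates):
    """Coefficient of variation (%) for one sample's replicate measurements."""
    return 100.0 * stdev(replicates) / mean(replicates)

def intra_assay_cv(samples):
    """Mean of the per-sample CVs across all samples."""
    return mean(replicate_cv(r) for r in samples)

# Hypothetical duplicate readings in pg/mL (illustrative values only).
example_duplicates = [(10.0, 10.0), (9.0, 11.0)]
example_cv = intra_assay_cv(example_duplicates)
```

The same function applies unchanged to triplicate measurements, such as those used for CSF mHTT.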
CSF mHTT was quantified in triplicate using the same 2B7-MW1 immunoassay as at baseline (SMC™ Erenna® platform, Merck, Germany)6. The LoD was 8 fM and the LLoQ 25 fM. All control samples were below the LoD of the assay except one subject's baseline re-measured sample. These were imputed as 0 fM for analysis purposes. Twenty-seven (21%) samples were below the LLoQ and were included in subsequent analyses. One preHD participant had CSF mHTT below the LoD in their re-measured baseline sample and was not included in the analyses. The intra-assay CV for CSF mHTT was 14.1%. Haemoglobin contamination was quantified using a commercial ELISA (E88-134, Bethyl Laboratories, USA) by Evotec. Only 1 sample (2.186 µg/mL) had haemoglobin just over the 2 µg/mL recommended threshold30.
Assays were run using same-batch reagents, blinded to clinical data.

MRI Acquisition
The MRI acquisition protocol was identical to that used at baseline14. T1-weighted MRI data were acquired on a single 3T Siemens Prisma scanner using a protocol optimized for this study. The parameters were as follows: Images were acquired using a 3D magnetization-

MRI Processing
Predefined regions-of-interest for volumetric analysis included the caudate, white matter, grey matter and whole brain. All baseline volumes were re-calculated at follow-up. Bias correction was performed on all scans prior to processing using the N3 procedure31. All scans, segmentations and registrations underwent visual quality control blinded to group status to ensure successful processing. All T1-weighted scans were checked visually for significant motion or other artefacts before processing; one scan failed quality control due to the presence of significant motion, meaning that 57 scans were processed. As described previously, a semi-automated segmentation procedure was performed via Medical Image Display Analysis Software (MIDAS)32 to generate volumetric regions of the whole-brain and Total Intracranial Volume (TIV) at baseline14. Changes in whole-brain and caudate were calculated via the Boundary Shift Integral (BSI) method33,34. The BSI is a semi-automated technique applied within MIDAS that quantifies change over time in regions of interest. For the whole-brain, baseline and follow-up scans were segmented with MIDAS via a morphological segmentor that uses operator-driven thresholds, erosions and dilations to separate brain tissue from the scalp and CSF32. The baseline and follow-up scans were then registered using 12 degrees-of-freedom and the BSI metrics were calculated for each participant35. One scan failed registration and thus was excluded from the measures of whole-
Registration failed for three datasets, resulting in the analysis of 55 scan pairs.
Cross-sectional data from the follow-up time point were used to replicate the baseline results.
Follow-up whole-brain volume was measured via the semi-automated procedure described at baseline. Follow-up caudate volume was computed as the baseline volume minus the amount of atrophy measured via the CBSI, and follow-up grey and white matter volumes were calculated by subtracting the amount of atrophy from baseline volumes.

Statistical Analysis
As previously observed14,15, NfL distributions were right-skewed; therefore, log-transformed values were used for analytical purposes. Due to their known effects on HD, all models included age and CAG repeat count as covariates.
Cross-sectional analyses: To validate previous findings and compare assays, we replicated the cross-sectional analyses from the study baseline14, using re-measured baseline data and newly collected 24-month follow-up data. To investigate intergroup differences, we applied generalised linear regression models estimated via ordinary least squares, with analyte concentration as the dependent variable, first with group membership and age as independent variables, and then with group membership, age and CAG as independent variables. To study associations in HD mutation carriers between the analytes and clinical or imaging measures, we used Pearson's partial correlations adjusted for age, and for age and CAG. Bias-corrected and accelerated bootstrapped 95% confidence intervals (95% CI) were calculated for mean differences and correlation coefficients. To understand the discriminatory power of the studied analytes, we produced receiver operating characteristic (ROC) curves for each analyte to differentiate healthy controls from HD mutation carriers, and premanifest from manifest HD, and formally compared areas under the curves (AUC) using the method suggested by DeLong and colleagues42.
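As an illustration of the Pearson's partial correlations adjusted for age and CAG described above, the sketch below uses the standard recursive formula that removes one covariate at a time. The function names and the toy data are hypothetical, and the bootstrapped confidence intervals are not reproduced here.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, covariates):
    """Correlation of x and y adjusted for a list of covariates.

    Recursion: adjust for the first covariate using correlations that are
    already adjusted for the remaining ones.
    """
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Toy data (hypothetical): both variables depend on age and nothing else.
age = [1.0, 2.0, 3.0, 4.0]
analyte = [2.0, 1.0, 2.0, 5.0]
outcome = [0.0, 5.0, 0.0, 5.0]
unadjusted = pearson(analyte, outcome)
adjusted = partial_corr(analyte, outcome, [age])
```

In this toy example the unadjusted correlation is positive, while the age-adjusted one is zero up to rounding, which is the pattern the adjustment is meant to expose. Passing `[age, cag]` as covariates would adjust for both.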
Longitudinal modelling: For modelling analyte trajectories over time, generalised mixed effect models were fitted, estimated via restricted maximum likelihood, with analyte concentration as the dependent variable. Independent models were developed for healthy controls and HD mutation carriers. Only HD mutation carriers were modelled for mHTT. For CSF mHTT in HD mutation carriers, the model had fixed effects for age and CAG, a random intercept per participant and a random slope for age. A similar model was used for healthy controls for NfL in CSF and plasma. HD mutation carriers were modelled with fixed effects for age (second-order) and CAG, and random slopes for age were included for both CSF and plasma NfL.
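For orientation, the model structures described above can be written out as follows. The notation (participant index i, visit index j, coefficient symbols) is ours rather than the authors', and only terms named in the text are included; in particular, a random intercept is not stated for the NfL model in mutation carriers and is therefore omitted here.

```latex
% CSF mHTT in HD mutation carriers: random intercept and random age slope
\[
\mathrm{mHTT}_{ij} = \beta_0 + \beta_1\,\mathrm{age}_{ij} + \beta_2\,\mathrm{CAG}_i
  + b_{0i} + b_{1i}\,\mathrm{age}_{ij} + \varepsilon_{ij}
\]

% NfL (log-transformed) in HD mutation carriers: second-order age term
\[
\log(\mathrm{NfL}_{ij}) = \beta_0 + \beta_1\,\mathrm{age}_{ij} + \beta_2\,\mathrm{age}_{ij}^{2}
  + \beta_3\,\mathrm{CAG}_i + b_{1i}\,\mathrm{age}_{ij} + \varepsilon_{ij}
\]
```

Here the b terms are participant-level random effects and ε is the residual error; the coefficients in the two equations are separate parameters despite sharing symbols.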
Change-point analysis: We use an offline Bayesian change-point algorithm to estimate the most likely disease time at which a given biomarker changes from a normal to abnormal state. As we want to estimate the point of change from normality to abnormality, we use data from all groups (control, preHD, and HD) to fit the model over each time segment.
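The text does not specify the change-point algorithm in detail. As an illustration only, the sketch below computes the posterior over a single change-point for observations already sorted by disease time. It assumes Gaussian noise with known standard deviation `sigma`, a zero-centred Gaussian prior with standard deviation `tau` on each segment mean, and a uniform prior on the change location; all names and default values are hypothetical.

```python
from math import exp, log, pi

def log_marginal(segment, sigma, tau):
    """Log marginal likelihood of a segment with y_i ~ N(mu, sigma^2), mu ~ N(0, tau^2)."""
    n = len(segment)
    s = sum(segment)
    q = sum(v * v for v in segment)
    return (-0.5 * n * log(2 * pi * sigma ** 2)
            - 0.5 * log(1 + n * tau ** 2 / sigma ** 2)
            - (q - tau ** 2 * s ** 2 / (sigma ** 2 + n * tau ** 2)) / (2 * sigma ** 2))

def changepoint_posterior(values, sigma=1.0, tau=10.0):
    """Posterior over k, the number of observations before the change (1..n-1)."""
    n = len(values)
    ks = range(1, n)
    logp = [log_marginal(values[:k], sigma, tau) + log_marginal(values[k:], sigma, tau)
            for k in ks]
    top = max(logp)                      # subtract the maximum for numerical stability
    weights = [exp(lp - top) for lp in logp]
    total = sum(weights)
    return {k: w / total for k, w in zip(ks, weights)}

# Toy series: five values near 0 followed by four values near 3
posterior = changepoint_posterior([0.1, -0.2, 0.0, 0.3, -0.1, 3.1, 2.9, 3.2, 3.0])
most_likely_k = max(posterior, key=posterior.get)
```

The disease time of the change would then be read off from the observation at the most likely position.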
Rates of change simulations: The longitudinal models above were used to estimate rates of change from simulated data. Model parameters, age and CAG distributions, and sample sizes were chosen to mimic those of the HD-CSF cohort. Each simulation was repeated 1,000 times and run independently for each analyte and each participant subgroup (i.e. healthy controls, premanifest and manifest HD).
Associations of the analytes' baseline values, and of their rates of change, with clinical and imaging changes, were assessed using Pearson's partial correlations adjusted for age, and for age and CAG. Rates of change were computed as the 24-month follow-up value minus the baseline value, divided by the follow-up time in years. Bias-corrected and accelerated bootstrapped 95% CI were calculated for correlation coefficients and mean differences. To further explore clinical prognostic value, we divided mutation carriers into nominally "fast" and "slow" progressors at the previously described cUHDRS minimal clinically important difference for decline (absolute 1.2-point reduction)28. Intergroup differences were investigated with generalised linear regression estimated via ordinary least squares, with analyte concentration or rate of change as the dependent variable, and group membership and age, and then group, age and CAG, as independent variables.
Receiver operating characteristic (ROC) curves were produced and areas under the curves (AUC) compared for the ability of analytes' baseline values and rates of change to differentiate healthy controls from mutation carriers, and preHD from manifest HD, using the method of DeLong42.
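The rate-of-change definition and the AUC used to compare discriminatory ability can be sketched as follows. This is an illustrative Python fragment with hypothetical function names. The AUC is computed as the rank-based (Mann-Whitney) probability that a case scores higher than a non-case, and the DeLong comparison between two AUCs is not implemented.

```python
def annualised_rate(baseline, follow_up, years):
    """(Follow-up value minus baseline value) divided by follow-up time in years."""
    if years <= 0:
        raise ValueError("follow-up time must be positive")
    return (follow_up - baseline) / years

def auc(case_scores, control_scores):
    """Probability that a random case scores above a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

The same `auc` function can be applied to baseline values or to annualised rates of change, so that the two can be compared on the same participants.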
Clinical trial simulation: We used 1,000 repetitions, a parallel design without attrition or placebo effect, a pseudo-control arm emulating the observed longitudinal trajectories, and an intervention arm with a constant 20% annualised reduction in the analyte of interest. Synthetic datasets were generated with Monte Carlo simulations using mixed-effects models matching the longitudinal models above. Main effects were estimated as the inter-arm difference in mean change from baseline, adjusted for CAG using generalised linear models estimated as above.
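The general shape of such a simulation can be sketched in Python. This is a deliberately simplified illustration, not the mixed-effects procedure described above: every numerical parameter is hypothetical, there is no CAG adjustment, the 20% annualised reduction is assumed to act multiplicatively on the follow-up value, and significance is judged with a normal approximation.

```python
import random
from math import sqrt
from statistics import mean, variance

def simulate_trial(n_per_arm, rng, years=2.0, reduction=0.20,
                   base_mean=100.0, base_sd=20.0,
                   slope_mean=5.0, slope_sd=5.0, noise_sd=5.0):
    """Return a z-like statistic for the inter-arm difference in change from baseline."""
    def change(treated):
        true_baseline = rng.gauss(base_mean, base_sd)
        follow_up = true_baseline + rng.gauss(slope_mean, slope_sd) * years
        if treated:
            # assumed form of a constant annualised reduction
            follow_up *= (1.0 - reduction) ** years
        observed_baseline = true_baseline + rng.gauss(0.0, noise_sd)
        observed_follow_up = follow_up + rng.gauss(0.0, noise_sd)
        return observed_follow_up - observed_baseline

    control = [change(False) for _ in range(n_per_arm)]
    treated = [change(True) for _ in range(n_per_arm)]
    se = sqrt(variance(control) / n_per_arm + variance(treated) / n_per_arm)
    return (mean(treated) - mean(control)) / se

def estimate_power(n_per_arm, repetitions=1000, reduction=0.20, seed=0, critical=1.96):
    """Fraction of simulated trials in which the inter-arm difference is significant."""
    rng = random.Random(seed)
    hits = sum(abs(simulate_trial(n_per_arm, rng, reduction=reduction)) > critical
               for _ in range(repetitions))
    return hits / repetitions
```

Calling `estimate_power` over a range of `n_per_arm` values gives a power curve, from which a sample size for a target power can be read.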

Event-based modelling
We used an event-based model (EBM)44 to estimate the most likely sequence of biomarker changes and to stage participants at both baseline and follow-up. In brief, the EBM is a probabilistic model of observed data generated by an unknown sequence of biomarker events, where an event is defined as a biomarker transitioning from a normal to an abnormal state.
The model learns the biomarker distributions of normality and abnormality directly from data, and hence estimates the most likely sequence of abnormality over the whole population. The EBM has been applied extensively to several progressive neurological diseases, including Alzheimer's disease, multiple sclerosis and HD45-47.
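The core of the event-based model likelihood can be illustrated with a small Python sketch. It assumes that the normal and abnormal distributions of each biomarker are already known Gaussians (in practice they are learned from the data) and a uniform prior over stages, and it searches all orderings exhaustively, which is only feasible for a handful of biomarkers. All names are hypothetical.

```python
from itertools import permutations
from math import exp, pi, sqrt

def normal_pdf(x, mu, sd):
    return exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * sqrt(2 * pi))

def sequence_likelihood(sequence, data, normal, abnormal):
    """Likelihood of the data under one ordering of biomarker events."""
    n_events = len(sequence)
    total = 1.0
    for subject in data:                          # subject: dict biomarker -> value
        subject_likelihood = 0.0
        for stage in range(n_events + 1):         # stage k: first k events have occurred
            likelihood = 1.0 / (n_events + 1)     # uniform prior over stages
            for position, marker in enumerate(sequence):
                mu, sd = abnormal[marker] if position < stage else normal[marker]
                likelihood *= normal_pdf(subject[marker], mu, sd)
            subject_likelihood += likelihood
        total *= subject_likelihood
    return total

def most_likely_sequence(data, normal, abnormal):
    """Exhaustive search over all orderings of the biomarkers."""
    return max(permutations(list(normal)),
               key=lambda seq: sequence_likelihood(seq, data, normal, abnormal))
```

A participant can then be staged by choosing, for the selected sequence, the stage that contributes most to that participant's likelihood.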
We recently developed an EBM for HD biofluid, neuroimaging and clinical biomarkers using baseline data from the HD-CSF cohort14. Here we refit the model using baseline data from participants who are present at both baseline and follow-up, and use this model both to test the sequence of events estimated in Byrne et al. (2018)14 and to stage participants at both time-points. Specifically, mixture models47 were fit to distributions of healthy control and

Random forest analyses
To complement our analyses, we used random forest methodology, a supervised ensemble-learning approach based on building multiple independent decision trees from bootstrapped samples48,49. This allows the detection of non-linear relationships, and simultaneous ranking of predictors of clinical change in multivariate analyses. We implemented random forests for two outcomes: annualised rate of change in cUHDRS, and cUHDRS decline as a binary outcome ("fast" versus "slow" progressors, defined as above). Models were first run with the baseline values and annualised rates of change of the biofluid biomarkers as predictor variables, and then with the baseline values of the biofluid and imaging biomarkers. Age, CAG repeat count, and DBS were included in all models to serve as comparator variables.
Each random forest had 1,000 trees, each grown on a bootstrapped sample drawn with replacement, and three randomly sampled predictor variables were considered for splitting at each node. Relative importance rankings were based on the mean decrease in Gini score across all trees, where a higher mean decrease in Gini indicates greater relative importance of the predictor variable for predicting outcomes. To explicitly test the stability of our results and generate ranking distributions, we re-ran the model 100 times, each run using a randomly selected 80% of the available observations. Random forests were implemented using the R randomForest package49.
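The quantity underlying the Gini-based importance ranking can be illustrated for a single split. The Python sketch below uses hypothetical function names and is not the randomForest implementation. It computes the Gini impurity of a set of class labels and the decrease in impurity achieved by splitting on a threshold. In a forest, such decreases are accumulated over the splits that use a given predictor and averaged across trees.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(values, labels, threshold):
    """Impurity of the parent node minus the size-weighted impurity of its children."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - children

# A split that separates the two classes perfectly removes all impurity
perfect = gini_decrease([1, 2, 3, 4], ["slow", "slow", "fast", "fast"], 2.5)
```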

Role of funding source
Funders had no role in study design, data collection, analysis, or interpretation, or writing of the report. The corresponding author had full access to data and final responsibility for the decision to submit for publication.
Association within HD mutation carriers (n=60) between CSF mHTT (green; a, d, g, j), CSF NfL (blue; b, e, h, k), plasma NfL (red; c, f, i, l) and UHDRS clinical scores including functional (a-c), motor (d-f) and cognitive (g-l)
Extended Data Fig. 5 Cross-sectional associations in re-measured baseline samples between analyte concentrations and imaging measures.
Association within HD mutation carriers between the analytes CSF mHTT (green; a, d, g, j), CSF NfL (blue; b, e, h, k), plasma NfL (red; c, f, i, l) and MRI volumetric measures whole-brain (n=48; a-c), white-matter (n=49; d-f), grey-matter (n=49; g-i) and caudate (n=43; j-l). All volumetric measures were calculated as a percentage of total intracranial volume. Scatter plots show unadjusted values. r and p values are age-adjusted, generated from
Extended Data Fig. 9 Cross-sectional associations in 24-month follow-up samples between analyte concentrations and imaging measures.
Association within HD mutation carriers between the analytes CSF mHTT (green; a, d, g, j), CSF NfL (blue; b, e, h, k), plasma NfL (red; c, f, i, l) and MRI volumetric measures whole-brain (n=43; a-c), white-matter (n=41; d-f), grey-matter (n=41; g-i) and caudate (n=43; j-l). All volumetric measures were calculated as a percentage of total intracranial volume. Scatter plots show unadjusted values. r and p values are age-adjusted, generated from