Classification of GBA1 variants in Parkinson disease; the GBA1-PD browser

Background: GBA1 variants are among the most common genetic risk factors for Parkinson Disease (PD). GBA1 variants can be classified into three categories based on their role in Gaucher Disease (GD) or PD: severe, mild, and risk variant (for PD). Objectives: This paper aims to generate and share a comprehensive database of GBA1 variants reported in PD to support future research and clinical trials. Methods: We performed a literature search for all GBA1 variants that have been reported in PD. The data were standardized and complemented with variant classification, odds ratio (OR) if available, and other data. Results: We found 371 GBA1 variants reported in PD: 22 mild, 84 severe, 3 risk variants, and 262 of unknown status. We created a browser containing up-to-date information on these variants (https://pdgenetics.shinyapps.io/GBA1Browser/). Conclusions: The classification and browser presented in this work should inform and support basic, translational, and clinical research on GBA1-PD.


INTRODUCTION
The risk, onset, and progression of PD are influenced by a multitude of factors including aging, environmental exposures, and genetic background. 1 The underlying pathobiological mechanisms influencing PD risk are not fully understood; however, multiple genetic loci for PD risk have been identified in the human genome, as well as genes involved in rare Mendelian forms of PD. Notable genes implicated in PD include SNCA, LRRK2, PRKN, and PINK1, among others. 1 One of the most important genes in PD is GBA1, as 5-20% of PD patients in different populations carry variants in this gene. 2 GBA1 encodes the lysosomal enzyme glucocerebrosidase (GCase), which hydrolyzes glucosylceramide and glucosylsphingosine. 3 Pathogenic biallelic GBA1 variants cause a lysosomal storage disorder, Gaucher Disease (GD), which can be classified as type I (mild, non-neuronopathic form of GD), type II or type III (severe, neuronopathic forms of GD). Accordingly, GBA1 variants can be classified as mild or severe, based on the type of GD that they lead to in a homozygous state. 2 The association between GBA1 variants and PD originates from clinical observations reporting that some GD patients also displayed clinical signs of PD. [4][5][6] Genetic studies subsequently showed that GBA1 variants are common risk factors for PD in various populations 2,7 and that the type of GBA1 variant, mild or severe, is associated with differential risk and progression of PD. Carriers of severe GBA1 variants have a higher risk of PD, an earlier age at onset, 2 and faster motor and cognitive decline. 8,9 However, the majority of GBA1 variant carriers do not develop PD, as the penetrance of heterozygous GBA1 variants in most PD populations ranges between 10-30%.
[10][11][12] This is much higher than the 6.6% lifetime risk for PD reported in the overall population, [13][14][15][16] yet the mechanism by which GBA1 variants cause or increase susceptibility to PD is still unknown. 17 Since the association between GBA1 variants and PD was first established, there has been extensive research on genotype-phenotype correlations. The list of GBA1 variants reported in PD cases has grown over time and does not fully overlap with that of GD. Notably, the p.E326K and p.T369M variants, which do not cause GD, are risk factors for PD. 18,19 As more clinical trials targeting individuals with PD and GBA1 variants are being performed and planned, it is important to gather data on GBA1 variants to inform the design of these trials and other clinical and functional studies. For example, since carriers of severe GBA1 variants are likely to progress faster, it will be important that they are equally represented in the treatment and placebo arms of trials. Here, we compiled a list of all GBA1 variants reported in individuals with PD to date. We generated an online browser to mine data on these variants, including the severity if known, among other important information, and we will continue to update this resource.
The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

METHODS
Literature Search Criteria
For the purposes of creating the GBA1-PD browser, we searched for all studies that reported GBA1 variants in PD populations. An initial search on PubMed was done using variations of the following keywords: "GBA," "GBA1," "glucocerebrosidase," "Parkinson's," and "parkinsonism." The search included papers published from 2004, when GBA1 was first suggested as a probable risk factor for PD onset, 20 up to April 2022, when the literature search was conducted. This initial search returned 1128 papers. After removing meta-analyses and review papers, the remaining 834 papers were screened for studies involving case-control or case-only PD populations with data on common and rare GBA1 variants. The final list after this second screening step consisted of 86 papers (Figure 1).

Quality Control
GBA1 variants collected from the final list of papers were then validated against the sequencing data of the GBA1 gene in Ensembl (ensembl.org). 21 In addition, variant information was revised according to the Human Genome Variation Society (HGVS) guidelines for variant nomenclature. 22

Clinical Classification of GBA1 variants
As mentioned earlier, the classification of GBA1 variants as mild or severe is based on GD: mild variants cause GD type I and severe variants cause GD type II or III. This classification is also important in PD, as carriers of severe GBA1 variants have a higher risk of PD, earlier age at onset (AAO), 2 and faster cognitive and motor decline. 8,9,23 There are also GBA1 variants that do not cause GD but are associated with increased risk of PD, such as p.E326K and p.T369M. 18,19 We therefore performed a literature search for each variant to determine whether it was reported in GD and whether its severity was interpretable. We then classified the variants accordingly as mild (causing GD type I), severe (causing GD type II or III), risk variant (variants that are associated with risk of PD but do not cause GD), and unknown.
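The four-way scheme described above can be written down as a small decision rule. The following Python sketch is illustrative only: the function name, its arguments, and the enum are hypothetical and are not taken from the browser's code, and the curation in this work was done manually from the literature rather than by such a function.

```python
from enum import Enum
from typing import Optional


class VariantClass(Enum):
    MILD = "mild"
    SEVERE = "severe"
    RISK = "risk variant"
    UNKNOWN = "unknown"


def classify_variant(gd_type: Optional[int],
                     pd_risk_without_gd: bool = False) -> VariantClass:
    """Classify a GBA1 variant following the scheme in the text.

    gd_type: GD type (1, 2, or 3) the variant is reported to cause,
             or None if no interpretable GD report exists (assumed encoding).
    pd_risk_without_gd: True if the variant is associated with PD risk
             but does not cause GD (assumed flag).
    """
    if gd_type is not None:
        if gd_type == 1:
            return VariantClass.MILD      # causes GD type I
        if gd_type in (2, 3):
            return VariantClass.SEVERE    # causes GD type II or III
        raise ValueError(f"unexpected GD type: {gd_type}")
    if pd_risk_without_gd:
        return VariantClass.RISK          # PD risk, no GD
    return VariantClass.UNKNOWN           # no interpretable evidence


if __name__ == "__main__":
    print(classify_variant(1).value)
    print(classify_variant(None, pd_risk_without_gd=True).value)
```

A variant with neither an interpretable GD report nor a PD risk association falls through to "unknown", which mirrors the residual category used in this work.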

Construction of browser and data sharing
To make the information openly accessible, we built the GBA1-PD Browser (https://pdgenetics.shinyapps.io/GBA1Browser/). This browser was built using R Shiny and includes the following information for each variant reported in PD: variant name, full-length name, clinical classification (i.e., mild, severe, risk variant, or unknown), rsID, genome base pair position (hg19), exon number, allele frequency in gnomAD, 24 CADD PHRED-scaled score, 25 GERP scores, 26 and the manuscript that reported the variant. The data were generated and compiled before being added to the browser.

RESULTS
Identification and classification of GBA1 variants
From the 86 studies that reported GBA1 variants in PD cases, we identified 371 unique variants reported to date (https://pdgenetics.shinyapps.io/GBA1Browser/). Of the 371 variants, 22 were identified as mild (type I GD-causing) variants, 84 as severe (type II and III GD-causing) variants, and 3 as non-GD risk variants for PD. The risk variants found were biallelic and heterozygous forms of p.E326K, p.T369M, and p.E388K. The remaining 262 were classified as unknown due to lack of information on their GD pathology and/or PD risk associations. Figure 2 depicts the distribution of GBA1 variants per exon and the location of the most common variants associated with PD. Among the 86 studies included in this analysis, 16 reported variant-specific odds ratio (OR) data for PD risk (Table 1).
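As a quick consistency check on the counts reported above, the following Python sketch verifies that the four categories sum to the stated total and computes the share of each category. The variable and function names are hypothetical; only the counts come from the text.

```python
from typing import Dict

# Counts per classification as reported in the text.
REPORTED_COUNTS: Dict[str, int] = {
    "mild": 22,
    "severe": 84,
    "risk variant": 3,
    "unknown": 262,
}
REPORTED_TOTAL = 371  # total unique variants reported in the text


def class_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return the percentage of variants in each classification."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    # The four categories should account for every variant.
    assert sum(REPORTED_COUNTS.values()) == REPORTED_TOTAL
    for label, share in class_shares(REPORTED_COUNTS).items():
        print(f"{label}: {share:.1f}%")
```

With the reported counts, about 71% of the variants found in PD currently have unknown status.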

The GBA1-PD browser
The GBA1-PD browser is a public-facing database created to assist researchers in finding GBA1 variants relevant to PD risk. The 371 variants described above can be searched and filtered by hg19/hg38 position, protein consequence, rsID, clinical classification, and exon number. An interactive plot displays the location of the included variants in GBA1 and groups variants by their severity. Additional relevant information, including reported allele frequency in gnomAD, CADD score, GERP score, and the reporting manuscript, is available for each variant.
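For readers who want to apply the same kind of filtering to a local copy of the variant table, the following Python sketch shows one possible approach. It does not describe the browser's implementation (which is built in R Shiny); the record layout, field names, and function are assumptions made for illustration, and the rsID and exon fields are deliberately left empty rather than filled with unverified values.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Variant:
    protein_change: str            # e.g. "p.N370S"
    classification: str            # "mild", "severe", "risk variant", "unknown"
    rsid: Optional[str] = None     # left empty in this sketch
    exon: Optional[int] = None     # left empty in this sketch


def filter_variants(variants: Iterable[Variant],
                    classification: Optional[str] = None,
                    exon: Optional[int] = None,
                    rsid: Optional[str] = None) -> List[Variant]:
    """Keep variants matching every filter that is not None."""
    kept = []
    for v in variants:
        if classification is not None and v.classification != classification:
            continue
        if exon is not None and v.exon != exon:
            continue
        if rsid is not None and v.rsid != rsid:
            continue
        kept.append(v)
    return kept


# Classifications below are those stated in the text for these variants.
EXAMPLE_VARIANTS = [
    Variant("p.N370S", "mild"),
    Variant("p.L444P", "severe"),
    Variant("p.E326K", "risk variant"),
    Variant("p.T369M", "risk variant"),
]

if __name__ == "__main__":
    for v in filter_variants(EXAMPLE_VARIANTS, classification="risk variant"):
        print(v.protein_change)
```

Filters left as None are ignored, so the same function covers single-field and combined queries.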

DISCUSSION
The GBA1-PD browser generated through this work could serve as a valuable resource for researchers, clinicians, and the design of clinical trials. GBA1 variant carriers are at higher risk for PD, with a penetrance of 10-30%, 10-12 compared with a lifetime risk of up to 6.6% for developing PD in the overall population. [13][14][15][16] The latter estimate includes carriers of GBA1 variants; therefore, the lifetime risk for non-carriers of GBA1 variants is likely lower. There are notable differences across ancestries in how common GBA1 variant carriers are. For instance, the prevalence of GBA1 variants in the Ashkenazi Jewish PD population is around 20%, 2,27 whereas in Chinese PD cases it is much lower, at around 5.4-8.4%. [28][29][30] The frequencies of specific GBA1 variants also differ across populations. The most frequent GBA1 variant in Ashkenazi Jews with PD is p.N370S, while in European populations it is mostly p.E326K or p.T369M, and in Asian populations it is p.L444P. 2,24 These variants also represent different categories, classified based on their effect in GD. The p.N370S variant is a mild variant associated with GD type I, 31,32 p.E326K and p.T369M are risk variants for PD that do not cause GD, 19,33,34 and p.L444P is a severe variant associated with the severe forms of GD. [35][36][37] While the odds ratio for PD associated with risk variants (i.e., p.E326K and p.T369M) is below 2, 19,33 the odds ratio of mild GBA1 variants is above 2, and the odds ratio of severe GBA1 variants may exceed 10. 2 More importantly, the variants may have different effects on PD progression, as motor symptoms and cognition seem to decline faster among carriers of severe GBA1 variants. 8,9,23 These observations could be especially important for clinical trials on GBA1-PD.
If the treatment and placebo arms of a trial are not balanced in terms of the composition of severe and mild variants in each arm, it is possible that one arm will progress faster than the other, which may lead to either false positive or false negative results for a trial. Furthermore, trials that only include severe GBA1 variant carriers may require a shorter trial duration and a smaller trial population, while trials that include mild or risk variant carriers may require a longer trial duration and a larger trial population. The GBA1-PD browser we generated could be helpful for designing such trials, and it will be kept up to date as more information becomes available on GBA1 variants in GD and PD.
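One common way to obtain the balance discussed above is stratified randomization, in which participants are randomized separately within each variant class. The following Python sketch illustrates the idea under stated assumptions: this work does not prescribe a randomization method, and the function name, arm labels, and participant identifiers are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

ARMS = ("treatment", "placebo")


def stratified_assign(participants: List[Tuple[str, str]],
                      seed: int = 0) -> Dict[str, str]:
    """Assign participants to two arms, balanced within each variant class.

    participants: (participant_id, variant_class) pairs, where variant_class
                  is e.g. "severe", "mild", or "risk variant".
    Returns a mapping from participant_id to arm.
    """
    rng = random.Random(seed)
    by_class: Dict[str, List[str]] = defaultdict(list)
    for pid, vclass in participants:
        by_class[vclass].append(pid)

    assignment: Dict[str, str] = {}
    for vclass in sorted(by_class):
        ids = by_class[vclass]
        rng.shuffle(ids)              # random order within the stratum
        offset = rng.randrange(2)     # random arm receives the odd participant
        for i, pid in enumerate(ids):
            assignment[pid] = ARMS[(i + offset) % 2]
    return assignment


if __name__ == "__main__":
    cohort = [(f"S{i}", "severe") for i in range(6)]
    cohort += [(f"M{i}", "mild") for i in range(9)]
    arms = stratified_assign(cohort, seed=1)
    for vclass, prefix in (("severe", "S"), ("mild", "M")):
        n_treat = sum(1 for pid, arm in arms.items()
                      if pid.startswith(prefix) and arm == "treatment")
        print(vclass, "in treatment arm:", n_treat)
```

Because arms alternate within each shuffled stratum, the number of carriers of a given variant class differs by at most one between the two arms.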

ACKNOWLEDGMENTS
We thank those who have been a part of the referenced clinical studies that form the basis of this paper and its accompanying online database. This work has been supported by grants

FINANCIAL DISCLOSURES OF ALL AUTHORS OF THE PRECEDING 12 MONTHS
medRxiv preprint doi: https://doi.org/10.1101/2022.09.27.22280421; this version posted September 27, 2022.