RT Journal Article SR Electronic T1 African ancestry neurodegeneration risk variant disrupts an intronic branchpoint in GBA1 JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2024.02.20.24302827 DO 10.1101/2024.02.20.24302827 A1 Jerez, Pilar Álvarez A1 Wild Crea, Peter A. A1 Ramos, Daniel M. A1 Gustavsson, Emil K. A1 Radefeldt, Mandy A1 Makarious, Mary B. A1 Ojo, Oluwadamilola O. A1 Billingsley, Kimberley J. A1 Malik, Laksh A1 Daida, Kensuke A1 Bromberek, Sarah A1 Hu, Carol A1 Schneider, Zachary A1 Surapaneni, Aditya L. A1 Stadler, Julia A1 Rizig, Mie A1 Morris, Huw R. A1 Pantazis, Caroline B. A1 Leonard, Hampton L. A1 Screven, Laurel A1 Qi, Yue A. A1 Nalls, Mike A. A1 Bandres-Ciga, Sara A1 Hardy, John A1 Houlden, Henry A1 Eng, Celeste A1 Burchard, Esteban González A1 Kachuri, Linda A1 Singleton, Andrew B. A1 Fischer, Steffen A1 Bauer, Peter A1 Reed, Xylena A1 Ryten, Mina A1 Beetz, Christian A1 Ward, Michael A1 Okubadejo, Njideka U. A1 Blauwendraat, Cornelis YR 2024 UL http://medrxiv.org/content/early/2024/02/24/2024.02.20.24302827.abstract AB Recently, a novel African ancestry specific Parkinson’s disease (PD) risk signal was identified at the gene encoding glucocerebrosidase (GBA1). This variant (rs3115534-G) is carried by ∼50% of West African PD cases and imparts a dose-dependent increase in risk for disease. The risk variant has varied frequencies across African ancestry groups, but is almost absent in European and Asian ancestry populations. GBA1 is a gene of high clinical and therapeutic interest. Damaging bi-allelic protein-coding variants cause Gaucher disease and mono-allelic variants confer risk for PD and Dementia with Lewy Bodies, likely by reducing the function of glucocerebrosidase. Interestingly, the novel African ancestry specific GBA1 risk variant is a non-coding variant, suggesting a different mechanism of action.
Using full-length RNA transcript sequencing, we identified intron 8 expression in risk variant carriers (G) but not in non-variant carriers (T). Antibodies targeting the N-terminus of glucocerebrosidase showed that this intron-retained isoform is likely not protein coding, and subsequent proteomics did not identify a shorter protein isoform, suggesting the disease mechanism is RNA-based. CRISPR editing of the reported index variant (rs3115534) revealed that this is the responsible sequence alteration driving production of these intron 8-containing transcripts. Follow-up analysis of this variant showed that it is in a key intronic branchpoint sequence and therefore has important implications in splicing and disease. In addition, when measuring glucocerebrosidase activity, we identified a dose-dependent reduction in risk variant carriers (G). Overall, we report the functional effect of a GBA1 non-coding risk variant, which acts by interfering with the splicing of functional GBA1 transcripts, resulting in reduced protein levels and reduced glucocerebrosidase activity. This understanding reveals a novel therapeutic target in an underserved and underrepresented population. Competing Interest Statement: M.A.N.'s participation in this project was part of a competitive contract awarded to Data Tecnica International LLC by the National Institutes of Health to support open science research. M.A.N. also currently serves on the scientific advisory board for Character Bio Inc. and is a scientific founder at Neuron23 Inc. M.R., S.F., C.B., and P.B. are employees of Centogene GmbH. Funding Statement: This work was supported in part by the Intramural Research Program of the National Institutes of Health including: the Center for Alzheimer's and Related Dementias within the Intramural Research Program of the National Institute on Aging and the National Institute of Neurological Disorders and Stroke (1ZIAAG000538-03, ZIAAG000542-01, and 1ZIAAG000543-01).
GP2 is funded by the Aligning Science Across Parkinson's (ASAP) initiative and implemented by The Michael J. Fox Foundation for Parkinson's Research (https://gp2.org). This research was funded in part by Aligning Science Across Parkinson's MJFF-024547 through the Michael J. Fox Foundation for Parkinson's Research (MJFF). The generation of molecular data for the TOPMed program was supported by the National Heart, Lung, and Blood Institute (NHLBI). RNA-seq for the NHLBI TOPMed Genes-Environments and Admixture in Latino Asthmatics Study (GALA II; phs000920) and Study of African Americans, Asthma, Genes and Environments (SAGE; phs000921) was performed at the Broad Institute Genomics Platform (HHSN268201600034I). WGS for the same studies was performed at the NYGC (3R01HL117004-02S3) and NWGC (HHSN268201600032I). Core support, including centralized genomic read mapping and genotype calling along with variant quality metrics and filtering, was provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1; contract HHSN268201800002I). Core support, including phenotype harmonization, data management, sample identity quality control, and general program coordination, was provided by the TOPMed Data Coordinating Center (R01HL-120393 and U01HL-120393; contract HHSN268201800001I). WGS as part of GALA II was performed by the NYGC under a grant from the Centers for Common Disease Genomics of the GSP (UM1 HG008901). The GSP Coordinating Center (U24 HG008956) contributed to cross-program scientific initiatives and provided logistical and general study coordination. The GSP is funded by the National Human Genome Research Institute, NHLBI, and National Eye Institute. This work and E.G.B. were supported in part by the Sandler Family Foundation, American Asthma Foundation, Robert Wood Johnson Foundation Amos Medical Faculty Development Program, Harry Wm. and Diana V.
Hind Distinguished Professor in Pharmaceutical Sciences II, NHLBI (R01HL117004, R01HL135156, X01HL134589, and U01HL138626), National Institute of Environmental Health Sciences (R01ES015794), National Institute on Minority Health and Health Disparities (R56MD013312 and P60MD006902), Tobacco-Related Disease Research Program (24RT-0025 and 27IR-0030), and National Human Genome Research Institute (U01HG009080). This research was funded in part by Aligning Science Across Parkinson's [ASAP 000478]. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: All generated LCL Coriell ONT DNAseq, CAGEseq and RNAseq data (ILM and ONT) is available at https://www.amp-pd.org/ via GP2 tier 2 access. AMP-PD ILM blood-based RNAseq is available at https://www.amp-pd.org/ after signing the data use agreement. 1000 Genomes project data is publicly available at https://www.internationalgenome.org/. All generated brain tissue bulk RNAseq data (ONT) is currently being submitted to the NIMH data sharing platform at https://nda.nih.gov/. Summary statistics for cis-eQTLs and a catalog of ancestry-specific eQTLs from Kachuri et al. [12] were obtained from https://doi.org/10.5281/zenodo.7735723. For additional details see "Ethics_Statement.docx" in the supplementary materials. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes.