High-throughput PRPF31 variant characterisation pipeline consistent with ACMG/AMP clinical variant interpretation guidelines

Mutations in PRPF31 are the second most common cause of the degenerative retinal condition autosomal dominant retinitis pigmentosa. Difficulty in characterising missense variants in this gene presents a significant challenge in providing an accurate diagnosis for patients, which is needed to enable targeted testing of other family members, aid family planning, allow pre-implantation diagnosis and inform eligibility for gene therapy trials. With PRPF31 gene therapy in development, there is an urgent need for tools for accurate molecular diagnosis. Here we present a high-throughput, high-content imaging assay providing a quantitative measure of the effect of missense variants in PRPF31, which meets the recently published criteria for a baseline standard in vitro test for clinical variant interpretation. This assay utilizes a new and well-characterized PRPF31+/- human retinal cell line generated using CRISPR gene editing, which allows testing of PRPF31 variants that may cause disease through haploinsufficiency, dominant negative effects, or a combination of both. The mutant cells have significantly fewer cilia than wild-type cells, allowing rescue of ciliogenesis with benign or mild variants, but do not totally lack cilia, so dominant negative effects can be observed. The results of the assay provide BS3_supporting evidence for the benign classification of two novel uncharacterized PRPF31 variants and suggest that one novel uncharacterized PRPF31 variant may be pathogenic. We hope that this will be a useful tool for clinical characterisation of PRPF31 variants of unknown significance, and that it can be extended to variant classification in other ciliopathies.


1 Introduction
Ciliopathies are a broad range of inherited developmental and degenerative diseases associated with structural or functional defects in motile or primary non-motile cilia (Oud et al. 2017). Motile ciliopathies, such as primary ciliary dyskinesia, commonly present with severe respiratory problems and situs defects. Primary non-motile ciliopathies include both syndromic multi-organ conditions, such as Joubert syndrome and Alström syndrome, and single-organ disorders such as polycystic kidney disease and some forms of retinitis pigmentosa and Leber congenital amaurosis which only affect the retina. Common clinical features of these non-motile ciliopathies include retinal degeneration and kidney disease; around one third of all cases of retinal dystrophy can be considered retinal ciliopathies, arising as a result of defects in the photoreceptor cilium. Whilst individually rare, collectively, ciliopathies are estimated to affect ~1:1000 people in the general population worldwide, affecting ~67,500 people in the UK (Wheway et al. 2019a). However, this is likely to be an underestimate, as ciliopathies are likely to be under-diagnosed.

Ciliopathies are genetic, mostly autosomal recessive, conditions. There are ~200 known ciliopathy disease genes and it is expected that there are many more unidentified. Genetic testing can provide an accurate diagnosis, but 24-60% of ciliopathy patients who undergo genetic testing do not receive a genetic diagnosis. This is at least in part due to the fact that, following current guidelines from the American College of Medical Genetics (Richards et al. 2015), it is difficult to provide a confident clinical diagnosis of disease caused by missense or non-coding variants, which account for more than one third of cases of disease.
It is estimated that around 10% of ciliopathy patients in the UK have plausibly pathogenic missense mutations in known disease genes which cannot be classified as pathogenic following current ACMG guidelines because they lack sufficient supporting evidence (e.g. segregation, recurrence, splicing etc.).

The difficulty in classification stems from the requirement for labour-intensive functional studies, and the lack of clarity in ACMG guidelines as to what constitutes a valid functional assay. Variant Curation Expert Panels (VCEPs) have developed guidelines for valid functional assays for specific conditions, but these vary widely, from in vitro assays and splicing assays to animal model studies. A recent publication outlines general guidelines for assessing whether in vitro assays meet the baseline standard for clinical variant interpretation, stating the following criteria:
1. The disease mechanism must be understood
2. Assays must be applicable to this disease and this disease mechanism
3. Normal/negative/wild-type AND abnormal/positive/null controls must be used AND multiple replicates must be used
4. Variant controls must be known benign and known pathogenic
5. Statistical analyses must be applied to calculate the level of evidence for each variant

To facilitate standardized application of levels of evidence, Brnich et al. 2019 provide tables for calculating odds of pathogenicity values (OddsPath), with each OddsPath equating to a corresponding level of evidence strength (supporting, moderate, strong, very strong) in keeping with the ACMG/AMP variant interpretation guidelines (Richards et al. 2015). This provides a useful framework for developing variant analysis pipelines, but the work involved in optimizing and carrying out such robust in vitro assays is often beyond the scope of diagnostic labs, which do not possess the time or resources to carry out such assays for all but the most common disease genes. It is important for academic research laboratories to work with clinical diagnostic laboratories to develop robust, reliable variant analysis pipelines which meet these criteria. This is particularly important as increasing volumes of variants of unknown clinical significance are produced by genome sequencing, which is being integrated into the UK National Health Service as a standard clinical service (Wheway and

medRxiv preprint doi: https://doi.org/10.1101/2020.04.06.20055020. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

rMATS v4.0.2 (rMATS turbo) (Shen et al. 2014) was used to statistically measure differences in splicing between replicates of wild-type and mutant sequence.
BAM files aligned with the STAR v2.6.0a two-pass method, with soft clipping suppressed, were used as input.
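The OddsPath framework referred to in the Introduction can be illustrated with a short sketch. The formula and the evidence thresholds below reflect one reading of the published framework (Brnich et al. 2019) and should be checked against that paper before any clinical use. The function names and the example proportions are hypothetical and are not taken from this study.

```python
def odds_path(prior: float, posterior: float) -> float:
    """Odds of pathogenicity from two proportions.

    prior     (P1): proportion of pathogenic variants among all control variants.
    posterior (P2): proportion of pathogenic variants among controls with a
                    given (e.g. abnormal) assay readout.
    Assumed formula: OddsPath = [P2 * (1 - P1)] / [(1 - P2) * P1].
    """
    if not (0 < prior < 1 and 0 < posterior < 1):
        raise ValueError("proportions must lie strictly between 0 and 1")
    return (posterior * (1 - prior)) / ((1 - posterior) * prior)


def evidence_strength(odds: float) -> str:
    """Map an OddsPath value to an ACMG/AMP evidence label.

    Thresholds are assumptions taken from the published tables; verify
    them against the original source before relying on this mapping.
    """
    if odds > 350:
        return "PS3_very_strong"
    if odds > 18.7:
        return "PS3"
    if odds > 4.3:
        return "PS3_moderate"
    if odds > 2.1:
        return "PS3_supporting"
    if odds < 0.053:
        return "BS3"
    if odds < 0.23:
        return "BS3_moderate"
    if odds < 0.48:
        return "BS3_supporting"
    return "indeterminate"


# Hypothetical example: half of the control variants are pathogenic, and
# 30% of the controls with this readout are pathogenic.
example_odds = odds_path(prior=0.5, posterior=0.3)
print(round(example_odds, 3), evidence_strength(example_odds))
```

With few control variants, the achievable OddsPath is limited, which is why an assay with a small control set can typically reach only supporting-level evidence.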

Protein extraction
Total protein was extracted from cells using 1% NP40 lysis buffer and scraping. Insoluble material was pelleted by centrifugation at 10,000 x g. Cell fractionation was carried out by scraping cells into fractionation buffer containing 1 mM DTT and passing them through a syringe 10 times. Nuclei were pelleted at 720 x g for 5 minutes and separated from the cytoplasmic supernatant. Insoluble cytoplasmic material was pelleted by centrifugation at 10,000 x g for 5 minutes. Nuclei were washed and lysed with 0.1% SDS and sonication. Insoluble nuclear material was pelleted by centrifugation at 10,000 x g for 5 minutes.

SDS-PAGE and western blotting
20 µg of total protein per sample with 2 x SDS loading buffer was loaded onto pre-cast 4-12% Bis-Tris gels (Life Technologies) alongside Spectra Multicolor Broad Range Protein Ladder (Thermo Fisher). Samples were separated by electrophoresis. Protein was transferred to PVDF membrane. Membranes were incubated with blocking solution (5% (w/v) non-fat milk/PBS), and incubated with primary antibody overnight at 4 °C. After washing, membranes were incubated with secondary antibody for 1 hour at room temperature and exposed using a 680 nm and/or 780 nm laser (LiCor Odyssey

and can be considered probably pathogenic (Table 1). Thirteen others potentially affect splicing and can be considered possibly pathogenic (Table 1). Functional studies would be needed to clarify the effects of these, as ACMG guidelines only allow in silico splice tools as supporting evidence of pathogenicity (Richards et al. 2015), but this is beyond the scope of this particular investigation. To avoid investigating any variants which potentially affect splicing, we excluded all variants predicted by

Of the thirteen which are not predicted to affect splicing, ten are either within introns, or are exonic but do not result in amino acid changes, and can therefore be considered probably benign (Table 1).

In silico predictions of functional effect of missense variants
We used Ensembl Variant Effect Predictor (VEP) to annotate missense variants using SIFT and PolyPhen-2. All previously reported pathogenic variants were predicted deleterious/probably damaging. Of the 15 variants in PRPF31 labelled 'uncertain significance' in ClinVar which are in the exons, 5 were predicted tolerated/benign, 6 were predicted deleterious/probably damaging, 4 were predicted deleterious/possibly damaging and 1 had conflicting predictions (deleterious/benign).
Of these 15 variants, only 3 were predicted not to affect splicing, and so we concentrated on characterizing these variants. These were Thr50Ile (deleterious/benign), Met212Val (deleterious/probably damaging) and Val433Ile (tolerated/benign).
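The way paired SIFT/PolyPhen-2 calls are grouped above (concordant versus conflicting) can be sketched as follows. The prediction labels for the three variants are those reported in the text; the function name and the grouping rule are illustrative assumptions, not part of VEP.

```python
def concordance(sift: str, polyphen: str) -> str:
    """Classify a SIFT/PolyPhen-2 prediction pair as concordant or conflicting.

    Assumed rule: SIFT 'deleterious' and PolyPhen-2 'probably damaging' or
    'possibly damaging' both count as a damaging call.
    """
    sift_damaging = sift == "deleterious"
    polyphen_damaging = polyphen in {"probably damaging", "possibly damaging"}
    if sift_damaging and polyphen_damaging:
        return "concordant damaging"
    if not sift_damaging and not polyphen_damaging:
        return "concordant benign"
    return "conflicting"


# SIFT / PolyPhen-2 predictions for the three variants taken forward.
predictions = {
    "Thr50Ile": ("deleterious", "benign"),
    "Met212Val": ("deleterious", "probably damaging"),
    "Val433Ile": ("tolerated", "benign"),
}

summary = {variant: concordance(*calls) for variant, calls in predictions.items()}
for variant, label in summary.items():
    print(f"{variant}: {label}")
```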

3D structural analysis of missense variants in PRPF31
This preprint is made available under a CC-BY 4.0 International license.

(Figure 1b) and Thr50Ile affects H bonding between PRPF31 and PRPF6 (Figure 1c), and we predict that these variants affect protein folding and solubility, and are pathogenic. Val433Ile does not affect H bonding within PRPF31 or between PRPF31 and U4, nor does it affect polar contacts (Figure 1d), so we predict this variant to be benign.
We took these variants, with one negative control and four positive control variants, three of which have been proven to affect PRPF31 protein function in in vitro assays, forward for in vitro analysis.

Production and characterisation of
lines. We achieved this using purified wild-type Cas9 and four single guide RNAs targeting intron 5 and exon 6 (coding exon 5) of PRPF31 which were modified to increase stability (Figure 2a). We achieved up to 85% indel frequency, with up to 72% overall knockout efficiency. From the pool of edited cells from sgRNA1 we used single cell sorting to isolate clones of PRPF31+/- cells with heterozygous knockouts and wild-type unedited clones. We took three of each on for further analysis. We confirmed insertion of T at the intron 5/exon 6 boundary of PRPF31 which causes a frameshift and premature termination codon (Figure 2b). We performed whole transcriptome sequencing on RNA from the nucleus (a mixture of completely and incompletely spliced transcripts) and cytoplasm (only completely spliced transcripts) from all 6 clones (SRA accession PRJNA622794). We analysed predicted off-target changes in each clone through manual analysis of target regions in our RNAseq data in IGV, via analysis of differential gene expression using the edgeR package (Robinson et al.
2014) (Supplementary Table 2). We found no evidence of sequence changes or expression changes in any of the genes predicted to be off-target sites (with 3 mismatches) but found statistically significant differential usage of 3 exons in MEGF6 between wild-type and mutant clones. Exons 25 (ENSE00001477187) and 24 (ENSE00001477188) of ENST00000356575.9, and exon 27 (ENSE00001308186) of ENST00000294599.8 are each significantly skipped in mutants (FDR p value = 0.0279, 0.0343 and 0.0086 respectively) (Supplementary Table 2). MEGF6 is a poorly characterised protein which has not been linked to cilia, and we do not expect this change to affect our cell phenotype, but it is important to note this splicing variation in a gene which could potentially be an off-target effect of our CRISPR guide RNAs. Analysis of splicing patterns of PRPF31 showed no significant change in splicing of intron 5 or exon 6 in the mutant clones compared to wild-type (no differential 3' splice site usage, skipping of exon 6 or retention of intron 5-6).
However, we did unexpectedly observe an increase in retention of intron 12-13 in the nuclear fraction of the mutant cells (FDR p value = 0.0141 when considering only reads mapping to splice junctions, or FDR p value = 0.0091 when also considering reads mapping to the intron), although this was not observed in the cytoplasmic fraction of the cells (Supplementary Figure 1). We hypothesise that PRPF31 mutant cells may experience changes in the dynamics of splicing, with less efficient removal of introns before export from the nucleus.
Transcript level expression analysis of RNA sequence data showed expression of three PRPF31 transcripts in both mutant and wild-type cell lines: ENST00000419967.5, ENST00000391755.1 and protein-coding ENST00000321030.8, with an approximately 50% reduction in all PRPF31 transcripts in the mutant clones (Figure 2c). Analysis of reads around the CRISPR insertion site (i.e. at the intron 5/exon 6 boundary) in the mutant clones showed that very few reads contained the insertion. In nuclear RNA from the mutant clones, the ratio of wild-type reads to reads with the insertion was 46:2 (4.2% insertion), 92:11 (10.7% insertion) and 48:0 (0% insertion). Roughly the same proportions of reads with the insertion were seen in the cytoplasmic RNA from mutant clones (70:2, 61:4 and 53:2, i.e. 2.8%, 6.2% and 3.6%). This suggests that PRPF31 is preferentially expressed from the wild-type allele in the mutant cells, and that both wild-type and mutant transcripts are exported to the cytoplasm. It also suggests that in this cell model the disease phenotypes (see later) are caused by haploinsufficiency. Indeed, western blotting of protein extracts from wild-type and mutant clones confirmed reduction in PRPF31 protein levels in mutant cells compared to wild-type control cells with no detectable expression of any mutant protein (Figure 2d).
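The insertion percentages quoted above follow directly from the read counts, as the short sketch below shows. The read counts are those reported in the text. The binomial tail probability is an added illustration that assumes independent reads and, under the null, equal representation of both alleles; it is not an analysis reported in this study, and the function names are hypothetical.

```python
from math import comb


def insertion_percent(wild_type_reads: int, insertion_reads: int) -> float:
    """Percentage of reads at the edited site that carry the insertion."""
    total = wild_type_reads + insertion_reads
    return 100 * insertion_reads / total


def binomial_lower_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(X <= k) for X ~ Binomial(n, p): chance of seeing k or fewer
    insertion reads out of n if both alleles were equally represented."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# (wild-type reads, insertion reads) per mutant clone, from the text.
nuclear = [(46, 2), (92, 11), (48, 0)]
cytoplasmic = [(70, 2), (61, 4), (53, 2)]

nuclear_pct = [round(insertion_percent(wt, ins), 1) for wt, ins in nuclear]
cytoplasmic_pct = [round(insertion_percent(wt, ins), 1) for wt, ins in cytoplasmic]
print(nuclear_pct)
print(cytoplasmic_pct)

# Illustrative check for the first nuclear clone: 2 insertion reads out of 48.
print(binomial_lower_tail(2, 48))
```

Under these assumptions, so few insertion reads would be very unlikely if the two alleles contributed equally, which is consistent with the interpretation that the mutant transcript is under-represented.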
As has been previously reported, mutation of PRPF31 is associated with reduction in the number and

way, we used high-throughput imaging and automated image analysis to quantify the number of cilia in mutant cells compared to wild-type cells (Figure 3a). We also assayed a range of other phenotypes which have been reported in PRPF31 mutants, including cell number, number of micronuclei per cell, nuclear area, nuclear shape (compactness, eccentricity) and nuclear staining intensity. Whilst these assays showed a general trend towards reduced cell number, increased number of micronuclei per cell and reduced nuclear area in mutant clones compared to wild-type clones, the most robust and consistent phenotype we observed was the loss of cilia phenotype in PRPF31+/- cells (Figure 3b).

particular missense mutation has a dominant negative effect on cells (Figure 4a).
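The cilia readout used in this assay (percentage of cells with a single cilium) and the notions of rescue and dominant negative effect can be expressed in a minimal sketch. The per-cell counts, the function names and the "rescue index" normalisation are hypothetical illustrations and are not the image analysis pipeline or statistics used in this study.

```python
def percent_single_cilium(cilia_per_cell: list[int]) -> float:
    """Percentage of cells in a well that have exactly one cilium."""
    if not cilia_per_cell:
        raise ValueError("no cells were detected in this well")
    single = sum(1 for count in cilia_per_cell if count == 1)
    return 100 * single / len(cilia_per_cell)


def rescue_index(variant_pct: float, mutant_pct: float, wild_type_pct: float) -> float:
    """Position of a variant between the mutant and wild-type baselines.

    0 means no rescue (same as untransfected mutant cells), 1 means full
    rescue to the wild-type level, and a negative value means fewer
    ciliated cells than the mutant baseline, as expected for a dominant
    negative effect. This normalisation is an illustrative assumption.
    """
    span = wild_type_pct - mutant_pct
    if span <= 0:
        raise ValueError("wild-type baseline must exceed mutant baseline")
    return (variant_pct - mutant_pct) / span


# Hypothetical well: cilia counts for eight segmented cells.
well = [1, 0, 1, 1, 0, 2, 1, 0]
print(percent_single_cilium(well))

# Hypothetical baselines: 20% in mutant cells, 60% in wild-type cells.
print(rescue_index(variant_pct=40.0, mutant_pct=20.0, wild_type_pct=60.0))
print(rescue_index(variant_pct=10.0, mutant_pct=20.0, wild_type_pct=60.0))
```

Because the mutant cells retain some cilia, a readout of this kind can fall below the mutant baseline as well as rise above it, which is what allows dominant negative effects to be distinguished from a simple failure to rescue.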

Characterisation of PRPF31 missense variants using high-throughput imaging
Of the novel missense variants being tested, PRPF31 c.149C>T p.Thr50Ile failed to fully rescue the defective cilia phenotype, suggesting that this variant is pathogenic (Figure 4a). PRPF31 c.634A>G p.Met212Val fully rescued the defective cilia phenotype, suggesting that this missense variant is benign (Figure 4a). PRPF31 c.1297G>A p.Val433Ile partially rescued the loss of cilia phenotype, suggesting that this may be a mildly pathogenic variant, similar to c.413C>A p.Thr138Lys, although we would more confidently classify it as benign (Figure 4a). A study of cell number showed that two of the missense variants which showed the most severe impact on cilia (PRPF31 c.149C>T p.Thr50Ile and c.581C>A p.Ala194Glu) also caused a reduction in cell number, although c.341T>A p.Ile114Asn, which is also a severe variant, did not (Figure 4b). Overall, there was no clear correlation between severity of effect on cilia phenotype and effect on cell number.
To confirm expression of each construct we extracted protein from transfected cells and analysed expression levels by western blotting. This showed that all constructs were expressed but, as previously reported (Wheway et al. 2019), some missense mutated forms of PRPF31 (c.341T>A p.Ile114Asn, c.581C>A p.Ala194Glu) were associated with reduced stability and solubility of the protein, appearing as lower levels in the soluble fraction of cell extracts (Figure 4c). The variants with the least soluble expression tended to be those with the most severe effect on cilia phenotype (Figure 4c). 3D structural analysis predicted that c.149C>T p.Thr50Ile would interfere with binding to PRPF6. We did see a small decrease in total level of PRPF6 in cells transfected with this construct (Figure 4c)

Here we present a high-throughput, high-content imaging assay providing a quantitative measure of the effect of missense variants in the second most common cause of autosomal dominant RP, PRPF31.
Our screening assay meets the criteria for a baseline standard in vitro test for clinical variant interpretation because the disease mechanism is understood (combined haploinsufficiency/dominant negative effects), the assay is applicable to this disease and this disease mechanism, normal/negative/wild-type (1) and abnormal/positive/null controls (4)

The imaging-based screen uses a simple and robust image analysis algorithm to test a consistent cellular phenotype observed in PRPF31 mutant cells: reduction in the number of cells with a single cilium. The assay provides a continuous data readout in the form of percentage of cells with a single cilium, which has the potential to provide not just a simple binary readout of pathogenic/benign but a measure of the extent of pathogenicity of each variant. The findings of this assay and other such assays can also provide novel insights into disease mechanism and prognosis.

effect on cilia will be associated with the earliest onset and worst prognosis.
Finally, our findings correlate to some extent with in silico predictions, although not perfectly. PolyPhen-2 predicted c.149C>T p.Thr50Ile to be benign, whereas our in vitro assay suggests this is a pathogenic variant. In this case, 3D structural analysis was additionally useful in predicting
sequence of potential off-target mapping sites, with 3 mismatches allowed (mismatches shown in lower case), chromosomal location of these potential off-target sites, whether they are on the + or - strand, number of mismatches, gene name and feature targeted.

Supplementary Table 2
Summary of analysis of potential off-target CRISPR cut sites, showing findings observed in IGV, through differential gene expression analysis by edgeR, and differential splicing analysis by rMATS, including alternative 3' splice site usage (A3SS), alternative 5' splice site usage (A5SS), mutually exclusive exons (MXE), retained introns (RI) and skipped exons (SE).
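The FDR-adjusted p values reported for the differential expression and splicing analyses are the kind of output produced by a multiple-testing correction. A minimal sketch of the Benjamini-Hochberg procedure is given below, on the assumption that this is the correction applied by the tools used; the function name and the example p values are hypothetical and do not reproduce any result from this study.

```python
def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p values for four splicing events.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

An event is then reported as significant when its adjusted value falls below the chosen FDR threshold, for example 0.05.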