Exemplar Scoring Identifies Genetically Separable Phenotypes of Lithium Responsive Bipolar Disorder =================================================================================================== * Abraham Nunes * William Stone * Raffaella Ardau * Anne Berghöfer * Alberto Bocchetta * Caterina Chillotti * Valeria Deiana * Franziska Degenhardt * Andreas J. Forstner * Julie S. Menzies * Eva Grof * Tomas Hajek * Mirko Manchia * Francis McMahon * Bruno Müller-Oerlinghausen * Markus M. Nöthen * Marco Pinna * Claudia Pisanu * Claire O’Donovan * Marcella D.C. Rietschel * Guy Rouleau * Thomas Schulze * Giovanni Severino * Claire M Slaney * Alessio Squassina * Aleksandra Suwalska * Gustavo Turecki * Petr Zvolsky * Pablo Cervantes * Maria del Zompo * Paul Grof * Janusz Rybakowski * Leonardo Tondo * Thomas Trappenberg * Martin Alda ## Abstract Predicting lithium response (LiR) in bipolar disorder (BD) could expedite effective pharmacotherapy, but phenotypic heterogeneity of bipolar disorder has complicated the search for genomic markers. We thus sought to determine whether patients with “exemplary phenotypes”—those whose clinical features are reliably predictive of LiR and non-response (LiNR)—are more genetically separable than those with less exemplary phenotypes. We applied machine learning methods to clinical data collected from people with BD (n=1266 across 7 international centres; 34.7% responders) to compute an “exemplar score,” which identified a subset of subjects whose clinical phenotypes were most robustly predictive of LiR/LiNR. For subjects whose genotypes were available (n=321), we evaluated whether responders/non-responders with exemplary phenotypes could be more accurately classified based on genetic data than those with non-exemplary phenotypes. We showed that the best LiR exemplars had later illness onset, completely episodic clinical course, absence of rapid cycling and psychosis, and few psychiatric comorbidities. The best exemplars of LiR and LiNR were genetically separable with an area under the receiver operating characteristic curve of 0.88 (IQR [0.83, 0.98]), compared to 0.66 [0.61, 0.80] (p=0.0032) among the poor exemplars. Variants in the Alzheimer’s amyloid secretase pathway, along with G-protein coupled receptor, muscarinic acetylcholine, and histamine H1R signaling pathways were particularly informative predictors. In sum, the most reliably predictive clinical features of LiR and LiNR patients correspond to previously well-characterized phenotypic spectra whose genomic profiles are relatively distinct. Future work must enlarge the sample for genomic classification and include prediction of response to other mood stabilizers. Keywords * Bipolar disorder * lithium response * machine learning * genomics * representational Rényi heterogeneity ## 1. Introduction Bipolar disorder (BD) is a severe lifelong illness characterized by recurrent manias, depressions, and a relatively high suicide risk [1,2]. Mood stabilizer initiation occurs approximately a decade after symptom onset, on average [3], and the trial-and-error process of pharmacological optimization for BD may lengthen this time. However, by predicting individuals’ mood-stabilizer response, this burden of untreated illness may be reduced. Clinical data are currently the best lithium response predictors. Responders often have a completely episodic course with full inter-episode remissions, absence of rapid cycling, and family history of fully remitting BD (particularly the lithium responsive type) in a first degree relative [4,5]. This has motivated the search for strong genomic predictors of lithium response, but they remain elusive [6]. In large multi-site studies, lithium responder and non-responder groups may be too heterogeneous to classify robustly. However, it is possible that within this pooled group of heterogeneous subjects there exist more distinct “exemplars” of each phenotype, whose clinical profiles are consistent across sites, and who may be genomically more distinct. Our paper is thus motivated by two questions. First, can clinical presentation identify exemplars of lithium response and non-response? Second, are clinical exemplars of lithium response and non-response more genetically separable than their less exemplary counterparts? Using the largest clinical database on lithium treatment in BD, we developed a method for rating the degree to which a subject is an exemplar of lithium response or non-response, respectively (an exemplar score). We hypothesized that the clinical differences between the best exemplars of lithium response and non-response would be reflective of factors previously associated with the “classical” bipolar phenotype. Finally, on a subset of subjects who were genotyped, we hypothesized that clinically exemplary responders and non-responders would be more accurately separable by application of a machine learning (ML) classifier to their genomic data (compared to their counterparts with low exemplar scores). ## 2. Methods Clinical and genetic data were collected in the context of protocols approved by the Ethics Committee of the former Health Agency of Cagliari (now University Hospital Health Agency of Cagliari) for the Cagliari (University) and Centro Bini samples, and the research ethics boards of the Nova Scotia Health Authority, the McGill University Health Centre, the Royal Ottawa Hospital, and the University of Poznan. Our analysis is split into two parts. In Part 1, we use a multi-centre database of clinical variables in order to derive a score that identifies subjects whose clinical phenotypes reliably predict lithium response/non-response. Part 2 uses a separate set of genomic data collected from a subset of subjects included in the clinical data from Part 1. In Part 2, we compare the ability to classify lithium response using those genetic data when they are stratified according to subjects’ *clinical* exemplar scores. ### 2.1. Part 1: Scoring and Characterization of Clinical Exemplars #### 2.1.1. Data Collection Clinical data collection procedures were described in Nunes et al. [7]. Data consisted of 180 variables recorded prior to instituting lithium maintenance therapy in 1266 people with BD across 7 sites internationally (Table 1). Response was evaluated after a minimum treatment duration of 1 year. Lithium response was defined as a score of ≥ 7 on the previously validated Alda scale [8]. View this table: [Table 1.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/T1) Table 1. Description of constituent datasets. *Abbreviations:* number of patients (N), lithium responders (LR+), Cagliari (Centro Bini; CB), Cagliari (University; CU), International Group for the Study of Lithium (IGSLi), Maritimes (MAR), Ontario (ON), Poznan (POZ). #### 2.1.2. Exemplar Scoring Based on Clinical Predictors Subjects who are most exemplary of their clinical phenotype should be classified accurately by models trained on data from any given site. Our overall exemplar scoring protocol thus involves (1) obtaining out-of-sample predictions of every subject’s class based on models trained on each individual site’s data, then (2) summarizing accuracy and level of agreement with which each subject was classified into a single value known as the exemplar score (Figure 1). ![Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/05/22/2020.05.19.20098566/F1.medium.gif) [Figure 1.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/F1) Figure 1. Hypothetical illustration of the clinical exemplar scoring analysis. Note that this part of the analysis is performed using the clinical feature dataset alone. **Panel A**: Demonstration of heterogeneity in the relationship between lithium responsiveness (depicted as “Li(+)” for responders and “Li(-)” for non-responders) and clinical features across four hypothetical sites. A classifier trained on data from each individual site may yield different discriminative functions. **Panel B**: Points demonstrate the aggregated dataset (“+” and are responders and non-responders, respectively). Contours demonstrate regions of clinical feature space in which site-level classifiers (from Panel A) agree with high accuracy on the predicted class. An exemplar score can be computed for each subject in the clinical dataset by (1) holding his data out of the training set, (2) predicting his lithium responsiveness using site-level classifiers trained on the remaining subjects, then (3) using the site-wise prediction results to compute the exemplar score. **Panel C**: Stratification of the clinical dataset according to lithium responsiveness and exemplar score quartile. The “LRBest” and “NRBest” exemplars are those responders and non-responders with exemplar scores above the 75th percentile, respectively. The “LRPoor” and “NRPoor” exemplars are those responders and non-responders with exemplar scores below the 25th percentile, respectively. This stratification can be used to evaluate the clinical features that differentiate good from poor exemplars of lithium response and non-response, respectively. #### 2.1.3. The Clinical Exemplar Score Let ![Graphic][1] denote phenotypic data from subject *i ∊* {1, 2,…, *nj*}, where *xij* is a vector of clinical features, *yij* ∊ {0, 1} denotes whether the patient is a lithium responder, and *nj* is the number of patients in the sample from site *j* ∊ {1,2,…, *S*}. A pair (x, *y*) can thus be viewed as a set of coordinates on the (observable) phenotypic space ![Graphic][2]. Data are sampled from *S* sites, each of which can be considered to sample a subdomain of the phenotypic space ![Graphic][3]. These site-wise subdomains are not necessarily disjoint. Indeed, if they were disjoint, the sites’ data would share nothing in common. Now let ![Graphic][4] denote a classifier learned on training data from site *j*. Given a new set of clinical features, x′, the classifier predicts the probability that the corresponding patient is a lithium responder: that is, ![Graphic][5]. We denote the accuracy score of this prediction as ![Formula][6] The representational Rényi heterogeneity measurement approach [11] consists of measuring heterogeneity on a latent or transformed space onto which observable data are mapped. To apply this in the present case, where we have defined our observable space, ![Graphic][7], we must now devise an appropriate transformed space upon which the Rényi heterogeneity will be both meaningful and tractable. The heterogeneity deemed relevant in the present case arises in terms of differences in classification models across sites. Most starkly, we noted that the informative features for lithium response prediction varied between the best performing sites. In other words, depending on which site’s data are used for training, one might learn quite different (and perhaps even contradictory) relationships between clinical features and lithium responsiveness. In the limit where data from each site encodes completely different relationships between clinical features and lithium response, then each classifier ![Graphic][8] will behave distinctly (they will tend to disagree). In terms of numbers equivalent, we would say that in such a case there is an effective number of S distinct classifiers. Conversely, if the phenotypic domains of all sites overlap completely, then all classifiers ![Graphic][9] will tend to make similar predictions, which would correspond to an effective number of one classifier. Let the accuracy of classifier ![Graphic][10] in predicting the relationship x → *y* be a measure of that model’s informativeness at point (x, *y*). We can thus define ![Graphic][11] as a categorical space representing an index on “the most informative classifier.” We illustrate the mapping ![Graphic][12] in Figure 2. A probability distribution over ![Graphic][13] can be computed using a normalization of Equation 1: ![Formula][14] ![Figure 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/05/22/2020.05.19.20098566/F2.medium.gif) [Figure 2.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/F2) Figure 2. Representation of the mapping from phenotypic space ![Graphic][15] onto the representation of “most informative site-level model” (![Graphic][16]). The transformation function is the normalized accuracy score for a classification model trained on each site’s data individually (Equation 2). The quantity *fj* (x, *y*) can be taken to represent the probability that a classifier trained on data from site *j* is the most informative about the x → *y* mapping in that particular region of ![Graphic][17]. With this, we can compute the representational Rényi heterogeneity at (x, *y*) as follows: ![Formula][18] If the models ![Graphic][19] differ only in their training data (i.e. they have the same architecture, optimization routine, and hyperparameters) then the units of Equation 3 are “the effective number of informative sites.” Recall that we defined a “clinical exemplar” as a subject whose phenotype (x, *y*) is reliably predicted accurately across all sites. In other words, regardless of the differences between sites’ data, all sites would agree in their predictions of the exemplars’ phenotypes. More formally, clinical exemplars must have high values of Π*q* (x, *y*) (all sites are similarly informative). However, to identify more specifically the exemplars of lithium response and non-response, we cannot solely rely on Π*q* (x, *y*), since that value may be high, despite sites’ prediction accuracies being low. Let ![Graphic][20] denote the maximal accuracy score obtained in classification at (x, *y*). We take this value to represent the degree to which a subject with that phenotype can be clearly associated with one class or another. An interesting case occurs where both *t** and Π*q* (x, *y*) are high, suggesting the point (x, *y*) is an exemplar of the regions of ![Graphic][21] that are reliably well classified across sites. Conversely, if *t** ≈ 0.5 and Π*q* (x, *y*) is high, then that point is exemplary of a region of ![Graphic][22] of which all sites are uncertain. When *t** is low and Π*q* (x, *y*) is high, then (x, *y*) is exemplary of a region of ![Graphic][23] that reliably misleads all sites’ classifiers. In the present study, we are concerned with identifying only those subjects with high values of both *t** and Π*q* (x, *y*), since they exemplify the most canonical “phenotypes” of lithium response and non-response, respectively. We accomplish this by combining t* and Π*q* (x, *y*) into a single index we call the *exemplar score*. The exemplar score at coordinate (x, *y*) of the phenotypic space is defined as ![Formula][24] where ![Graphic][25] is a standardization of the Rényi heterogeneity to the [0,1] interval (the same scale as *t**): ![Formula][26] In the present study, we define the “best exemplars” as subjects whose exemplar scores (within their lithium response classes) were in the top 25%. Poor exemplars were those subjects whose phenotypes were in the lower quartile of exemplar scores within their response classes. #### 2.1.4. The Predict Every Subject Out (PESO) Protocol The predict every subject out (PESO) protocol is a method by which we can compute exemplar scores for each subject in the dataset while (A) ensuring that subject is not included in the training data and (B) having each model train on only that site’s data. All classifiers in our data were random forests, (RFC) [12] under the same specifications as in Nunes et al. [7] (100 estimators; SciKit Learn implementation; [13]). Similar to that study, missing data were marginalized by sampling from uninformative priors on respective variables’ domains [7]. A schematic of the protocol is shown in Figure 3. For each site in the clinical predictors dataset, the PESO analysis protocol begins with a Leave-One-Out cross-validation run to obtain out-of-sample predictions for each of that site’s constituent subjects. We then train an RFC on that site’s data and predict lithium response in all other sites’ subjects. Each subject is thus mapped onto our categorical space ![Graphic][27]T, upon which we can measure their exemplar scores. ![Figure 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/05/22/2020.05.19.20098566/F3.medium.gif) [Figure 3.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/F3) Figure 3. Illustration of the algorithm for the predict every subject out protocol. #### 2.1.5. Comparison of Clinical Characteristics of the Best and Worst Exemplars Univariate clinical feature differences were compared between the best exemplars of lithium response and non-response (“LRBest” and “NRBest,” respectively; the upper exemplar score quartile per class), and the corresponding poor exemplars (“LRPoor” and “NRPoor,” respectively; the lower exemplar score quartile per class). Continuous variables were compared using the two-sample permutation test of independence and categorical variables were compared using the randomization chi-square test (with 10,000 replications owing to multiple comparison corrections). The significance threshold was adjusted for 116 comparisons: *αC =* 0.05/116 = 0.0004. ### 2.2. Part 2: Biological Validation hrough Genomic Classification Figure 4 illustrates Part 2 of the present study, wherein we compare the genetic prediction of lithium response between subjects whose clinical profiles are exemplary and non-exemplary, respectively. After comparing genomic classification performance between the “Best” and “Poor” exemplar strata, respectively, we submit the genomic classifiers’ coefficients to gene enrichment analysis. This part of our study uses genomic data from subjects in the Consortium on Lithium Genetics GWAS cohort [6] who also had detailed clinical information collected for Part 1 of the present study. ![Figure 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/05/22/2020.05.19.20098566/F4.medium.gif) [Figure 4.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/F4) Figure 4. Hypothetical illustration of Part 2 of this study’s analysis, which evaluates the degree to which stratification of genomic data by corresponding subjects’ *clinical* exemplar scores can improve genomic classification performance. **Panel A**: Subjects’ genotypes lie on a genotypic feature space (shown in Panel A1 as a simplified 2 dimensional plane). Panel A2 shows a hypothetical ROC curve for these aggregated data. **Panel B:** Each genotyped subject has an exemplar score computed from Part 1 of the present study. Recall that the exemplar score merely identifies the degree to which a subject’s clinical profile (i.e. symptoms, family history, comorbidities, etc.) is reliably predictive of lithium responsiveness. Panel B1 shows that the aggregated genotyped sample can then be stratified into the “Best” clinical exemplars (subjects with top 25% of clinical exemplar scores within each of the responder and non-responder groups, respectively), and the “Poor” clinical exemplars (those with the lowest 25% of clinical exemplar scores in each responsiveness class). We then apply classifiers to the genomic data in each of these “Best Exemplar” and “Poor Exemplar” strata, respectively, and compare classification performance (Panel B2). The hypothetical receiver operating characteristic curve in Panel B2 reflects our hypothesis, that genetic classification of lithium response will be superior among the subgroup of Best clinical exemplars. #### 2.2.1. Data Collection Genomic data, obtained as part of the ConLiGen GWAS [6], were available for 321 of the subjects whose clinical data were analyzed in Part 1 of our study. In the Supplementary Materials, we show that there was no population stratification in this subsample, particularly in comparison to the broader ConLiGen sample. We restricted the data to only the 47,465 SNPs for which complete data were available across all ConLiGen sites. Preprocessing and quality control were done according to the Hou et al. [6] protocol. #### 2.2.2. Genomic Classification Analysis For genotyped subjects, we compared the performance of a classifier applied to (A) all 321 subject’s genomic data, (B) the worst exemplars’ genomic data, and (C) the best exemplars’ genomic data. We employed L2-penalized logistic regression (C=1 set *a priori*). Model criticism was performed under stratified-10-fold cross-validation. Our primary outcome was the average cross-validated Matthews correlation coefficient (MCC), which is conservative under class imbalance. Classification performance differences were compared between conditions using the Kruskal-Wallis test. Where a statistically significant difference was observed (at *α* = 0.05), pairwise comparisons were done with the Mann-Whitney U tests (at threshold *αC* = 0.05/3 = 0.017). We secondarily report accuracy, area under the receiver operating characteristic curve (ROC-AUC), Cohen’s kappa, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). In the model trained on the best exemplars, we indexed variants whose logistic regression coefficients agreed in sign across all cross-validation folds, then applied a statistical enrichment test to the nearest associated genes using the PANTHER classification system v. 14.1 [14]. To evaluate the relationship between exemplar strata and enriched pathways, we repeated this analysis using logistic regression coefficients from the poor exemplar group. The threshold for statistical significance was set at *αFDR* = 0.05, where FDR indicates correction for false discovery rate. Further gene set analysis details are provided in Appendix B. ## 3. Results ### 3.1. Part 1: Scoring and Characterization of Clinical Exemplars #### 3.1.1. Accuracy Distributions in the Predict Every Subject Out Analysis A classifier trained on data from the Maritimes site achieved the highest mean overall accuracy (0.59, 95% confidence interval, CI, [0.58, 0.6]; Figure 5), which appeared largely driven by that site’s ability to accurately classify its own subjects (0.69 [0.66, 0.71]), and those from Montreal (0.71 [0.67, 0.75]). However, Figure 5 shows that site-level models’ accuracy distributions were highly variable in shape and modality, suggesting heterogeneous classification behaviour between sites. ![Figure 5.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/05/22/2020.05.19.20098566/F5.medium.gif) [Figure 5.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/F5) Figure 5. Accuracy distributions for models evaluated under the predict every subject out (PESO) regime. The violin plot at the upper leftmost corner shows the accuracy distributions for each site model evaluated over all subjects in the dataset, with the densities colored according to the proportion of lithium responders in the training site’s data. The remaining subplots show accuracy histograms for training site models (specified in the titles) stratified across out-of-sample sites. For the site-wise histograms, color indicates the responder/non-responder balance in the respective validation site. *Abbreviations*: Lithium responder (LR+), Cagliari (Centro Bini; CB), Cagliari (University; CU), International Group for the Study of Lithium (IGSLi), Maritimes (MAR), Ontario (ON), Poznan (POZ). #### 3.1.2. Characteristics of the Best and Poor Exemplars Within the clinical dataset of Part 1, there were 110 individuals in LRBest and LRPoor groups, and 207 individuals in the NRBest and NRPoor groups (Table 2). The LRBest group came predominantly from IGSLi (53.6%) and Ontario (21.8%), and most NRBest subjects were from Maritimes (72.5%) and Montreal (25.1%). The LRBest group showed a later age of onset (median 28y, interquartile range, IQR [21, 36]) compared to NRBest (median 19, IQR [16, 24]; p<0.00001). View this table: [Table 2.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/T2) Table 2. Clinical characteristics of exemplars, by lithium responsiveness. Characteristics of the best (upper 25% of exemplar scores) and poor (lower 25% of exemplar scores) exemplars of lithium response (LiR) and non-response (LiNR), respectively. Categorical data are presented as count (%), whereas normally distributed continuous variables are presented as mean (standard deviation), and non-normal continuous variables are presented as median [interquartile range]. Abbreviations: Calgiari (University; CU), Cagliari (Centro Bini; CB), International Group for the Study of Lithium (IGSLi), Maritimes (MAR), Montreal (MTL), Ontario (ON), Poznan (POZ), bipolar disorder (BD), major depressive disorder (MDD), antidepressants (AD), schizoaffective disorder (SZA), global assessment of functioning (GAF), lithium (Li), suicide attempts (SA), first degree relatives (FDR), second degree relatives (SDR), schizophrenia (SCZ), suicidal ideation (SI), history (Hx), generalized anxiety disorder (GAD), obsessive compulsive disorder (OCD), attention deficit hyperactivity disorder (ADHD), hypertension (HTN), socioeconomic status (SES). The LRBest subjects for whom clinical course information was available all showed a completely episodic course, whereas NRBest courses were mainly chronic fluctuating (43.5%) and episodic with residual symptoms (44.9%). These differences were statistically significant at the omnibus level (p=0.0001). Interestingly, differences in clinical course between LRPoor and NRPoor were opposite in direction to those observed among best exemplars. NRPoor subjects had predominantly completely episodic clinical courses (74.1%), whereas LRPoor subjects exhibited predominantly chronic fluctuating (41%) and chronic (25.3%) courses, with only 18.1% being completely episodic (omnibus p=0.0001). The complete absence of rapid cycling was reported in 98.3% of LRBest, and in only 47.2% of NRBest (p=0.0001). The remaining majority of the NRBest subjects (49.2%) reported having experienced spontaneous rapid cycling. The occurrence of rapid cycling was no different between LRPoor and NRPoor groups. The occurrence of lifetime psychosis differed between LRBest and NRBest, with a total of 42.8% of the non-responders reporting episodic and mood congruent psychosis (compared to only 16.7% of responders; p=0.0001). Non-responders also reported incongruent episodic psychosis in 18.6% of cases, with only 37.1% of non-responders reporting an absence of psychosis altogether. In contrast, 83.3% of the best exemplars of lithium response reported a complete absence of lifetime psychosis. The LRBest group had a lower rate of panic disorder (2.1% vs. 27.9%; p=0.0001), generalized anxiety disorder (3.6% vs 41.2%; p=0.00025), and substance abuse (2% vs. 37.9%; p=0.0001) than NRBest. There was also a general trend toward lower rates of psychiatric comorbidity in LRBest compared to the NRBest group. Social anxiety disorder was present in 0% of lithium responders but 27.9% of non-responders (p=0.0007). Responders also had relatively lower rates of obsessive-compulsive disorder (2.1%) compared to non-responders (14.1%; p=0.0025). These findings were largely reversed when looking at the poor exemplars. LRPoor subjects had higher rates of social anxiety disorder (35.6% vs 4.5%; p=0.0001), panic disorder (48.9% vs 15.5%; p=0.0001), generalized anxiety disorder (52% vs 7.3%; p=0.0001), substance abuse (41.5% vs 21.0%; p=0.0005), attention deficit hyperactivity disorder (60.8% vs. 10.6%; p=0.0001), learning disability (51.4% vs 10.6%; p=0.0001), and personality disorder (31.9% vs 3.4%; p=0.0001) compared to the NRPoor subjects. ### 3.2. Part 2: Biological Validation through Genomic Classification Recall that the genomic data for this element of the analysis are derived from a single site in the ConLiGen data. In Appendix C, we demonstrate relative lack of genomic population stratification in this subset, with a comparison to the broader ConLiGen sample. #### 3.2.1. Genomic Classification among the Best and Poor Exemplars Genotyped subjects overlapped with clinical data from the Maritimes (n=129; 40%), Montreal (n=74; 23%), Ontario (n=62; 19%), and IGSLi (n=56; 17%), although in the ConLiGen GWAS [6], they were all classified as from the Maritimes (Dalhousie University). Most clinical differences reflect those reported in Section 3.1.2 and thus are reported in Table D.1. Genomic classification results are presented in Figure 6 and in tabular fashion in Table D.2. The median MCC for classification of the Best exemplars was 0.58 (IQR [0.41, 0.77]), which was greater than classification analyses with either the poor exemplars (0.29 [0.06, 0.5]; p=0.0043), or the entire dataset (0.32 [0.2, 0.44]; p=0.002). The ROC-AUC for classification of lithium response in the Best exemplars was 0.88 [0.83, 0.98], which was greater than that of the model trained only on poor exemplars (0.66 [0.61, 0.80]; p=0.0032) or the whole dataset (0.7 [0.62, 0.75]; p=0.001). ![Figure 6.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/05/22/2020.05.19.20098566/F6.medium.gif) [Figure 6.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/F6) Figure 6. Genomic classification results. Results of classifying lithium response based on the genomic data of all subjects (“ALL”; n=321), the poor exemplars (<25th percentile of exemplar score; n=81), and the best exemplars (>75th percentile of exemplar score; n=79). Boxes are defined by the interquartile range (IQR), with the median shown as the black centered line. Whiskers are 1.5 times the IQR. Each panel shows the results for a different classification performance metric. *Abbreviations:* Matthews’ correlation coefficient (MCC), area under the receiver operating characteristic curve (AUC), Cohen’s kappa (Kappa), positive predictive value (PPV), negative predictive value (NPV). Figure 7 shows pathway analysis results for the best exemplars. Enriched pathways involved (A) muscarinic acetylcholine receptor types 1 and 3 signaling (mAChR1/3; 27 genes, false discovery rate FDR=0.017), (B) Alzheimer disease-amyloid secretase (30 genes, FDR=0.034), (C) heterotrimeric G-protein coupled receptor Gq/Go α signaling (GPCRq/o-α; 53 genes, FDR=0.04), and (D) histamine H1R mediated signaling (H1R; 27 genes, FDR=0.039). Complete gene set analysis results are shown in Table D.3. Enrichment studies in the gene ontology “cellular component” and “biological function” categories are shown in Tables D.4 and D.5. ![Figure 7.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/05/22/2020.05.19.20098566/F7.medium.gif) [Figure 7.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/F7) Figure 7. Gene enrichment in the best exemplars. Results of the statistical enrichment test using the logistic regression coefficients from the classifier trained on the best exemplar. Individual genes are shown in gray, with pathway nodes (and edges) colored according to the pathway identity. Pathway names are shown in bold along the perimeter of the graph. *Abbreviations:* acetylcholine receptor (AChR), G-protein coupled receptor (GPCR), histamine H1 receptor (H1R), false discovery rate (FDR). ## 4. Discussion Individuals who are most phenotypically representative of lithium response and non-response may be more genetically distinct than their less exemplary counterparts, particularly in genes related to GPCRq/o-α, mAChR1/3 or H1R signaling, and the Alzheimer’s amyloid-secretase pathway. Exemplars also showed distinct clinical profiles that are consistent with past phenotypic research on lithium responders. Since clinical exemplars are more genetically separable, our study confers a measure of biological validity upon the practice of detailed clinical evaluation, whose predictive utility we have previously demonstrated [7]. One of our most important findings was characterization of the LRBest group as individuals with (A) a predominantly completely episodic clinical course, (B) low levels of psychiatric comorbidity, (C) later age of onset, (D) a general absence of rapid cycling, and (E) either absence of psychosis or limitation to mood congruent intra-episodic form. The first two findings are likely the strongest since we observe the opposite pattern among the LRPoor and NRPoor groups. Notwithstanding, all of these elements support past evidence on the clinical phenotype of lithium responsive bipolar disorder. For instance, Passmore et al. [15] found that lithium responders generally had a more episodic course of illness, whereas lamotrigine responders were more likely to have experienced rapid cycling, a higher rate of psychiatric comorbidity, and an earlier age of onset. A later age of onset in lithium responders has been demonstrated in meta-analysis [16,17]. Absence of rapid cycling has also been associated with good lithium response by Backlund et al [18] and Tondo et al. [19]. Finally, Kleindienst & Greil [20] found that carbamazepine responders were more likely to have had mood incongruent psychosis than lithium responders, while the updated meta-analysis by Hui et al. found an association between absence of psychotic symptoms and lithium responsiveness [17]. Aside from not including family history related variables (potentially an artifact of related variable definitions), the clinical picture of the exemplary lithium responder that emerges from our study largely aligns with that noted by several authors, such as Grof [4], Gershon & Malhi [21], and Alda [22]. Recently, Kendler [23] reminded us that the utility of biological tests, such as the electrocardiograms and troponin assays used to detect myocardial infarction, is generally contingent upon the clinician’s identification of candidate patients whose presentations are clinically consistent with the illness being targeted. The present study, which shows that refinement of a clinical sample into those whose phenotypes are clinically most exemplary of the target syndrome, provides strong data-driven support for Kendler’s statement. Further still, we have noted that the clinical picture of the exemplary lithium responders (and non-responders) has been hypothesized for some time, and our study now provides biological support for the predictive validity of these phenotypic hypotheses. Specifically, we were able to genomically classify the best clinical exemplars of lithium response and non-response with a ROC-AUC of 0.88 (IQR [0.83,0.98]), whereas poor exemplars of these classes could only be discriminated with a ROC-AUC of 0.66 (IQR [0.61,0.80]; p=0.0032). If there was no biologically mediated information in the exemplary phenotype of lithium response (and non-response), then this difference would not have been observed. Variants most informative in discrimination of the best exemplars showed enrichment of genes involved in the heterotrimeric GPCRq/o-*α*, mAChR1/3 or H1R signaling, and the Alzheimer’s amyloid-secretase pathway. Lithium response and BD have long been associated with GPCR signaling [24]. In particular, lithium may affect signaling in both the Go-alpha pathway (at least via adenylate cyclase) and the Gq-alpha pathway (via effects on 1,4,5-triphosphate and protein kinase C, PKC) [25–29]. Interestingly, our results imply that differences in GPCR signaling may be segregated according to medication responsiveness. Enrichment in the Alzheimer’s amyloid-secretase pathway is interesting given the growing interest in the effects of lithium on Alzheimer’s pathology. Alterations in cholinergic and histaminergic systems have figured less prominently in the biological literature on BD and lithium response. However, note that Figure 7 shows that many genes enriched in the cholinergic and histaminergic systems were also enriched in the GPCR and Alzheimer’s amyloid pathways (which comparatively have more individual genetic associations). It is possible that alterations in the cholinergic and histaminergic systems may be subcomponents of the broader differences in the GPCR and Alzheimer’s amyloid systems. In future work, it would be of interest to characterize a more fine-grained “gradient” of genetic differences across the spectrum of exemplar scores, and to further evaluate the significance of cholinergic and histaminergic system enrichment in our study. One limitation of our study includes the relatively low sample size for the genomic analysis. Future work could endeavor to obtain further genotypic information for individuals in our clinical database, or detailed clinical information for individuals in our genomic database. As features, our study also only used those SNPs that overlapped across genotyping platforms in the ConLiGen dataset. Unfortunately, however, the number of fully imputed variants was on the order of millions, which would be analytically intractable in the present context. Filtering-based feature selection approaches in our present study would be (A) too computationally expensive across these millions of variants and (B) require much larger sample sizes since they must be repeated within each training partition. We also had no dominant a priori biological rationale for limiting the data to a restricted subset, since, as our results later confirmed, these biological systems may differ between exemplar strata. Ultimately, we chose the set of completely genotyped SNPs that overlapped across ConLiGen sites in order to facilitate the potential conceptual generalizability of our pathway analysis results, in particular. That is, since the pathways detected were based on variants that are broadly genotyped, these results could potentially be extended to other ConLiGen sites, should the corresponding clinical variables become available. Our study is also limited by its focus on lithium response, at the exclusion of other mood stabilizers. It is therefore possible our lithium responders are simply those with a more generally responsive form of BD. The only way to prove specificity would be to obtain data showing a single subject’s non-response to other mood stabilizers and response to lithium. That being said, there is evidence that excellent response to lithium may be exclusive to that medication [29]. After further validity checks on larger samples of genomic data in lithium responders and non-responders, it will be of great interest to examine exemplar-based genomic classification of mood stabilizer response more broadly. Such work could potentially advance the development of joint clinical-biological prediction models for mood stabilizer response. ## Data Availability Data requests may be provided to the authors. ## Funding Genome Canada (MA, AN), Dalhousie Department of Psychiatry Research Fund (MA, AN), Dalhousie Medical Research Foundation and the Lindsay Family (MA, AN), Canadian Institutes of Health Research #418254 (MA, AN), Nova Scotia Health Research Foundation Scotia Scholars Graduate Scholarship (AN), Killam Postgraduate Scholarship (AN), EMBED-BMBF-01EW1904 (MDCR). ## Appendix B Gene Set Analysis At each fold of cross-validation (under all settings of q), the logistic regression coefficients were saved. The SNPs whose logistic regression coefficients were of the same sign (i.e. positive or negative) across all folds were ranked in terms of their absolute median coefficient values and linked to gene identifiers using the NCBI gene database. Each gene was assigned the maximal absolute value of the logistic regression coefficients for all SNPs tagged by that gene; the remainder (duplicates) were deleted, such that each included gene had only one numerical value associated with it. We then applied the statistical enrichment test in the PANTHER classification system v. 14.1 [14]. We repeated the statistical enrichment test for the following annotation sets: PANTHER pathways, GO molecular function (complete), GO biological processes (complete), GO cellular components (complete). To further evaluate the degree to which the enrichment analyses speak specifically to findings among the best exemplars, we repeated the same procedures outlined here using the logistic regression coefficients for the poor exemplars. ## Appendix C Population Stratification To evaluate for the presence of population stratification in our genomic sample, we plot the first several principal components of the subjects’ genotypes in Figure C.1. For comparison, Figure C.2 demonstrates the first several principal components from 14 sites of the full Consortium on Lithium Genetics (ConLiGen) genomic sample. ![Figure C.1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/05/22/2020.05.19.20098566/F8.medium.gif) [Figure C.1.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/F8) Figure C.1. Principal components analysis of the genomic dataset from Halifax (as coded in the ConLiGen studies [6]). The left column is coloured by the site of origin, whereas the right column of plots is coloured by lithium responsiveness. *Abbreviations:* International Group for the Study of Lithium (IGSLi), Maritimes (MAR), Montreal (MTL), Ontario (ON; also known as Ottawa/Hamilton). ![Figure C.2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/05/22/2020.05.19.20098566/F9.medium.gif) [Figure C.2.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/F9) Figure C.2. Principal components analysis of the genomic dataset from the Consortium on Lithium Genetics sample [6] (Figure used with permission from Stone et al., submitted manuscript) ## Appendix D Supplementary Tables Clinical demographic comparisons between the best exemplars, poor exemplars, and the aggregated sample of genotyped patients is presented in Table D.1, with stratification by lithium response. The results of gene enrichment analysis are presented in Table D.3, with specific genes enriched in the best exemplar group (related to glutamate receptors and signalling processes) shown in Tables D.4 and D.5. View this table: [Table D.1.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/T3) Table D.1. Demographic comparisons for subjects whose genomic data (from the Consortium for Lithium Genetics; ConLiGen) overlapped with our clinical dataset. Comparisons were done in between lithium responders (LR(+)) and non-responders (LR(-)) for the total group (“ALL), the best exemplars (“Best; exemplar score ≥ 75th percentile), and the poorest exemplars (“Poor; exemplar score ≤ 25th percentile). View this table: [Table D.2.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/T4) Table D.2. Results of classifying lithium response based on the genomic data of all subjects (ALL; n=321), the poor exemplars (<25th percentile of exemplar score; n=81), and the best exemplars (>75th percentile of exemplar score; n=79). Each panel shows the results for a different classification performance metric. Classification was done using logistic regression with an L2 penalty (regularization weight set to C=1 a priori) with stratification done over each value of the resolution parameter q=1 and q=2. *Abbreviations*: accuracy (Acc), area under the receiver operating characteristic curve (AUC), sensitivity (Sens), specificity (Spec), Cohen’s kappa (Kappa), Matthews’ correlation coefficient (MCC), positive predictive value (PPV), negative predictive value (NPV). Results are presented as means and 95% confidence intervals. View this table: [Table D.3.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/T5) Table D.3. Results of gene enrichment analysis using the PANTHER gene ontology system. Analyses are presented for (A) pathways, (B) gene ontology cellular components, and (C) gene ontology biological processes. *Abbreviations*: false discovery rate (FDR). View this table: [Table D.4.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/T6) Table D.4. Genes enriched in the best exemplars group related to glutamatergic synapses (gene ontology “cellular component” category). View this table: [Table D.5.](http://medrxiv.org/content/early/2020/05/22/2020.05.19.20098566/T7) Table D.5. Genes enriched among the best exemplars in the gene ontology “biological process” category of the glutamate receptor signaling pathway. * Received May 19, 2020. * Revision received May 19, 2020. * Accepted May 22, 2020. * © 2020, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution-NoDerivs 4.0 International), CC BY-ND 4.0, as described at [http://creativecommons.org/licenses/by-nd/4.0/](http://creativecommons.org/licenses/by-nd/4.0/) ## References 1. 1.Chesney, E.; Goodwin, G.; Fazel, S. Risks of all-cause and suicide mortality in mental disorders: A meta-review. World Psychiatry 2014, 13, 153–160. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/wps.20128&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24890068&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000337588300017&link_type=ISI) 2. 2.Manchia, M.; Hajek, T.; O’Donovan, C.; Deiana, V.; Chillotti, C.; Ruzickova, M.; Del Zompo, M.; Alda, M. Genetic risk of suicidal behavior in bipolar spectrum disorder: analysis of 737 pedigrees. Bipolar disorders 2013,15, 496–506. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1111/bdi.12088&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23734877&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) 3. 3.Drancourt, N.; Etain, B.; Lajnef, M.; Henry, C.; Raust, A.; Cochet, B.; Mathieu, F.; Gard, S.; Mbailara, K.; Zanouy, L.; Kahn, J.; Cohen, R.; Wajsbrot-Elgrabli, O.; Leboyer, M.; Scott, J.; Bellivier, F. Duration of untreated bipolar disorder: Missed opportunities on the long road to optimal treatment. Acta Psychiatrica Scandinavica 2013, 127, 136–144. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1111/j.1600-0447.2012.01917.x&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22901015&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000313492300006&link_type=ISI) 4. 4.Grof, P. Responders to long-term lithium treatment. In Lithium in Neuropsychiatry: The Comprehensive Guide; Bauer, M.; Grof, P.; Muller-Oerlinghausen, B., Eds.; Informa Healthcare: UK, 2006; pp. 157–178. 5. 5.Grof, P.; Duffy, A.; Cavazzoni, P.; Grof, E.; Garnham, J.; MacDougall, M.; O’Donovan, C.; Alda, M. Is response to prophylactic lithium a familial trait? Journal of Clinical Psychiatry 2002, 63, 942–947. 6. 6.Hou, L.; Heilbronner, U.; Degenhardt, F.; Adli, M.; Akiyama, K.; Akula, N.; others. Genetic variants associated with response to lithium treatment in bipolar disorder: A genome-wide association study. The Lancet 2016, 387, 1085–1093. 7. 7.Nunes, A.; Ardau, R.; Berghöfer, A.; Bocchetta, A.; Chillotti, C.; Deiana, V.; Garnham, J.; Grof, E.; Hajek, T.; Manchia, M.; Müller-Oerlinghausen, B.; Pinna, M.; Pisanu, C.; O’Donovan, C.; Severino, G.; Slaney, C.; Suwalska, A.; Zvolsky, P.; Cervantes, P.; Del Zompo, M.; Grof, P.; Rybakowski, J.; Tondo, L.; Trappenberg, T.; Alda, M. Prediction of Lithium Response using Clinical Data. Acta Psychiatrica Scandinavica 2019, *In Press*. 8. 8.Manchia, M.; Adli, M.; Akula, N.; Ardau, R.; Aubry, J.; Backlund, L.; others. Assessment of Response to Lithium Maintenance Treatment in Bipolar Disorder: A Consortium on Lithium Genetics (ConLiGen) Report. PLoS ONE 2013, 8. 9. 9.Turecki, G.; Grof, P.; Cavazzoni, P.; Duffy, A.; Grof, E.; Ahrens, B.; others. Evidence for a role of phospholipase C-γ1 in the pathogenesis of bipolar disorder. Molecular Psychiatry 1998, 3, 534–538. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/sj.mp.4000447&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=9857980&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000076762500015&link_type=ISI) 10. 10.Turecki, G.; Grof, P.; Grof, E.; D’souza, V.; Lebuis, L.; Marineau, C.; Cavazzoni, P.; Duffy, A.; Bétard, C.; Zvolský, P.; Robertson, C.; Brewer, C.; Hudson, T.; Rouleau, G.; Alda, M. Mapping susceptibility genes for bipolar disorder: A pharmacogenetic approach based on excellent response to lithium. Molecular Psychiatry 2001, 6, 570–578. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/sj.mp.4000888&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=11526471&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000170346900011&link_type=ISI) 11. 11.Nunes, A.; Alda, M.; Bardouille, T.; Trappenberg, T. Representational Rényi heterogeneity. Entropy 2020, 22(4). 12. 12.Breiman, L. Random Forests. Machine Learning 2001, 45, 5–32. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1017/CBO9781107415324.004&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=WOS:00017048&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) 13. 13.Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; others. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 2012, 12, 2825–2830. 14. 14.Mi, H.; Muruganujan, A.; Ebert, D.; Huang, X.; Thomas, P. PANTHER version 14: more genomes, a new PANTHER GO-slim and improvements in enrichment analysis tools. Nucleic Acids Research 2018, 47, D419–D426. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/nar/gky1038&link_type=DOI) 15. 15.Passmore, M.; Garnham, J.; Duffy, A.; MacDougall, M.; Munro, A.; Slaney, C.; Teehan, A.; Martin, A. Phenotypic spectra of bipolar disorder in responders to lithium versus lamotrigine. Bipolar Disorders 2003, 5, 110–114. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1034/j.1399-5618.2003.00026.x&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=12680900&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000182101100006&link_type=ISI) 16. 16.Kleindienst, N.; Engel, R.; Greil, W. Which clinical factors predict response to prophylactic lithium? A systematic review for bipolar disorders. Bipolar disorders 2005, 7, 404–17. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1111/j.1399-5618.2005.00244.x&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=16176433&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000231980700002&link_type=ISI) 17. 17.Hui, T.; Kandola, A.; Shen, L.; Lewis, G.; Osborn, D.; Geddes, J.; Hayes, J. A systematic review and meta-analysis of clinical predictors of lithium response in bipolar disorder. Acta Psychiatrica Scandinavica 2019,140, 94–115. 18. 18.Backlund, L.; Ehnvall, A.; Hetta, J.; Isacsson, G.; Ågren, H. Identifying predictors for good lithium response - A retrospective analysis of 100 patients with bipolar disorder using a life-charting method. European Psychiatry 2009, 24, 171–177. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.eurpsy.2008.12.009&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19286354&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) 19. 19.Tondo, L.; Hennen, J.; Baldessarini, R. Rapid-cycling bipolar disorder: Effects of long-term treatments. Acta Psychiatrica Scandinavica 2003, 108, 4–14. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=12807371&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000183516700002&link_type=ISI) 20. 20.Kleindienst, N.; Greil, W. Differential efficacy of lithium and carbamazepine in the prophylaxis of bipolar disorder: Results of the MAP study. Neuropsychobiology 2000, 42, 2–10. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1159/000054844&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=11093063&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000165821500002&link_type=ISI) 21. 21.Gershon, S.; Chengappa, K.; Malhi, G. Lithium specificity in bipolar illness: A classic agent for the classic disorder. Bipolar Disorders 2009,11, 34–44. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1111/j.1399-5618.2009.00709.x&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19538684&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) 22. 22.Alda, M. The phenotypic spectra of bipolar disorder. European Neuropsychopharmacology 2004, 14. 23. 23.Kendler, K. From Many to One to Many - The Search for Causes of Psychiatric Illness. JAMA Psychiatry 2019, 76, 1085–1091. 24. 24.Cruceanu, C.; Schmouth, J.; Torres-Platas, S.; Lopez, J.; Ambalavanan, A.; Darcq, E.; Gross, F.; Breton, B. Spiegelman, D.; Rochefort, D.; Hince, P.; Petite, J. Gauthier, J.; Lafrenifère, R.; Dion, P.; Greenwood, C. Kieffer, B.; Alda, M.; Turecki, G.; Rouleau, G. Rare susceptibility variants for bipolar disorder suggest a role for G protein-coupled receptors. Molecular Psychiatry 2018, 23, 2050–2056. 25. 25.Bezchlibnyk, Y.; Young, L. The Neurobiology of Bipolar Disorder: Focus on Signal Transduction Pathways and the Regulation of Gene Expression. Canadian journal of psychiatry. Revue canadienne de psychiatrie 2002, 47, 135–148. 26. 26.Gonzalez-Maeso, J.; Meana, J. Heterotrimeric G Proteins: Insights into the Neurobiology of Mood Disorders. Current Neuropharmacology 2006, 4, 127–138. 27. 27.Saxena, A.; Scaini, G.; Bavaresco, D.; Leite, C.; Valvassoria, S.; Carvalho, A.; Quevedo, J. Role of Protein Kinase C in Bipolar Disorder: A Review of the Current Literature. Molecular Neuropsychiatry 2017, 3, 108–124. 28. 28.Vosahlikova, M.; Svoboda, P. Lithium – Therapeutic tool endowed with multiple beneficiary effects caused by multiple mechanisms. Acta Neurobiologiae Experimentalis 2016, 76, 1–19. 29. 29.Mertens, J.; Wang, Q.; Kim, Y.; Yu, D.; Pham, S.; Yang, B.; others. Differential responses to lithium in hyperexcitable neurons from patients with bipolar disorder. Nature 2015, 527, 95–99. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/nature15526&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26524527&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F05%2F22%2F2020.05.19.20098566.atom) [1]: /embed/inline-graphic-1.gif [2]: /embed/inline-graphic-2.gif [3]: /embed/inline-graphic-3.gif [4]: /embed/inline-graphic-4.gif [5]: /embed/inline-graphic-5.gif [6]: /embed/graphic-3.gif [7]: /embed/inline-graphic-6.gif [8]: /embed/inline-graphic-7.gif [9]: /embed/inline-graphic-8.gif [10]: /embed/inline-graphic-9.gif [11]: /embed/inline-graphic-10.gif [12]: /embed/inline-graphic-11.gif [13]: /embed/inline-graphic-12.gif [14]: /embed/graphic-4.gif [15]: F2/embed/inline-graphic-13.gif [16]: F2/embed/inline-graphic-14.gif [17]: /embed/inline-graphic-15.gif [18]: /embed/graphic-6.gif [19]: /embed/inline-graphic-16.gif [20]: /embed/inline-graphic-17.gif [21]: /embed/inline-graphic-18.gif [22]: /embed/inline-graphic-19.gif [23]: /embed/inline-graphic-20.gif [24]: /embed/graphic-7.gif [25]: /embed/inline-graphic-21.gif [26]: /embed/graphic-8.gif [27]: /embed/inline-graphic-22.gif