ABSTRACT
Substance use disorders (SUDs) incur serious social and personal costs. Risk for SUDs is complex, ranging from social conditions to individual genetic variation. We examined whether models that include a clinical/environmental risk index (CERI) and polygenic scores (PGS) are able to identify individuals at increased risk of SUD in young adulthood across four longitudinal cohorts for a combined sample of N = 15,134. Our analyses included participants of European (NEUR = 12,659) and African (NAFR = 2,475) ancestries. SUD outcomes included: 1) alcohol dependence, 2) nicotine dependence; 3) drug dependence, and 4) any substance dependence. In the models containing the PGS and CERI, the CERI was associated with all three outcomes (ORs = 1.37 – 1.67). PGS for problematic alcohol use, externalizing, and smoking quantity were associated with alcohol dependence, drug dependence, and nicotine dependence, respectively (OR = 1.11 – 1.33). PGS for problematic alcohol use and externalizing were also associated with any substance dependence (ORs = 1.09 – 1.18). The full model explained 6% - 13% of the variance in SUDs. Those in the top 10% of CERI and PGS had relative risk ratios of 3.86 - 8.04 for each SUD relative to the bottom 90%. Overall, the combined measures of clinical, environmental, and genetic risk demonstrated modest ability to distinguish between affected and unaffected individuals in young adulthood. PGS were significant but added little in addition to the clinical/environmental risk index. Results from our analysis demonstrate there is still considerable work to be done before tools such as these are ready for clinical applications.
INTRODUCTION
Substance use disorders (SUDs) are associated with substantial costs to affected individuals, their families, and society. An estimated 107107,000 Americans died as the result of an overdose in 2021 1. In 2016, alcohol use contributed 4.2% to the global disease burden and other drug use contributed 1.3% 2. Excessive alcohol use and illicit drug use cost the United States an annual $250 billion 3 and $190 billion 4 respectively. Given the substantial human and economic costs of substance misuse and disorders, understanding the combined impact of important risk factors across multiple levels of analysis has important public health implications.
Substance use disorders are complex phenomena, and the development of substance related problems can be attributed to factors ranging from broader social and economic conditions to individual genetic variation 5–10. Prior research using a multifactorial index of clinical and environmental risk factors (e.g., childhood disadvantage, family history of SUD, childhood conduct problems, childhood depression, early exposure to substances, frequent use during adolescence) found it useful in identifying those with persistent SUDs 11.
More recently, polygenic scores (PGS), which aggregate risk for a trait across the genome using information from genome-wide association studies (GWAS), were robustly associated with substance use 12 and substance related problems 13 across adolescence and into young adulthood. However, though robustly associated, current PGS do poorly in identifying individuals affected by SUDs 14. To date, there is limited work on the combined impact of genetic, environmental, and clinical risk factors for SUDs. Prior work combining individual genetic variants and clinical features outperformed clinical features alone 15, but individual variants have limited predictive power. In other medical conditions, such as melanoma 16 or ischemic stroke 17, combining clinical and genetic risk factors showed improvement predicting risk for a specific outcome over models using individual risk factors.
In the current study, we examine the joint association of early life clinical/environmental risk factors and PGS with SUDs in early adulthood across four longitudinal cohorts: the National Longitudinal Study of Adolescent to Adult Health (Add Health); the Avon Longitudinal Study of Parents and Children (ALSPAC); the Collaborative Study on the Genetics of Alcoholism (COGA); and the youngest cohort of the Finnish Twin Cohort Study (FinnTwin12). These samples include population-based cohorts from three countries (United States, England, and Finland) and a predominantly high-risk sample. Two of the samples (COGA and Add Health) are ancestrally diverse. We focus on early adulthood as this is a critical period for the development and onset of SUDs 18. Our research questions are guided by the understanding that risk factors for SUDs range across multiple levels of analysis.
METHODS
Samples
Add Health is a nationally representative longitudinal study of adolescents followed into adulthood in the United States 19. Data have been collected from Wave I when respondents were between 11-18 (1994-1995) to Wave V (2016-2018) when respondents were 35-42. The current analysis uses data from Waves I, II, and Wave IV.
ALSPAC is an ongoing, longitudinal population-based study of a birth cohort in the (former) Avon district of Southwest England 20–23. Pregnant female residents with an expected date of delivery between April 1, 1991 and December 31, 1992 were invited to participate (N = 14,541 pregnant women, 80% of those eligible). This analysis uses data up to the age 24 assessment (details of all the data that is available through a searchable, web-based tool: http://www.bristol.ac.uk/alspac/researchers/our-data/).
COGA is a family-based sample consisting of alcohol dependent individuals (identified through treatment centers across the United States), their extended families, and community controls (N ∼16,000) 24,25. We use a prospective sample of offspring of the original COGA participants (baseline ages 12-22, N = 3,573) that have been assessed biennially since recruitment (2004-2019) 26.
FinnTwin12 is a population-based study of Finnish twins born 1983–1987 identified through Finland’s Central Population Registry. A total of 2,705 families (87% of all identified) returned the initial family questionnaire late in the year in which twins reached age 11 27. Twins were invited to participate in follow-up surveys when they were ages 14, 17, and approximately 22.
Each cohort includes a wide range of social, behavioral, and phenotypic data measured across the life course. The SUD measures were derived from the corresponding young adult phases of data collection in each cohort (mean ages ∼ 22 - 28). A full description of each sample is presented in the supplementary information (section 2).
Measures
Lifetime Diagnosis of Substance Use Disorder
We constructed measures of lifetime SUD diagnosis based on the data that were available in each of the samples, defined as meeting criteria for four, non-mutually exclusive categories of substance dependence: 1) alcohol dependence; 2) nicotine dependence; 3) drug dependence (inclusive of drugs such as cannabis, cocaine, opioids, sedatives, etc.); and 4) any substance dependence (alcohol, nicotine, or drug). Our analyses focused primarily on DSM-IV as this diagnostic system was most consistently used across all samples. There was one exception: in each of the samples, nicotine dependence was measured using a cutoff of 7 or higher on the Fagerstrom Test for Nicotine Dependence (FTND)28. Where possible, we drew measures of substance dependence from data collected during young adulthood to try and maintain temporal ordering between SUD diagnoses and measured risk factors.
Clinical/Environmental Risk Index
We created a clinical/environmental risk index (CERI) considering a variety of established risk factors for SUD (Table 1). The CERI included ten validated early life risk factors associated with later development of SUDs, including: low childhood socioeconomic status (SES), family history of SUD, early initiation of substance use, childhood internalizing problems, childhood externalizing problems, frequent drinking in adolescence, frequent smoking in adolescence, frequent cannabis use in adolescence, peer substance use, and exposure to trauma/traumatic experiences11,29,30. We dichotomized each risk factor (present vs not present) and summed them into an index for each person ranging from 0 to 10, providing a single measure of aggregate risk. Dichotomizing these items allowed us to harmonize measures across each sample in an interpretable manner. A full list of how each measure is defined within each of the samples is available in the supplementary information (section 3).
Polygenic Scores
We constructed polygenic scores (PGS), which are aggregate measures of the number of risk alleles individuals carry weighted by effect sizes from GWAS summary statistics, from six recent GWAS of SUDs and comorbid conditions including: 1) externalizing problems (EXT) 31; 2) major depressive disorder (MDD) 32; 3) problematic alcohol use 33 (ALCP); 4) alcohol consumption (drinks per week, ALCC) 34,35; 5) cigarettes per day/FTND (CPD) 34,36; and 6) schizophrenia (SCZ) 37,38. We focused on these PGS, specifically, because: 1) SUDs show strong genetic overlap with other externalizing39–41, internalizing 32,42, and psychotic disorders 33,43,44; 2) both shared and substance-specific genetic risk are associated with later SUDs 45–47; and 3) substance use and SUDs have only partial genetic overlap 48,49. Therefore, our PGS cover a spectrum of genetic risk for SUDs, using the most current and well-powered results for each of the listed domains (see supplementary information section 4 for a detailed description).
GWAS have been overwhelmingly limited to individuals of European ancestries 50,51. Importantly, PGS derived from GWAS of one ancestry do not always transport into other ancestral populations 52,53. We therefore used PRS-CSx 54, a new method that combines information from well-powered GWAS (typically of European ancestries) and ancestrally matched GWAS to improve the predictive power of PGS in the African ancestry samples from Add Health and COGA. PRS-CSx integrates GWAS summary statistics across multiple input populations and employs a Bayesian approach to correct GWAS summary statistics for the non-independence of SNPs in linkage disequilibrium (LD) with one another54. For participants of European ancestries, we used the EUR derived PRS-CSx results, while we used the EUR+AFR meta-analyzed results for the African ancestry participants. See the supplementary information (section 5) for details.
Analytic Strategy
We pooled all the data for analysis using a fixed effects integrative data analytic (IDA) approach 55. The IDA approach is more powerful than traditional meta-analyses when one has access to raw data for each of the contributing samples. Our approach to harmonization and pooling was as follows. First, we defined the measures and cutoffs to be used in each of the samples, creating the CERI, PGS, and SUD outcomes at the cohort level. Second, within each cohort, we regressed each PGS on age, age2, sex, sex*age, sex*age2, and the first 10 ancestral PCs (specific to each sample) to account for population stratification in the PGS. Next, we pooled all the data for analysis. We included cohort as a fixed effect for each of the six cohorts (4 samples, of which two were split by ancestry) in subsequent analyses. Additionally, we included age of last observation and sex as covariates.
We estimated a series of nested logistic regression models with the pooled data: 1) a baseline model (sex, age, and cohort), 2) a genetic risk model (baseline + PGS), 3) a clinical/environmental risk model (baseline + CERI), and 4) a combined risk model (baseline + PGS + CERI). Because COGA and FT12 included a large number of related individuals, we adjusted for familial clustering using cluster-robust standard errors 56. To assess the predictive accuracy of each model, we took the difference in pseudo-R2 (ΔPseudo-R2) 57, between the baseline and corresponding models. Finally, we calculated the discriminatory power of the combined model using the area under the curve (AUC) from a receiver operating characteristic (ROC) curve. To ensure the robustness of our results, we included a variety of checks to ensure that no single cohort in the IDA was unduly influencing the results. Our analytic strategy was preregistered on the Open Science Framework (https://osf.io/etbw8). Deviations from the preregistration are described in the supplementary information (section 6).
RESULTS
Table 2 contains the descriptive statistics for each of the cohorts and ancestries. Each cohort had similar proportions of females (∼51% - 56%). The mean ages ranged from ∼22 to ∼29 years of age. The COGA cohorts (both European and African ancestries) reported the highest rates of SUD, an expected finding given the nature of the sample (highly selected for SUDs). Add Health participants generally had higher rates of SUD than ALSPAC or FinnTwin12, but lower than COGA. Finally, ALSPAC and FinnTwin12 reported similar levels of alcohol, nicotine, drug, and any substance dependence. COGA participants reported higher mean values on the CERI. The remaining cohorts report relatively similar rates of risk factor exposure.
Table 3 presents the results from the PGS only, CERI only, and combined models for each outcome. Three of the six PGS were associated with the SUD outcomes in the PGS only model. EXT was associated with each of the SUD outcomes (EXT OR = 1.18 – 1.50); ALCP was associated with alcohol dependence and any substance dependence (ALCP OR = 1.10 – 1.13); and CPD was associated with nicotine dependence (CPD OR = 1.33). In the CERI only models, the CERI was consistently associated across each of the SUD categories (ORs = 1.37 – 1.67). When we combined the PGS and CERI into the same model, the CERI remained significant across SUDs and was largely unchanged (ORs = 1.35 – 1.65). EXT remained associated with drug dependence (OR = 1.11) and nicotine dependence (OR = 1.33), ALCP remained associated alcohol dependence (OR = 1.12), and CPD remained associated with nicotine dependence (OR = 1.31). Both EXT and ALCP remained associated with any substance dependence diagnosis (ORs = 1.09 – 1.18). Overall, the combined model explained 5.9%, 12.6%, 13.1%, and 12.8% of the variance in alcohol dependence, nicotine dependence, drug dependence, and any substance dependence, respectively.
Figure 1 (Panel A) presents the raw prevalence for each outcome across counts of the CERI. The proportion of those meeting criteria for SUDs among those reporting 3 or more, 5 or more, and 7 or more risk factors surpassed lifetime prevalence estimates from nationally representative samples for drug dependence, alcohol dependence, and nicotine dependence, respectively 58. Panel B depicts the prevalence of each category of SUD across several mutually exclusive categories: 1) those in the bottom 90% of both the CERI and all PGS (averaged across the six scores); 2) those in the top 10% of the CERI but the bottom 90% of the PGS distribution; 3) those in the top 10% of the PGS distribution and the bottom 90% of the CERI; and 4) those in the top 10% of both PGS and the CERI. There is an increase in risk across those with elevated genetic risk, clinical/environmental risk, and both. Those in the top 10% of both PGS and CERI had the highest prevalence of each of the SUDs, though the error bars overlap with the estimates from those in the top 10% of the risk index, alone. Compared to those in the bottom 90% on both, those in the to the top 10% of both have a relative risk of 3.86 (95% CI = 3.20, 4.65) for alcohol dependence, 6.11 (95% CI = 4.84, 7.72) for nicotine dependence, 8.04 (95% CI = 6.92, 9.36) for drug dependence, and 4.05 (95% CI = 3.64, 4.51) for any substance dependence.
Panel A: Prevalence (and 95% confidence intervals) of those who meet criteria for alcohol, nicotine, drug, or any substance dependence across counts for items in the risk index. Panel B: Prevalence (and 95% confidence intervals) of those who meet criteria for alcohol, nicotine, drug, or any substance dependence across four categories: 1) those below the 90th percentile for all PGS and the CERI; 2) those at or above the 90th percentile for the CERI; 3) those at or above the 90th percentile for all PGS; and 4) those at or above the 90th percentile for both the CERI and PGS. PGS and risk index were first residualized on sex, age, age2, cohort, sex*age, sex*age2, sex*cohort, cohort*age, cohort*age2, sex*cohort*age, and sex*cohort*age2. Dotted colored lines represent corresponding lifetime prevalence estimates for alcohol dependence (red), nicotine dependence (green), drug dependence (blue), and any substance use disorder (purple) from nationally representative data 58.
Finally, we considered the AUC for the combined model for each of the SUD categories. Figure 2 presents the ROC curves for the full (CERI and PGS) and baseline (covariates only) models for each SUD category. The AUC for each combined model was 0.74 for alcohol dependence, 0.82 for nicotine dependence, 0.86 for drug dependence, and 0.78 for any substance dependence. The overall change in AUC (from the baseline to the full model) that we achieve when adding the CERI and PGS was modest (ΔAUC = 0.05 – 0.10), and this improvement was due in large part to the explanatory power of the CERI. ROC curves for the CERI only and PGS only models are presented in Supplemental Figure 6.
Receiver operating characteristic (ROC) curves for baseline models (red line, covariates only) and the full models (blue line, PGS + CERI + covariates) for each substance use disorder. Area under the curve (AUC) is presented for the PGS model in each cell. Change in AUC represents value of the difference between AUC from the full model and AUC from the base model.
Sensitivity Analyses
Lastly, we performed a variety of sensitivity analyses, including: 1) a leave-one-out (LOO) analysis; 2) a sex-stratified analysis, and 3) ancestry-specific analysis. The results from the LOO and sex-stratified analyses were largely similar to those from the main results. Results in the European ancestry cohorts mirrored the main results, with the exception that the CPD PGS was associated with “any substance dependence”. None of the PGS w associated with SUDs in the African ancestry cohorts, but the effect sizes for the CERI were largely similar across European and Africa ancestries (see Supplemental Tables S1-S3).
We also tested for interactions between the PGS and CERI and cohort (Add Health EUR as the reference group). There were few significant interactions and no consistent patterns in variation for PGS, though the CERI did show considerable variation across cohort (Supplemental Table S4). Finally, we fit complimentary models using a random effects approach, allowing the slopes for the PGS and CERI to vary randomly across cohort. Random slopes for PGS did not consistently improve model fit, though a random slope for the CERI consistently improved model fit (Supplemental Table S5). We compared the parameter estimates from the random effect models to the main analyses and results were largely consistent (Supplemental Table S6).
DISCUSSION
Substance use disorders remain a serious threat to public health 59. In the current analysis, we examined the combination of clinical, environmental, and genetic risk factors for determining who is more likely to develop a SUD in early adulthood. We used previously validated measures of environmental and clinical risk 11,29,30 and polygenic scores for externalizing problems 31, major depressive disorder 32, problematic alcohol use 33,35, alcohol consumption 34,35, cigarettes per day/nicotine dependence 34,36, and schizophrenia 37,38. The combination of genetic and social-environmental measures was significantly associated with the development of SUDs. The overall association was strongest for drug dependence, followed by any substance dependence, nicotine dependence, and alcohol dependence.
The CERI was the strongest association with each outcome. The proportion of those meeting criteria for each SUD surpassed lifetime estimates in persons with 3 or more, 5 or more, and 7 or more risk factors for drug dependence, alcohol dependence, and nicotine dependence, respectively. The discriminatory power of the combined model (AUC = .74 - .86) was similar to AUC estimates published in the original paper from which many of the risk index items were derived (AUC ∼ 0.80) 11. Interestingly, this risk index was originally developed for identifying persons with persistent SUD through early mid-life (∼age 40). In the current analysis we demonstrated that the CERI in conjunction with demographic covariates and PGS does equally well for those who meet criteria for any SUD by young adulthood.
The overall predictive power of the PGS alone was in the range of 1.1 – 3.7%. Only the PGS for externalizing problems, problematic alcohol use, and cigarettes per day were consistently associated with SUD outcomes. The PGS for externalizing problems was associated with drug dependence and nicotine dependence, the PGS for problematic alcohol use PGS was associated with alcohol dependence, and both were associated with any substance dependence. The PGS for cigarettes per day was only associated with nicotine dependence. Overall, these results support prior evidence that genetic risk for SUDs consists of a both shared and substance-specific variance 31,41,47.
Interestingly, even though the effect sizes were attenuated in the model, the PGS for externalizing problems, problematic alcohol use, and cigarettes per day remained significantly associated when we included the CERI, though the additional information the PGS provided was minimal. Since the CERI also included many of the phenotypes each of the PGS measured (e.g., childhood conduct disorder for externalizing, childhood depression for major depressive disorder; and frequent alcohol use for alcohol consumption), part of this attenuation is likely due to the inclusion of the actual phenotypes through which risk for some of these disorders is expressed. PGS are also confounded by environmental variance 59 and the reduction in effect sizes could be accounting for some of that confounding. PGS may add information beyond well-known risk factors, which could prove useful when information on certain exposures or behaviors is unavailable.
Further refinement of risk measures may improve our ability to develop screening protocols for those at greater risk of developing substance-related problems. Early detection has the potential to improve prevention efforts, as prior work suggests that those at highest risk of substance misuse stand to benefit the most from prevention efforts 60. Ideally, screening tools for SUD risk would include measures of social, clinical, and genetic risk factors, as each impacts the development of SUDs 5–7,61,62. In the push for precision medicine, very often the focus is on biological information, but social determinants of health are also critically important.
Currently, these tools are not ready for clinical use. If we reach the point where social, clinical, and genetic information become sufficiently powerful, we must recognize that identifying persons for early intervention carries a significant risk. Screening for social determinants has the potential for unintended consequences, including further stigmatization 63. Genetic information has even more potential for abuse and stigmatization. Policy makers must ensure that there is comprehensive legal protection against discrimination using any form of information. Additionally, any attempt to use social, clinical, or genetic information for targeted intervention or identification in a clinical setting must be done so in a patient-centered approach, rather than any “one-size fits all” that exclude patients from their own healthcare decisions 64.
Our analysis has several important limitations. First, although we included individuals of diverse ancestries, the PGS for our samples of African ancestries were severely underpowered due to the small size of the discovery sample. Large-scale GWAS in diverse cohorts are vital to ensuring that any benefit of precision medicine is shared equitably across the population 65. Second, while distinct, ancestry is related to race-ethnicity, and with it, racism and racial discrimination, some of the most profound social determinants of health66. Our measure of environmental risk may not fully capture risk factors that contribute to SUDs in populations beyond non-Hispanic whites. Future studies should include racially relevant measures of risk (e.g., experiences of interpersonal racism/discrimination, racial residential segregation) as well as other social and environmental measures that are known risk factors for SUDs (e.g., neighborhood social conditions, alcohol outlet density). Further refinement of known risk factors may allow for better prediction of those at risk of developing an SUD. Finally, while we tried to ensure time order between risk factors and onset of disorder, some risk factors (particularly adolescent substance use) could have occurred concurrently with diagnosis. Future work in samples with risk factors measured before the initiation of substance use (such as the Adolescent Brain Cognitive Development Study) will be important for replication efforts.
Recognizing that multiple social, clinical, and genetic factors contribute to risk for SUDs is important as we move towards the goal precision medicine that benefits all segments of the population. There is still much work to be done before tools such as these are useful in a clinical setting. However, the results of this integrative data analysis provide initial evidence these risk factors contribute unique information to SUDs in early adulthood. Expanding our sources of information (such as electronic health records, census data from home of record) and making use of increasingly well-powered PGS will continue to improve our ability to identify those who have the greatest risk of developing SUDs.
Data Availability
All data in the present study are available upon application and approval by the constituent cohorts.
The Externalizing Consortium
Principal Investigators: Danielle M. Dick, Philipp Koellinger, K. Paige Harden, Abraham A. Palmer. Lead Analysts: Richard Karlsson Linnér, Travis T. Mallard, Peter B. Barr, Sandra Sanchez-Roige. Significant Contributors: Irwin D. Waldman. The Externalizing Consortium has been supported by the National Institute on Alcohol Abuse and Alcoholism (R01AA015416 - administrative supplement), and the National Institute on Drug Abuse (R01DA050721). Additional funding for investigator effort has been provided by K02AA018755, U10AA008401, P50AA022537, as well as a European Research Council Consolidator Grant (647648 EdGe to Koellinger). The content is solely the responsibility of the authors and does not necessarily represent the official views of the above funding bodies. Add Health: Add Health is directed by Robert A. Hummer and funded by the National Institute on Aging cooperative agreements U01 AG071448 (Hummer) and U01AG071450 (Aiello and Hummer) at the University of North Carolina at Chapel Hill. Waves I-V data are from the Add Health Program Project, grant P01 HD31921 (Harris) from Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), with cooperative funding from 23 other federal agencies and foundations. Add Health was designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris at the University of North Carolina at Chapel Hill. ALSPAC: We are extremely grateful to all the families who took part in this study, the midwives for their help in recruiting them, and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists, and nurses. The UK Medical Research Council and Wellcome (Grant ref: 217065/Z/19/Z) and the University of Bristol provide core support for ALSPAC. This publication is the work of the authors, and Peter Barr and Danielle Dick will serve as guarantors for the contents of this paper. A comprehensive list of grants funding is available on the ALSPAC website (http://www.bristol.ac.uk/alspac/external/documents/grant-acknowledgements.pdf); This research was specifically funded by the Medical Research Council (MRC) under grants MR/L022206/1, MR/M006727/1, and G0800612/86812; the Wellcome Trust under grant 086684; and the National Institute on Alcohol Abuse and Alcoholism under 5R01AA018333-05. GWAS data was generated by Sample Logistics and Genotyping Facilities at Wellcome Sanger Institute and LabCorp (Laboratory Corporation of America) using support from 23andMe. COGA: We thank The Collaborative Study on the Genetics of Alcoholism (COGA), Principal Investigators B. Porjesz, V. Hesselbrock, T. Foroud; Scientific Director, A. Agrawal; Translational Director, D. Dick, includes eleven different centers: University of Connecticut (V. Hesselbrock); Indiana University (H.J. Edenberg, T. Foroud, Y. Liu, M.H. Plawecki); University of Iowa Carver College of Medicine (S. Kuperman, J. Kramer); SUNY Downstate Health Sciences University (B. Porjesz, J. Meyers, C. Kamarajan, A. Pandey); Washington University in St. Louis (L. Bierut, J. Rice, K. Bucholz, A. Agrawal); University of California at San Diego (M. Schuckit); Rutgers University (J. Tischfield, R. Hart, J. Salvatore); The Children’s Hospital of Philadelphia, University of Pennsylvania (L. Almasy); Virginia Commonwealth University (D. Dick); Icahn School of Medicine at Mount Sinai (A. Goate, P. Slesinger); and Howard University (D. Scott). Other COGA collaborators include: L. Bauer (University of Connecticut); J. Nurnberger Jr., L. Wetherill, X., Xuei, D. Lai, S. O’Connor, (Indiana University); G. Chan (University of Iowa; University of Connecticut); D.B. Chorlian, J. Zhang, P. Barr, S. Kinreich, G. Pandey (SUNY Downstate); N. Mullins (Icahn School of Medicine at Mount Sinai); A. Anokhin, S. Hartz, E. Johnson, V. McCutcheon, S. Saccone (Washington University); J. Moore, Z. Pang, S. Kuo (Rutgers University); A. Merikangas (The Children’s Hospital of Philadelphia and University of Pennsylvania); F. Aliev (Virginia Commonwealth University); H. Chin and A. Parsian are the NIAAA Staff Collaborators. We continue to be inspired by our memories of Henri Begleiter and Theodore Reich, founding PI and Co-PI of COGA, and also owe a debt of gratitude to other past organizers of COGA, including Ting-Kai Li, P. Michael Conneally, Raymond Crowe, and Wendy Reich, for their critical contributions. This national collaborative study is supported by NIH Grant U10AA008401 from the National Institute on Alcohol Abuse and Alcoholism (NIAAA) and the National Institute on Drug Abuse (NIDA). All code necessary to replicate this study is available upon request.
ETHICS DECLARATIONS
The authors have no conflicts of interest to declare.
ACKNOWLEDGEMENTS
Research reported in this publication was supported by the National Institute on Alcohol Abuse and Alcoholism and the National Institute of Drug Abuse of the National Institutes of Health under award numbers R01AA015416, R01DA050721, R01DA042090, and K02AA018755; the Academy of Finland (grants 100499, 205585, 118555, 141054, 265240, 308248, 308698 and 312073); and the Scientific and Technological Research Council of Turkey (TÜBİTAK) under award number 114C117 (FA); and the Sigrid Juselius Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of any of the funding bodies. This research also used summary data from the Psychiatric Genomics Consortium (PGC), the Million Veterans Program (MVP), the GWAS and Sequencing Consortium for Alcohol and Nicotine (GSCAN), UK Biobank, the Genomic Psychiatry Cohort (GPC) and 23andMe, Inc. We would like to thank the many studies that made these consortia possible, the researchers involved, and the participants in those studies, without whom this effort would not be possible. We would also like to thank the research participants and employees of 23andMe.
Footnotes
Additional analyses, author list updated; Supplemental files updated.