  • Letter

Transcriptional risk scores link GWAS to eQTLs and predict complications in Crohn's disease

Abstract

Gene expression profiling can be used to uncover the mechanisms by which loci identified through genome-wide association studies (GWAS) contribute to pathology¹,². Given that most GWAS hits are in putative regulatory regions and transcript abundance is physiologically closer to the phenotype of interest², we hypothesized that summation of risk-allele-associated gene expression, namely a transcriptional risk score (TRS), should provide accurate estimates of disease risk. We integrate summary-level GWAS and expression quantitative trait locus (eQTL) data with RNA-seq data from the RISK study, an inception cohort of pediatric Crohn's disease³,⁴. We show that TRSs based on genes regulated by variants linked to inflammatory bowel disease (IBD) not only outperform genetic risk scores (GRSs) in distinguishing Crohn's disease from healthy samples, but also serve to identify patients who in time will progress to complicated disease. Our dissection of eQTL effects may be used to distinguish genes whose association with disease is through promotion versus protection, thereby linking statistical association to biological mechanism. The TRS approach constitutes a potential strategy for personalized medicine that enhances inference from static genotypic risk assessment.
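
The idea of summing risk-allele-associated gene expression can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name, the input layout, and the +1/-1 direction codes are assumptions made here. Each gene's expression is z-scored across samples, the sign is flipped for genes whose risk allele is associated with lower expression, and the polarized values are summed per sample.

```python
from statistics import mean, pstdev


def transcriptional_risk_score(expression, risk_direction):
    """Sum z-scored expression per sample, polarized toward risk.

    expression: dict gene -> list of expression values, one per sample
        (hypothetical layout; all lists share the same sample order).
    risk_direction: dict gene -> +1 if the risk allele is associated with
        higher expression, -1 if it is associated with lower expression.
    """
    genes = list(expression)
    n_samples = len(expression[genes[0]])
    scores = [0.0] * n_samples
    for gene in genes:
        values = expression[gene]
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # a gene with no variation adds nothing to the score
        for i, value in enumerate(values):
            scores[i] += risk_direction[gene] * (value - mu) / sd
    return scores


# Toy data: three samples, two genes (all values are made up).
expr = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [3.0, 2.0, 1.0]}
direction = {"GENE_A": +1, "GENE_B": -1}
trs = transcriptional_risk_score(expr, direction)
```

In this toy case the third sample has high expression of a gene that rises with the risk allele and low expression of a gene that falls with it, so it receives the highest score.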

Figure 1: Transcriptional risk scores integrate GWAS and eQTL results to measure individual risk of disease based on transcript abundance.
Figure 2: Transcriptional risk scores based on ileal gene expression at diagnosis distinguish status and course of Crohn's disease.
Figure 3: Gene expression polarized according to predicted direction of risk uncovers two divergent mechanisms of association with disease.
Figure 4: Incoherent genes show similar patterns in stimulated immune cells and are more weakly associated with IBD according to GWAS.

Accession codes

Primary accessions

Gene Expression Omnibus

Referenced accessions

Gene Expression Omnibus

References

  1. Fairfax, B.P. & Knight, J.C. Genetics of gene expression in immunity to infection. Curr. Opin. Immunol. 30, 63–71 (2014).

  2. Gibson, G., Powell, J.E. & Marigorta, U.M. Expression quantitative trait locus analysis for translational medicine. Genome Med. 7, 60 (2015).

  3. Haberman, Y. et al. Pediatric Crohn disease patients exhibit specific ileal transcriptome and microbiome signature. J. Clin. Invest. 124, 3617–3633 (2014).

  4. Kugathasan, S. et al. Prediction of complicated disease course for children newly diagnosed with Crohn's disease: a multicentre inception cohort study. Lancet 389, 1710–1718 (2017).

  5. Witte, J.S., Visscher, P.M. & Wray, N.R. The contribution of genetic variants to disease depends on the ruler. Nat. Rev. Genet. 15, 765–776 (2014).

  6. Wray, N.R., Yang, J., Goddard, M.E. & Visscher, P.M. The genetic interpretation of area under the ROC curve in genomic profiling. PLoS Genet. 6, e1000864 (2010).

  7. Wray, N.R., Goddard, M.E. & Visscher, P.M. Prediction of individual genetic risk to disease from genome-wide association studies. Genome Res. 17, 1520–1528 (2007).

  8. Chatterjee, N., Shi, J. & García-Closas, M. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat. Rev. Genet. 17, 392–406 (2016).

  9. Walters, T.D. et al. Increased effectiveness of early therapy with anti–tumor necrosis factor-α vs an immunomodulator in children with Crohn's disease. Gastroenterology 146, 383–391 (2014).

  10. Liu, J.Z. et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat. Genet. 47, 979–986 (2015).

  11. Westra, H.J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. 45, 1238–1243 (2013).

  12. Kabakchiev, B. & Silverberg, M.S. Expression quantitative trait loci analysis identifies associations between genotype and gene expression in human intestine. Gastroenterology 144, 1488–1496 (2013).

  13. GTEx Consortium. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348, 648–660 (2015).

  14. Di Narzo, A.F. et al. Blood and intestine eQTLs from an anti-TNF-resistant Crohn's disease cohort inform IBD genetic association loci. Clin. Transl. Gastroenterol. 7, e177 (2016).

  15. Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).

  16. Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).

  17. Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).

  18. Lee, J.C. et al. Genome-wide association study identifies distinct genetic contributions to prognosis and susceptibility in Crohn's disease. Nat. Genet. 49, 262–268 (2017).

  19. Ning, K. et al. Improved integrative framework combining association data with gene expression features to prioritize Crohn's disease genes. Hum. Mol. Genet. 24, 4147–4157 (2015).

  20. Singh, T. et al. Characterization of expression quantitative trait loci in the human colon. Inflamm. Bowel Dis. 21, 251–256 (2015).

  21. Albert, F.W. & Kruglyak, L. The role of regulatory variation in complex traits and disease. Nat. Rev. Genet. 16, 197–212 (2015).

  22. Gibson, G. & Weir, B. The quantitative genetics of transcription. Trends Genet. 21, 616–623 (2005).

  23. de Souza, H.S. & Fiocchi, C. Immunopathogenesis of IBD: current state of the art. Nat. Rev. Gastroenterol. Hepatol. 13, 13–27 (2016).

  24. McGovern, D.P., Kugathasan, S. & Cho, J.H. Genetics of inflammatory bowel diseases. Gastroenterology 149, 1163–1176 (2015).

  25. Nabekura, T. et al. Costimulatory molecule DNAM-1 is essential for optimal differentiation of memory natural killer cells during mouse cytomegalovirus infection. Immunity 40, 225–234 (2014).

  26. Martinet, L. & Smyth, M.J. Balancing natural killer cell activation through paired receptors. Nat. Rev. Immunol. 15, 243–254 (2015).

  27. Petrillo, M.G. et al. GITR+ regulatory T cells in the treatment of autoimmune diseases. Autoimmun. Rev. 14, 117–126 (2015).

  28. Reikvam, D.H. et al. Increase of regulatory T cells in ileal mucosa of untreated pediatric Crohn's disease patients. Scand. J. Gastroenterol. 46, 550–560 (2011).

  29. Ye, C.J. et al. Intersection of population variation and autoimmunity genetics in human T cell activation. Science 345, 1254665 (2014).

  30. Wiley, S.E. et al. The outer mitochondrial membrane protein mitoNEET contains a novel redox-active 2Fe-2S cluster. J. Biol. Chem. 282, 23745–23749 (2007).

  31. Novak, E.A. & Mollen, K.P. Mitochondrial dysfunction in inflammatory bowel disease. Front. Cell Dev. Biol. 3, 62 (2015).

  32. Levine, A. et al. Pediatric modification of the Montreal classification for inflammatory bowel disease: the Paris classification. Inflamm. Bowel Dis. 17, 1314–1321 (2011).

  33. Satsangi, J., Silverberg, M.S., Vermeire, S. & Colombel, J.F. The Montreal classification of inflammatory bowel disease: controversies, consensus, and implications. Gut 55, 749–753 (2006).

  34. Cleynen, I. et al. Inherited determinants of Crohn's disease and ulcerative colitis phenotypes: a genetic association study. Lancet 387, 156–167 (2016).

  35. Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013).

  36. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  37. Anders, S., Pyl, P.T. & Huber, W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).

  38. Robinson, M.D., McCarthy, D.J. & Smyth, G.K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).

  39. Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008).

  40. Onengut-Gumuscu, S. et al. Fine mapping of type 1 diabetes susceptibility loci and evidence for colocalization of causal variants with lymphoid gene enhancers. Nat. Genet. 47, 381–386 (2015).

  41. Leek, J.T., Johnson, W.E., Parker, H.S., Jaffe, A.E. & Storey, J.D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).

  42. Mecham, B.H., Nelson, P.S. & Storey, J.D. Supervised normalization of microarrays. Bioinformatics 26, 1308–1315 (2010).

  43. Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).

  44. Chun, S. et al. Limited statistical evidence for shared genetic effects of eQTLs and autoimmune-disease-associated loci in three major immune-cell types. Nat. Genet. 49, 600–605 (2017).

  45. Lee, M.N. et al. Common genetic variants modulate pathogen-sensing responses in human dendritic cells. Science 343, 1246980 (2014).

Acknowledgements

We are grateful to B. Zeng, D. Arafat, H. Somineni, S. Venkateswaran, and colleagues from the Gibson and Kugathasan laboratories for their support and helpful comments. We also would like to thank I. Mendizabal, J. Lachance and K. Jordan for comments on the manuscript. This research was supported by Project 3 (G.G., PI) of the NIH program project “Statistical and Quantitative Genetics” grant P01-GM0996568 (B. Weir, University of Washington, Director) as well as research grants from the Crohn's and Colitis Foundation of America (CCFA), New York, to the individual study institutions participating in the RISK study.

Author information

Contributions

U.M.M. and G.G. conceived the theoretical framework for the TRSs. L.A.D., J.S.H. and S.K. participated in the conception and design of the RISK study. K.M., J.P., T.D.W., A.G., J.D.N., W.V.C., J.R.R., D.R.M., R.K., M.B.H., S.S.B., M.C.S., R.N.B., J.F.M., M.C.D., B.J.A., M.-O.K. and J.C. recruited subjects, collected the data, and worked on its curation and analysis. U.M.M. performed the TRS analyses. U.M.M. and G.G. interpreted the results and drafted the manuscript, while L.A.D., J.S.H. and S.K. assisted with results interpretation and writing.

Corresponding author

Correspondence to Greg Gibson.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Replicability of blood eQTL effects in ileal biopsies from the RISK study.

eQTLs detected in the vicinity of SNPs associated with IBD tend to show concordant effect size and direction in blood and ileum. The effects of 136 eQTLs available in ileum are shown (Supplementary Table 2). The x axis shows the β values for the eQTLs detected in peripheral blood from the Blood eQTL browser; the y axis shows the β values in the eQTL mapping study with ileal biopsies from the RISK study (see “Mapping study in the RISK cohort to build the ileal TRS” in the Online Methods). The dashed best-fitting least-squares regression line corresponds to Spearman r = 0.54 (P = 2 × 10⁻¹¹). Values in the corners indicate the percentage of loci in each quadrant, showing that 70% are concordant in direction of effect in the two tissues (P = 1.7 × 10⁻⁶, sign test).
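
A directional concordance check of this kind can be sketched as below. The function name and the toy β values are assumptions for illustration, and the exact two-sided binomial form of the sign test is one common choice that may differ from the test variant used for the figure.

```python
from math import comb


def sign_concordance(beta_blood, beta_ileum):
    """Return (fraction of pairs concordant in sign, two-sided sign-test P)."""
    # Drop pairs with a zero effect, which have no defined direction.
    pairs = [(b, i) for b, i in zip(beta_blood, beta_ileum) if b != 0 and i != 0]
    n = len(pairs)
    k = sum(1 for b, i in pairs if (b > 0) == (i > 0))
    # Exact binomial tail under the null that each direction is equally likely.
    tail = sum(comb(n, j) for j in range(max(k, n - k), n + 1)) / 2 ** n
    return k / n, min(1.0, 2 * tail)


# Toy effect sizes for six hypothetical eQTLs (values are made up).
blood = [0.5, -0.2, 0.3, 0.1, -0.4, 0.6]
ileum = [0.4, -0.1, 0.2, -0.3, -0.5, 0.1]
fraction, p_value = sign_concordance(blood, ileum)
```

With only six toy pairs the test has little power; the point is the mechanics of counting quadrants and comparing the count to a fair coin.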

Supplementary Figure 2 Performance of the GRS and TRS based on the initial set of 157 candidate genes.

(a,b) Each plot shows the TRS based on 157 IBD genes associated with 96 eQTLs that are also associated with IBD or in LD with a SNP associated with the disease (Supplementary Table 1). The discriminatory performance of the GRS versus TRS based on these genes is shown for disease status: comparison of samples with Crohn’s disease (n = 210) and controls (n = 35) (a) and disease course (3-year period after diagnosis): comparison of samples that remain in non-complicated Crohn’s disease (B1; n = 183) and those that develop complicated disease (B2 and/or B3; n = 27) (b). The standardized GRS and TRS are shown on the y axis. Differences between groups (in s.d. units) along with P values (two-sided t test) are reported for each comparison.
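
The group comparison reported in s.d. units can be illustrated with a short sketch. The function name and toy scores are hypothetical, and the Welch (unequal-variance) form of the t statistic is an assumption, since the legend does not say which variant of the two-sided t test was used. The P value is left out because it requires a t distribution that the standard library does not provide.

```python
from statistics import mean, pstdev, variance


def standardized_group_difference(scores, is_case):
    """Return (difference of group means in s.d. units, Welch t statistic)."""
    mu, sd = mean(scores), pstdev(scores)
    z = [(s - mu) / sd for s in scores]  # standardize across all samples
    cases = [v for v, c in zip(z, is_case) if c]
    controls = [v for v, c in zip(z, is_case) if not c]
    diff = mean(cases) - mean(controls)
    # Standard error with per-group sample variances (Welch form).
    se = (variance(cases) / len(cases) + variance(controls) / len(controls)) ** 0.5
    return diff, diff / se


# Toy risk scores for three controls followed by three cases (made up).
toy_scores = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
toy_is_case = [False, False, False, True, True, True]
diff_sd, t_stat = standardized_group_difference(toy_scores, toy_is_case)
```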

Supplementary Figure 3 Selection of genes based on SMR and coloc results.

(a) Each point represents the −log₁₀(P value) (NLP) for the blood eQTL association and Crohn’s disease GWAS association for 157 candidate genes. Colors represent the significance of the SMR statistic, clearly showing that the most highly significant genes are strongly associated with both traits. Similar plots are observed for ulcerative colitis and IBD. All 39 genes with SMR P < 2.3 × 10⁻⁴ (red and brown dots) for all three disease classifications were included in the final SMR-based TRS. (b) The coloc H4 score estimates the posterior probability that the same causal variant drives both the GWAS and eQTL associations. This plot shows that poor SMR values (small NLPs) tend also to have low coloc H4 scores; however, only approximately half of the strong SMR values (large NLPs) have strong coloc H4 posterior probabilities. The 29 genes with coloc H4 greater than 0.8 for the three disease phenotypes were included in the final coloc-based TRS. This includes 14 genes not in the SMR set.
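
The two selection rules can be sketched as a simple filter. The thresholds come from the legend; the data layout, the phenotype labels, and the gene names are hypothetical. The sketch assumes a gene must pass the threshold in each of the three phenotypes, which is one reading of the legend.

```python
SMR_P_THRESHOLD = 2.3e-4   # from the legend
COLOC_H4_THRESHOLD = 0.8   # from the legend
PHENOTYPES = ("CD", "UC", "IBD")  # hypothetical labels for the three phenotypes


def select_genes(results):
    """results: dict gene -> phenotype -> {"smr_p": float, "coloc_h4": float}."""
    smr_set = {
        gene for gene, by_pheno in results.items()
        if all(by_pheno[p]["smr_p"] < SMR_P_THRESHOLD for p in PHENOTYPES)
    }
    coloc_set = {
        gene for gene, by_pheno in results.items()
        if all(by_pheno[p]["coloc_h4"] > COLOC_H4_THRESHOLD for p in PHENOTYPES)
    }
    return smr_set, coloc_set


def _row(smr_p, h4):
    return {"smr_p": smr_p, "coloc_h4": h4}


# Toy statistics for three made-up genes.
toy_results = {
    "GENE_A": {"CD": _row(1e-6, 0.95), "UC": _row(1e-5, 0.90), "IBD": _row(1e-7, 0.97)},
    "GENE_B": {"CD": _row(1e-5, 0.40), "UC": _row(2e-5, 0.35), "IBD": _row(1e-6, 0.50)},
    "GENE_C": {"CD": _row(1e-3, 0.85), "UC": _row(5e-3, 0.90), "IBD": _row(2e-3, 0.88)},
}
smr_genes, coloc_genes = select_genes(toy_results)
coloc_only = coloc_genes - smr_genes  # genes selected by coloc but not by SMR
```

The toy case mirrors the pattern in the legend: a gene can pass SMR with a weak coloc H4 score, and a gene can pass coloc without reaching the SMR threshold.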

Supplementary Figure 4 Relationship between transcriptional risk scores and location of inflammation.

Because the Paris classification of pediatric Crohn’s disease includes location of disease, which was strongly correlated with the degree of inflammation in the ileum from which biopsies were obtained, we plot here the relationship between disease location and the 29-gene coloc-derived TRS. RISK study patients were classified into two categories according to the presence/absence of visible ileal inflammation in endoscopies performed at diagnosis (L1 (ileum-only) and L3 (ileocolonic) cases were classified as ‘inflamed’; L2 (colonic-only) cases were classified as ‘non-inflamed’). Only two of the cases that progressed to complicated disease were non-inflamed, which are not shown owing to low sample size. The TRS is slightly elevated in inflamed versus endoscopically non-inflamed B1 cases (P < 0.02) and is also elevated in B1 cases with non-inflamed ilea as compared to non-IBD controls (P < 1 × 10⁻⁶), confirming that the TRS picks up a signal that is related but complementary to inflammation. Complicated cases have an elevated TRS even relative to inflamed B1 cases (P < 7 × 10⁻⁴). A box plot of values is shown for each group along with P values for pairwise comparisons (two-sided t test).

Supplementary Figure 5 Performance of the GRS and TRS based on 39 susceptibility genes detected by SMR.

(a,b) Thirty-nine genes were detected by SMR as being under the control of 29 causal variants that account for the association detected by GWAS and the eQTL effect reported in the Blood eQTL browser (Supplementary Table 4). The performance of the GRS versus TRS based on these genes is shown for disease status: comparison of samples with Crohn’s disease (n = 210) versus non-IBD controls (n = 35) (a) and disease course (3-year period after diagnosis): comparison of samples that remain in non-complicated Crohn’s disease (B1; n = 183) versus those that develop complicated disease (B2 and/or B3; n = 27) (b). The standardized GRS and TRS are shown on the y axis. Differences between groups (in s.d. units) along with P values (two-sided t test) are reported for each comparison.

Supplementary Figure 6 Performance of PRSs based on LD-pruned variants at different significance inclusion thresholds.

(a,b) PRSs at different thresholds (Online Methods) successfully separate Crohn’s disease cases from non-IBD controls (a) but fail to distinguish according to development of complicated disease (b). The performance of PRSs using SNPs that pass a range of liberal P-value thresholds in GWAS analysis is shown (the inclusion threshold and total number of variants used are reported on the y axis). Differences between groups (in s.d. units) along with P values for each comparison are reported on the x axis.
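
Scoring at several inclusion thresholds can be sketched as below. This is a generic thresholded polygenic score, assuming effect-size weighting and an already LD-pruned SNP list; the Online Methods may define the score differently, and the SNP identifiers, effect sizes, and thresholds here are made up.

```python
def polygenic_risk_score(dosages, snp_stats, p_threshold):
    """Return (score, number of SNPs used) for one individual.

    dosages: dict snp -> count of the effect allele (0, 1 or 2).
    snp_stats: dict snp -> (beta, p_value) from GWAS summary statistics,
        assumed here to be LD-pruned already.
    """
    included = [
        snp for snp, (_, p) in snp_stats.items()
        if p < p_threshold and snp in dosages
    ]
    score = sum(snp_stats[snp][0] * dosages[snp] for snp in included)
    return score, len(included)


# Toy summary statistics and genotypes (hypothetical identifiers and values).
toy_stats = {"rs_a": (0.30, 1e-9), "rs_b": (0.10, 1e-4), "rs_c": (-0.05, 0.02)}
toy_dosages = {"rs_a": 2, "rs_b": 1, "rs_c": 1}
by_threshold = {
    t: polygenic_risk_score(toy_dosages, toy_stats, t)
    for t in (5e-8, 1e-3, 0.05)
}
```

Relaxing the threshold admits more variants with weaker evidence, which is why both the threshold and the number of variants are reported alongside each score in the figure.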

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–6 (PDF 1128 kb)

Life Sciences Reporting Summary (PDF 170 kb)

Supplementary Table 1

eQTL association data in peripheral blood for 232 SNPs associated with IBD with genes <1 Mb away (7,389 SNP–gene pairs). (XLSX 1169 kb)

Supplementary Table 2

Replicability of blood eQTL effects in ileal tissue from the RISK study. (XLSX 58 kb)

Supplementary Table 3

coloc results for 163 SNP–gene pairs selected from the Blood eQTL browser. (XLSX 79 kb)

Supplementary Table 4

SMR results for 163 SNP–gene pairs selected from the Blood eQTL browser. (XLSX 86 kb)

Supplementary Table 5

eQTL association and coloc results for 46 genes controlled by SNPs associated with IBD in the RISK ileal eQTL mapping study. (XLSX 60 kb)

Cite this article

Marigorta, U., Denson, L., Hyams, J. et al. Transcriptional risk scores link GWAS to eQTLs and predict complications in Crohn's disease. Nat Genet 49, 1517–1521 (2017). https://doi.org/10.1038/ng.3936

  • DOI: https://doi.org/10.1038/ng.3936
