medRxiv

Genome-wide association study across five cohorts identifies five novel loci associated with idiopathic pulmonary fibrosis

Richard J Allen, Amy Stockwell, Justin M Oldham, Beatriz Guillen-Guio, Carlos Flores, Imre Noth, Brian L Yaspan, R Gisli Jenkins, Louise V Wain, International IPF Genetics Consortium
doi: https://doi.org/10.1101/2021.12.06.21266509
Richard J Allen
1Department of Health Sciences, University of Leicester, Leicester, UK
  • For correspondence: rja34@leicester.ac.uk
Amy Stockwell
2Genentech, South San Francisco, USA
Justin M Oldham
3Department of Internal Medicine, University of California Davis, Davis, USA
Beatriz Guillen-Guio
1Department of Health Sciences, University of Leicester, Leicester, UK
Carlos Flores
4Research Unit, Hospital Universitario Ntra. Sra. de Candelaria, Santa Cruz de Tenerife, Spain
5CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
6Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
Imre Noth
7Division of Pulmonary & Critical Care Medicine, University of Virginia, Charlottesville, USA
Brian L Yaspan
2Genentech, South San Francisco, USA
R Gisli Jenkins
8National Heart and Lung Institute, Imperial College London, London, UK
Louise V Wain
1Department of Health Sciences, University of Leicester, Leicester, UK
9National Institute for Health Research, Leicester Respiratory Biomedical Research Centre, Glenfield Hospital, Leicester, UK

Abstract

Idiopathic pulmonary fibrosis (IPF) is a chronic lung condition with poor survival times. We previously published a genome-wide meta-analysis of IPF risk across three studies with independent replication of associated variants in two additional studies. To maximise power and to generate more accurate effect size estimates, we performed a genome-wide meta-analysis across all five studies included in the previous IPF risk GWAS. We utilised the distribution of effect sizes across the five studies to assess the replicability of the results and identified five robust novel genetic association signals implicating mTOR signalling, telomere maintenance and spindle assembly genes in IPF risk.

Introduction

Idiopathic pulmonary fibrosis (IPF) is a chronic lung disease believed to result from an aberrant response to alveolar injury leading to a build-up of scar tissue. This progressive scarring is eventually fatal with half of individuals dying within 3 to 5 years of diagnosis1. The cause of IPF is unknown but genetics play an important role in how susceptible an individual is to IPF2.

Genome-wide association studies (GWAS) are an approach whereby genetic variants from across the genome are tested for their association with a disease. Genetic loci identified by GWAS can implicate genes important in disease pathogenesis and drugs which target the products encoded by these genetically-supported genes are twice as likely to be successful during development. The genetic association statistics from a GWAS are also widely used to identify causal markers of disease through Mendelian randomisation, to conduct heritability estimation and for genetic correlation analyses. It is therefore important that sample sizes are maximised to ensure sufficient statistical power to detect genetic associations and to generate precise effect size estimates.

We recently published a GWAS of IPF risk2. The discovery GWAS consisted of three studies (named as the UK, Chicago and Colorado studies) and a replication analysis performed in two independent studies (named as the UUS [USA, UK and Spain] and Genentech studies). This analysis reported 14 genetic signals which implicated host defence, cell-cell adhesion, spindle assembly, TGF-β signalling regulation and telomere maintenance as important biological processes involved in IPF disease risk.

We here present a meta-analysis of genome-wide data from all 5 datasets included in our previous study. The results of this analysis implicate new genetic loci in IPF pathogenesis and provide a unique resource for other studies of IPF risk and pathogenesis.

Methods

Quality control and sample selection have been previously described2. In summary, all datasets comprised unrelated European-ancestry individuals. Individuals in the Genentech study were sequenced using the HiSeq X Ten platform (Illumina), and all other individuals were imputed from genotyping data using the HRC reference panel3. Genome-wide analyses were performed in each study separately using an additive logistic regression model adjusting for the first 10 genetic principal components to account for population stratification.
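As an illustration only, the per-study association model described above can be written as follows. The symbols are our own notation and are not taken from the original analysis: y_i is case status, g_i is the count (or imputed dosage) of the effect allele for individual i, and PC_ik is the k-th genetic principal component.

```latex
\[
\operatorname{logit}\,\Pr(y_i = 1)
  = \beta_0 + \beta_G\, g_i + \sum_{k=1}^{10} \gamma_k\, \mathrm{PC}_{ik},
\qquad g_i \in [0, 2]
\]
```

Under this additive coding, exp(β_G) is the per-allele odds ratio, and the estimate of β_G and its standard error from each study are the quantities carried forward to a meta-analysis.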

The individual GWAS results from the five studies were combined in an inverse-variance weighted fixed-effect meta-analysis using METAL4. Variants were included in the meta-analysis if they were available in at least four studies. Genomic control was applied to the meta-analysis results using the LD score regression intercept to account for inflation not explained by polygenic effects5. Significant variants were defined as those with meta-analysis p<5×10⁻⁸, and conditional analyses were performed using GCTA-COJO to identify additional independent associated variants6. Independent associated variants were defined as variants that remained genome-wide significant after conditioning on the most significant variant (sentinel) in the region and that had consistent effect size estimates in the conditional and non-conditional analyses. Annotation of the sentinel variants was then performed using Variant Effect Predictor7.
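The following is a minimal sketch of the inverse-variance weighted fixed-effect calculation for a single variant. It is not the METAL implementation. The function name `ivw_fixed_effect`, its parameters, and the way the LD score regression intercept is applied (inflating the standard error by the square root of the intercept, which is equivalent to dividing the chi-squared statistic by the intercept) are assumptions made for illustration.

```python
import math
from typing import Optional, Sequence, Tuple


def ivw_fixed_effect(
    betas: Sequence[Optional[float]],
    ses: Sequence[Optional[float]],
    min_studies: int = 4,          # assumption: mirrors the "at least four studies" rule
    ldsc_intercept: float = 1.0,   # assumption: intercept estimated elsewhere
) -> Optional[Tuple[float, float, float]]:
    """Return (beta, se, two-sided p) for one variant, or None if too few studies."""
    # Keep only studies in which the variant was available.
    pairs = [(b, s) for b, s in zip(betas, ses) if b is not None and s is not None]
    if len(pairs) < min_studies:
        return None

    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / (s * s) for _, s in pairs]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, pairs)) / total
    se = math.sqrt(1.0 / total)

    # Illustrative genomic control: only correct when the intercept exceeds 1.
    se *= math.sqrt(max(ldsc_intercept, 1.0))

    # Two-sided p value from the normal distribution.
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors, one entry per study;
    # None marks a study in which the variant was not available.
    print(ivw_fixed_effect([0.20, 0.25, None, 0.15, 0.22],
                           [0.10, 0.12, None, 0.09, 0.11]))
```

Because the weights are 1/SE², larger studies with more precise estimates dominate the pooled effect, which is why combining all five studies narrows the confidence intervals.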

To assess the robustness of novel results, we tested the strength and consistency of results across studies using MAMBA (Meta-Analysis Model-based Assessment of replicability)8. Variants with a posterior probability of replicability (PPR)≥90% were considered robust and likely to replicate should additional independent datasets become available.

Genome-wide summary statistics can be accessed at https://github.com/genomicsITER/PFgenetics.

Results

A total of 4,125 cases, 20,464 controls and 7,554,248 genetic variants were included in the analysis (Figure 1). The UUS study included one additional case (due to resolving a sample ID issue since the previous publication) and one fewer control (where the individual has since withdrawn consent from UK Biobank) than described in the previous GWAS2.

Figure 1: Study design and sample sizes

After conditional analyses, there were 23 independent signals with p<5×10⁻⁸ in the genome-wide meta-analysis (Figure 2). These 23 signals included all 14 associations reported in the previous GWAS (Supplementary Table 1). Of the nine novel genetic associations (Table 1), five showed evidence of replicability (PPR≥90%). The sentinel variants of these five loci included variants in introns of KNL1, NPRL3, STMN3 and RTEL1, and an intergenic variant in 10q25.1. All five novel variants had a consistent direction of effect across all of the individual studies and reached nominal significance (p<0.05) in at least three of the studies.

Figure 2: Manhattan plot.

Each point shows a genetic variant with chromosomal position on the x axis and the –log(p value) on the y axis. The grey dashed line shows the genome-wide significance level (p=5×10⁻⁸). Each signal is labelled with the gene implicated by that signal. Genes in grey are the novel loci that do show evidence of replicability. The plot has been truncated at p=10⁻³⁰.

Table 1: Sentinel variants of novel associations.

Novel variants are defined as those not reaching significance criteria in previous analysis2 (the RTEL1 and OBFC1 signals have previously shown a possible association – see discussion). Effect sizes and directions are given in terms of the allele that increases risk of IPF. Chr=Chromosome. Position is based on genetic build 37. Annotation obtained from Variant Effect Predictor7. EAF=Effect allele frequency calculated across the five studies. The “Direction” column shows the direction of the beta in each of the five individual studies (+ means beta>0, − means beta<0). The “Study p≤0.05” column denotes the individual studies in which the variant reached nominal significance (Y means p≤0.05, N means p>0.05). Both the “Direction” and “Study p≤0.05” columns are given in the order UK, Colorado, Chicago, UUS and then Genentech. OR=Odds ratio. CI=Confidence interval. PPR=posterior probability of replicability calculated using MAMBA8. a The signal at KNL1 is independent of the previously reported nearby signal in the IVD gene. b The RTEL1 and STMN3 signals are independent of each other.
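For readers working with the table programmatically, the sketch below shows one way to decode the “Direction” and “Study p≤0.05” columns using the study order given in the legend. The function name `summarise_row` and the example strings are hypothetical and are not values from Table 1.

```python
from typing import Dict, List, Union

# Study order as stated in the table legend.
STUDY_ORDER = ("UK", "Colorado", "Chicago", "UUS", "Genentech")


def summarise_row(direction: str, nominal: str) -> Dict[str, Union[bool, int, List[str]]]:
    """Decode one row's per-study direction and nominal-significance flags."""
    # Accept both the ASCII hyphen and the Unicode minus sign used in the legend.
    direction = direction.replace("\u2212", "-")
    if len(direction) != len(STUDY_ORDER) or len(nominal) != len(STUDY_ORDER):
        raise ValueError("expected one character per study")
    if set(direction) - {"+", "-"} or set(nominal) - {"Y", "N"}:
        raise ValueError("unexpected character in direction or significance flags")

    significant_in = [s for s, flag in zip(STUDY_ORDER, nominal) if flag == "Y"]
    return {
        # True when every study estimates the effect in the same direction.
        "consistent_direction": len(set(direction)) == 1,
        "significant_in": significant_in,
        "n_significant": len(significant_in),
    }


if __name__ == "__main__":
    # Hypothetical row: same direction everywhere, nominally significant in three studies.
    print(summarise_row("+++++", "YNYYN"))
```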

Discussion

By increasing the number of cases in the discovery analysis by more than 50% compared with the previous IPF risk GWAS, we identified novel genetic signals associated with IPF risk and improved the precision of estimations for previously reported signals. The five novel loci had internal evidence of replicability giving us confidence that these signals are likely to be generalisable.

The signals in RTEL1 and OBFC1 have been reported previously but did not meet the significance criteria of the previous three-way GWAS2. The new MAMBA analysis suggests that the consistency of effect across studies provides high confidence that the RTEL1 signal will replicate should an independent dataset become available. This is not the case for the OBFC1 signal where a low posterior probability of replication suggests that there may be heterogeneity in effect across the contributing studies.

The novel signals require further characterisation to determine the likely causal gene and underlying functional effect of the variants. However, some of the genes that are closest to these new signals have strong candidacy for involvement in IPF pathogenesis. NPRL3 encodes a component of the GATOR1 complex and acts through mTORC1 signalling to inhibit mTOR kinase activity9. mTOR regulates TGF-β collagen synthesis and inhibiting mTOR leads to increased deposition of scar tissue10. We previously reported an association implicating DEPTOR, another mTOR inhibiting gene. We also add to the evidence that cellular ageing plays a key role in IPF pathogenesis through associations at the telomere maintenance genes TERT, TERC and RTEL1. We previously reported associations in spindle assembly genes (MAD1L1 and KIF15) and have identified a novel genetic association in another spindle assembly gene, KNL1 (Kinetochore Scaffold 1, also known as CASC5). STMN3 (Stathmin 3) implicates another cell replication process through tubulin binding9.

By maximising the statistical power of the analysis, we identified novel genetic associations with IPF risk. These signals may implicate biologically relevant genes that support the importance of TGF-β signalling and cell replication as important processes in disease pathogenesis.

Data Availability

Genome-wide summary statistics can be accessed at https://github.com/genomicsITER/PFgenetics.

Ethics Statement

This research was conducted using previously published work with appropriate ethics approval. The PROFILE study (which provided samples for the UK and UUS studies) had institutional ethics approval at the University of Nottingham (NCT01134822 – ethics reference 10/H0402/2) and Royal Brompton and Harefield NHS Foundation Trust (NCT01110694 – ethics reference 10/H0720/12). Spanish samples were recruited under ethics approval by ethics committee from the Hospital Universitario N.S. de Candelaria (reference of the approval: PI-19/12). The UUS study also included individuals from clinical trials with ethics approval (ACE [NCT00957242] and PANTHER [NCT00650091]). UK samples were recruited across multiple sites with individual ethics approval (University of Edinburgh Research Ethics Committee [The Edinburgh Lung Fibrosis Molecular Endotyping (ELFMEN) Study NCT04016181] 17/ES/0075, NRES Committee South West – Southmead, Yorkshire and Humber Research Ethics Committee 08/H1304/54 and Nottingham Research Ethics Committee 09/H0403/59). For individuals recruited at the University of Chicago, consenting patients with IPF who were prospectively enrolled in the institutional review board-approved ILD registry (IRB#14163A) were included. Individuals recruited at the University of Pittsburgh Medical Centre had ethics approval from the University of Pittsburgh Human Research Protection Office (reference STUDY20030223: Genetic Polymorphisms in IPF). Individuals from the COMET (NCT01071707) and Lung Tissue Research Consortium (NCT02988388) studies were also included in the Chicago study. All subjects in the Colorado study gave written informed consent as part of IRB-approved protocols for their recruitment at each site and the GWAS study was approved by the National Jewish Health IRB and Colorado Combined Institutional Review Boards (COMIRB). Subjects in the Genentech study provided written informed consent for whole-genome sequencing of their DNA. 
Ethical approval was provided as per the original clinical trials (INSPIRE [NCT00075998], RIFF [NCT01872689], CAPACITY [NCT00287729 and NCT00287716] and ASCEND [NCT01366209]). For the UCSF cohort, sample and data collection were approved by the University of California San Francisco Committee on Human Research and all patients provided written informed consent. For the Vanderbilt cohort, the Institutional Review Boards from Vanderbilt University approved the study and all participants provided written informed consent before enrolment.

Conflicts of Interest and Funding

R Allen is an Action for Pulmonary Fibrosis Mike Bray Research Fellow. A Stockwell and B Yaspan are employees of Genentech/Roche and hold stock and stock options in Roche. J Oldham reports National Institutes of Health/National Heart, Lung and Blood Institute grants R56HL158935 and K23HL138190 and personal fees from Boehringer Ingelheim, Genentech, United Therapeutics, AmMax Bio and Lupin pharmaceuticals unrelated to the submitted work. B Guillen-Guio is supported by Wellcome Trust grant 221680/Z/20/Z. G Jenkins is a trustee of Action for Pulmonary Fibrosis and reports personal fees from AstraZeneca, Biogen, Boehringer Ingelheim, Bristol Myers Squibb, Chiesi, Daewoong, Galapagos, Galecto, GlaxoSmithKline, Heptares, NuMedii, PatientMPower, Pliant, Promedior, Redx, Resolution Therapeutics, Roche, Veracyte and Vicore. L Wain holds a GSK/British Lung Foundation Chair in Respiratory Research (C17-1). The research was partially supported by the National Institute for Health Research (NIHR) Leicester Biomedical Research Centre; the views expressed are those of the author(s) and not necessarily those of the National Health Service (NHS), the NIHR or the Department of Health. The UK and UUS studies selected controls from UK Biobank under application 648. This research used the SPECTRE High Performance Computing Facility at the University of Leicester.

Supplement

International IPF Genetics Consortium

Writing Group

Richard J Allen, Carlos Flores, Beatriz Guillen-Guio, R Gisli Jenkins, Imre Noth, Justin M Oldham, Amy Stockwell, Louise V Wain, Brian L Yaspan

UK Study

Richard J Allen, Helen L Booth, William A Fahy, Ian P Hall, Simon P Hart, Mike R Hill, Nik Hirani, Richard B Hubbard, R Gisli Jenkins, Toby M Maher, Robin J McAnulty, Ann B Millar, Philip L Molyneaux, Vidya Navaratnam, Eunice Oballa, Helen Parfrey, Gauri Saini, Ian Sayers, Martin D Tobin, Louise V Wain, Moira K B Whyte

Chicago Study

Ayodeji Adegunsoye, Carlos Flores, Naftali Kaminski, Shwu-Fan Ma, Imre Noth, Justin M Oldham, Mary E Strek, Yingze Zhang

Colorado Study

Tasha E Fingerlin, David A Schwartz

UUS Study

Richard J Allen, Carlos Flores, Beatriz Guillen-Guio, R Gisli Jenkins, Shwu-Fan Ma, Toby M Maher, Maria Molina-Molina, Philip L Molyneaux, Imre Noth, Justin M Oldham, Louise V Wain

Genentech study

Margaret Neighbors, Xuting Sheng, Amy Stockwell, Brian L Yaspan

Supplementary Table 1: All sentinel variants associated with IPF risk

This table includes the most associated variant (sentinel) for the 19 signals (14 previously reported loci and the five novel loci identified here) associated with IPF risk. The risk allele is the allele associated with increased risk of IPF. Position is for genetic build 37. Chr=chromosome. EAF=Effect allele frequency. OR=Odds ratio. CI=Confidence interval.

References

  1. Lederer DJ, Martinez FJ. Idiopathic pulmonary fibrosis. N Engl J Med. 2018;378(19):1811–1823.
  2. Allen RJ, Guillen-Guio B, Oldham JM, et al. Genome-wide association study of susceptibility to idiopathic pulmonary fibrosis. American journal of respiratory and critical care medicine. 2020;201(5):564–574.
  3. McCarthy S, Das S, Kretzschmar W, Durbin R, Abecasis G, Marchini J. A reference panel of 64,976 haplotypes for genotype imputation. bioRxiv. 2016:035170.
  4. Willer CJ, Li Y, Abecasis GR. METAL: Fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010;26(17):2190–2191.
  5. Bulik-Sullivan BK, Loh P, Finucane HK, et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet. 2015;47(3):291–295.
  6. Yang J, Ferreira T, Morris AP, et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet. 2012;44(4):369–75, S1-3.
  7. McLaren W, Gil L, Hunt SE, et al. The ensembl variant effect predictor. Genome Biol. 2016;17(1):122.
  8. McGuire D, Jiang Y, Liu M, et al. Model-based assessment of replicability for genome-wide association meta-analysis. Nature communications. 2021;12(1):1–14.
  9. Stelzer G, Rosen R, Plaschkes I, et al. GeneCards – the human gene database. The GeneCards suite: From gene data mining to disease genome sequence analysis. Current protocols in bioinformatics. 2016(54):1.30.1-1.30.33.
  10. Woodcock HV, Eley JD, Guillotin D, et al. The mTORC1/4E-BP1 axis represents a critical signaling node during fibrogenesis. Nature communications. 2019;10(1):6.
Posted December 07, 2021.