Abstract
Background Recent research has identified a potential protective effect of higher numbers of circulating lymphocytes on colorectal cancer (CRC) development. However, the importance of different lymphocyte subtypes and activation states in CRC development, and the biological pathways driving this relationship, remain poorly understood. Specifically, CD4+ T cells – a highly dynamic lymphocyte subtype – undergo remodelling upon activation to induce the expression of genes critical for their effector function. Previous studies investigating their role in CRC risk have used bulk tissue, limiting current understanding of these cells to static relationships.
Methods Here, we combined two genetic epidemiological methods – Mendelian randomisation (MR) and genetic colocalisation – to evaluate evidence for causal relationships of gene expression on CRC risk across multiple CD4+ T cell subtypes and activation states. Genetic proxies were obtained from single-cell transcriptomic data, allowing us to investigate the causal effect of expression of 1,805 genes across five CD4+ T cell activation states on CRC risk (78,473 cases; 107,143 controls). We repeated analyses stratified by CRC anatomical subsite and sex, and performed a sensitivity analysis to evaluate whether the observed effect estimates were likely to be CD4+ T cell-specific.
Results We identified six genes with evidence (FDR-P<0.05 in MR analyses and H4>0.8 in genetic colocalisation analyses) for a causal role of CD4+ T cell expression in CRC development – FADS2, FHL3, HLA-DRB1, HLA-DRB5, RPL28, and TMEM258. We observed differences in causal estimates of gene expression on CRC risk across different CD4+ T cell subtypes and activation timepoints, as well as CRC anatomical subsites and sex. However, our sensitivity analysis revealed that the genetic proxies used to instrument gene expression in CD4+ T cells also act as eQTLs in other tissues, highlighting the challenges of using genetic proxies to instrument tissue-specific expression changes.
Conclusions Our study demonstrates the importance of capturing the dynamic nature of CD4+ T cells in understanding disease risk, and prioritises genes for further investigation in cancer prevention research.
Competing Interest Statement
Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organization, the authors alone are responsible for the views expressed in this article and they do not necessarily represent the decisions, policy or views of the International Agency for Research on Cancer/World Health Organization.
Funding Statement
BD is supported by a Wellcome Trust studentship (218495/Z/19/Z) at the University of Bristol. EH is supported by a Cancer Research UK Population Research Committee Studentship (C18281/A30905). BD, EH, and EV are supported by the CRUK Integrative Cancer Epidemiology Programme (C18281/A29019), and are part of the Medical Research Council Integrative Epidemiology Unit at the University of Bristol which is supported by the Medical Research Council (MC_UU_00032/03) and the University of Bristol. LJG is supported by a Cancer Research UK (C18281/A29019) programme grant (the Integrative Cancer Epidemiology Programme). The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript. Czech Republic CCS: This work was supported by the Grant Agency of the Ministry of Health of the Czech Republic (grant AZV NU22J–03–00033), and the project National Institute for Cancer Research (Programme EXCELES, ID Project No. LX22NPO5102) – Funded by the European Union – Next Generation EU. The Colon Cancer Family Registry (CCFR, www.coloncfr.org) is supported in part by funding from the National Cancer Institute (NCI), National Institutes of Health (NIH) (award U01 CA167551). Support for case ascertainment was provided in part from the Surveillance, Epidemiology, and End Results (SEER) Program and the following U.S. state cancer registries: AZ, CO, MN, NC, NH; and by the Victoria Cancer Registry (Australia) and Ontario Cancer Registry (Canada). Additional funding for CCFR GWAS analysis was as follows: The CCFR Set–1 (Illumina 1M/1M–Duo) and Set–2 (Illumina Omni1–Quad) scans were supported by NIH awards U01 CA122839 and R01 CA143247 (to GC). The CCFR Set–3 (Affymetrix Axiom CORECT Set array) was supported by NIH awards U19 CA148107 and R01 CA81488 (to SBG).
The CCFR Set–4 (Illumina OncoArray 600K SNP array) was supported by NIH award U19 CA148107 (to SBG) and by the Center for Inherited Disease Research (CIDR), which is funded by the NIH to the Johns Hopkins University, contract number HHSN268201200008I. The Colon CFR graciously thanks the generous contributions of their study participants, dedication of study staff, and the financial support from the U.S. National Cancer Institute, without which this important registry would not exist. The content of this manuscript is solely the responsibility of the authors and does not necessarily reflect the views or policies of the NIH or any of the collaborating centers in the CCFR, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government, any cancer registry, or the CCFR.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Most GWAS data were accessed through the GWAS Catalog or the original study data availability sections. The GWAS of sex-specific and site-specific colorectal cancer risk in European ancestries was made available to the researchers upon application. Sources of all other GWAS data are listed in the Data Availability section of this manuscript.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data can be found in the manuscript, in the supplementary information, or in the links provided in the references. The GWAS of overall CRC in European ancestries can be accessed through the GWAS Catalog (https://www.ebi.ac.uk/gwas/) under accession no. GCST90129505. The data from which the genetic instruments were extracted for all MR analyses are available at https://trynkalab.sanger.ac.uk. All code used to carry out the analyses has been made publicly available on GitHub (https://github.com/bennydeslandes/CD4-T_cell_CRC). Further information on the TwoSampleMR package and PWCoCo can be found at https://github.com/MRCIEU/TwoSampleMR/ and https://github.com/jwr-git/pwcoco, respectively.
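For readers who want a concrete picture of the prioritisation rule stated in the Results (FDR-P<0.05 in MR analyses and H4>0.8 in colocalisation analyses), the following is a minimal Python sketch. It is not the authors' code, which is available at the GitHub link above. The gene names and values are invented for illustration, and the use of the Benjamini-Hochberg procedure is an assumption, since the abstract does not name the FDR method.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str        # hypothetical gene label
    mr_p: float      # raw MR p-value (made-up value)
    coloc_h4: float  # posterior probability of a shared causal variant (made-up value)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def prioritise(results, fdr_threshold=0.05, h4_threshold=0.8):
    """Keep genes passing both the FDR and the colocalisation threshold."""
    fdr = benjamini_hochberg([r.mr_p for r in results])
    return [
        r.gene
        for r, q in zip(results, fdr)
        if q < fdr_threshold and r.coloc_h4 > h4_threshold
    ]


# Illustrative input only; these are not results from the study.
demo_results = [
    GeneResult("GENE_A", 0.001, 0.95),
    GeneResult("GENE_B", 0.010, 0.50),
    GeneResult("GENE_C", 0.030, 0.85),
    GeneResult("GENE_D", 0.500, 0.99),
]

if __name__ == "__main__":
    print(prioritise(demo_results))
```

In this toy input, GENE_B passes the FDR threshold but fails colocalisation, and GENE_D shows the reverse pattern, so neither is retained. Requiring both criteria is what guards against MR signals driven by distinct variants in linkage disequilibrium.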
List of abbreviations
- CI: Confidence interval
- CRC: Colorectal cancer
- eQTL: Expression quantitative trait locus
- ER: Endoplasmic reticulum
- FADS: Fatty acid desaturase
- FDR: False discovery rate
- FHL3: Four and a Half LIM Domains 3
- GCTA-COJO: Conditional and joint multi-SNP analysis
- GECCO: Genetics and Epidemiology of Colorectal Cancer Consortium
- GTEx: Genotype-Tissue Expression
- GWAS: Genome-wide association study
- HLA: Human leukocyte antigens
- IFN: Interferon
- LA: Lowly active
- LD: Linkage disequilibrium
- MHC: Major histocompatibility complex
- MR: Mendelian randomisation
- mRNA: Messenger ribonucleic acid
- nTreg: Natural regulatory T cell
- OR: Odds ratio
- PUFAs: Polyunsaturated fatty acids
- PWCoCo: PairWise Conditional and Colocalisation
- RPL28: Ribosomal protein L28
- rsID: Reference SNP cluster IDs
- SD: Standard deviation
- SNP: Single-nucleotide polymorphism
- TMEM258: Transmembrane protein 258
- UPR: Unfolded protein response
- WBC: White blood cell