Abstract
Identification of somatic driver mutations in the noncoding genome remains challenging. To comprehensively characterize noncoding driver mutations for pancreatic ductal adenocarcinoma (PDAC), we first created genome-scale maps of accessible chromatin regions (ACRs) and histone modification marks (HMMs) in pancreatic cell lines and purified pancreatic acinar and duct cells. Integration with whole-genome mutation calls from 506 PDACs revealed 314 ACRs/HMMs significantly enriched with 3,614 noncoding somatic mutations (NCSMs). Functional assessment using massively parallel reporter assays (MPRA) identified 178 NCSMs impacting reporter activity (19.45% of those tested). Focused luciferase validation confirmed negative effects on gene regulatory activity for NCSMs near CDKN2A and ZFP36L2. For the latter, CRISPR interference (CRISPRi) further identified ZFP36L2 as a target gene (16.0 – 24.0% reduced expression, P = 0.023-0.0047) with disrupted KLF9 binding likely mediating the effect. Our integrative approach provides a catalog of potentially functional noncoding driver mutations and nominates ZFP36L2 as a PDAC driver gene.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The work was supported by the Intramural Research Program (IRP) of the Division of Cancer Epidemiology and Genetics (DCEG), National Cancer Institute (NCI), US National Institutes of Health (NIH).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
https://docs.icgc-argo.org https://ega-archive.org/studies/EGAS00001002543
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
* These authors jointly led this work.
The manuscript PDF file now shows text and figures at the same size. No changes were made to the text or any other content.
Data availability
High-resolution promoter-focused Capture C data generated in tumor-derived pancreatic cell lines (PANC-1, MIA PaCa-2), together with ATAC-Seq and ChIP-Seq data from tumor-derived pancreatic cell lines (PANC-1, MIA PaCa-2, COLO357, KP-4, PaTu8988t and SU.86.86) and normal-derived pancreatic cell lines (HPDE-E6E7 and hTert-HPNE), have been deposited in SRA under accession number PRJNA1041452. The raw ATAC-Seq and ChIP-Seq sequencing data from purified pancreatic acinar and ductal cell populations were obtained from NCBI’s Gene Expression Omnibus (GSE79468). H3K27ac HiChIP-seq data from pancreatic acinar and duct cell populations are available through GEO (GSE245484). Raw whole genome and transcriptome sequence data for ICGC-PACA samples were obtained from the International Cancer Genome Consortium (ICGC) through controlled access (https://docs.icgc-argo.org). The whole genome somatic mutation calls and gene expression counts of ICGC-PACA samples were obtained from the ICGC data portal (https://dcc.icgc.org/) in July 2019; these data are now hosted at https://object.genomeinformatics.org (Bucket Name: icgc25k-open, a publicly available Object Storage Bucket, which uses the AWS S3 interface and is accessible using any S3 compatible object storage client). Raw whole genome and transcriptome sequence data, somatic mutation calls and gene expression counts from PanCuRx (European Genome-phenome Archive accession number EGAS00001002543) were kindly provided by the Ontario Institute for Cancer Research, Ontario, Canada (via controlled access).
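Because the open-access bucket is described as reachable with any S3-compatible client, a minimal sketch of an anonymous listing request is given here for readers who wish to browse it. Only the endpoint and bucket name come from the statement above. The use of path-style addressing, the availability of unsigned ListObjectsV2 requests on this server, and the helper names build_list_url, parse_listing and list_objects are assumptions made for illustration, and the object layout inside the bucket is not described here.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ENDPOINT = "https://object.genomeinformatics.org"  # from the data availability statement
BUCKET = "icgc25k-open"                            # from the data availability statement


def build_list_url(endpoint, bucket, prefix="", max_keys=20):
    """Build a path-style S3 ListObjectsV2 URL (path-style addressing is an assumption)."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"{endpoint.rstrip('/')}/{bucket}?{query}"


def parse_listing(xml_text):
    """Return (key, size_in_bytes) pairs from a ListObjectsV2 XML response."""
    root = ET.fromstring(xml_text)
    entries = []
    for element in root.iter():
        # Compare local tag names so parsing works with or without an XML namespace.
        if element.tag.rsplit("}", 1)[-1] != "Contents":
            continue
        fields = {child.tag.rsplit("}", 1)[-1]: child.text for child in element}
        entries.append((fields.get("Key"), int(fields.get("Size") or 0)))
    return entries


def list_objects(prefix="", max_keys=20):
    """Fetch one page of the bucket listing without credentials (requires network access)."""
    url = build_list_url(ENDPOINT, BUCKET, prefix, max_keys)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_listing(response.read().decode("utf-8"))
```

Calling list_objects() would return the first page of object keys and sizes if the server permits unsigned listing; a dedicated S3 client configured with the same endpoint and bucket is an equally valid route, and the controlled-access datasets named above are not reachable this way.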