ABSTRACT
Background Copy-number variations (CNVs) have been associated with rare and debilitating genomic syndromes, but their impact on health later in life in the general population remains poorly described.
Methods Assessing four modes of CNV action, we performed genome-wide association scans (GWASs) between the copy-number of CNV-proxy probes and 60 curated ICD-10 based clinical diagnoses in 331,522 unrelated white UK Biobank participants with replication in the Estonian Biobank.
Results We identified 73 signals involving 40 diseases, all of which indicated that CNVs increased disease risk and caused earlier onset. Even after correcting for these signals, a higher CNV burden increased risk for 18 disorders, mainly through the number of deleted genes, suggesting a polygenic CNV architecture. The number and identity of genes disturbed by CNVs affected their pathogenicity, with many associations being supported by colocalization with both common and rare single-nucleotide variant association signals. Dissection of association signals provided insights into the epidemiology of known gene-disease pairs (e.g., deletions in BRCA1 and LDLR increased risk for ovarian cancer and ischemic heart disease, respectively), clarified dosage mechanisms of action (e.g., both increased and decreased dosage of 17q12 impact renal health), and identified putative causal genes (e.g., ABCC6 for kidney stones). Characterization of the pleiotropic pathological consequences of recurrent CNVs at 15q13, 16p13.11, 16p12.2, and 22q11.2 in adulthood indicated variable expressivity of these regions and the involvement of multiple genes.
Conclusions Our results shed light on the prominent role of CNVs in determining common disease susceptibility within the general population and provide actionable insights for anticipating later-onset comorbidities in carriers of recurrent CNVs.
Competing Interest Statement
Sven Ojavee is an employee of MSD at the time of submission; the contribution to this research was made while affiliated with the University of Lausanne.
Funding Statement
The study was funded by the Swiss National Science Foundation (31003A_182632, Alexandre Reymond; 310030_189147, Zoltán Kutalik), Horizon 2020 Twinning projects (ePerMed 692145, Alexandre Reymond), the Estonian Research Council (PRG687, Maarja Jõeloo and Reedik Mägi), and the Department of Computational Biology (Zoltán Kutalik) and the Center for Integrative Genomics (Alexandre Reymond) of the University of Lausanne.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
UK Biobank (UKBB) data were accessed through application #16389 and all participants signed a broad informed consent form. Estonian Biobank (EstBB) activities are regulated by the Human Genes Research Act, which was adopted in 2000 specifically for the operations of the EstBB. Analyses were carried out under ethical approval 1.1-12/624 from the Estonian Committee on Bioethics and Human Research (Estonian Ministry of Social Affairs), using data according to release application 3-10/GI/34668 [20/12/2022]. All participants signed a broad informed consent form.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data used in this study are publicly available, as described in the Methods. CNV-GWAS summary statistics (UKBB) will be deposited in the GWAS Catalog upon publication and are available upon request until then.