Genome‐wide analysis identifies gallstone‐susceptibility loci including genes regulating gastrointestinal motility

Abstract
Background and Aims: Genome‐wide association studies (GWAS) have identified several risk loci for gallstone disease. As with most polygenic traits, it is likely that many genetic determinants are undiscovered. The aim of this study was to identify genetic variants that represent new targets for gallstone research and treatment.
Approach and Results: We performed a GWAS of 28,627 gallstone cases and 348,373 controls in the UK Biobank, replicated findings in a Scottish cohort (1089 cases, 5228 controls), and conducted a GWA meta‐analysis (43,639 cases, 506,798 controls) with the FinnGen cohort. We assessed pathway enrichment using gene‐based then gene‐set analysis and tissue expression of identified genes in Genotype‐Tissue Expression project data. We constructed a polygenic risk score (PRS) and evaluated phenotypic traits associated with the score. Seventy‐five risk loci were identified (p < 5 × 10−8), of which 46 were new. Pathway enrichment revealed associations with lipid homeostasis, glucuronidation, phospholipid metabolism, and gastrointestinal motility. Anoctamin 1 (ANO1) and transmembrane protein 147 (TMEM147), both in novel, replicated loci, are expressed in the gallbladder and gastrointestinal tract. Both regulate gastrointestinal motility. The gallstone risk allele rs7599‐A leads to suppression of hepatic TMEM147 expression, suggesting that the protein protects against gallstone formation. The highest decile of the PRS demonstrated a 6‐fold increased odds of gallstones compared with the lowest decile. The PRS was strongly associated with increased body mass index, serum liver enzymes, and C‐reactive protein concentrations, and decreased lipoprotein cholesterol concentrations.
Conclusions: This GWAS demonstrates the polygenic nature of gallstone risk and identifies 46 novel susceptibility loci. We implicate genes influencing gastrointestinal motility in the pathogenesis of gallstones.


INTRODUCTION
Gallstone disease is one of the most common reasons for acute presentation to hospitals worldwide [1] and generates the greatest health care expenditure of any gastrointestinal condition in the United States. [2] Prevalence of gallstones reaches 20%-40% in high-risk populations, [3,4] and gallstones are highly heritable. [5] Identification of novel pathways influencing gallstone development may improve risk stratification in high-risk populations and support development of medications targeting gallstone formation and other cholestatic disease for use in primary and secondary prevention.
A total of seven genome-wide association studies (GWAS) have been published, revealing 29 gallstone-promoting loci. [6-12] Most gallstones are composed of cholesterol, which crystallizes in bile at high concentration and/or in bile lacking protective bile acids. [13] Many of the identified variants relate to cholesterol and bile acid metabolism. Despite evidence of impaired gallbladder and gastrointestinal motility contributing to the formation of gallstones, there have been no gallstone-susceptibility variants identified that influence these characteristics. [14] Using larger and more richly phenotyped cohorts may identify such variants and contribute to the knowledge of gallstone formation.
The most recent GWAS and GWAS meta-analysis used data from the Global Biobank Engine (GBE) [15] or the UK Biobank (UKB) data. [16] Both analyses relied on three-level International Classification of Diseases, 10th Revision (ICD-10) and International Classification of Diseases, Ninth Revision (ICD-9) codes and considered gallstones to be code "K80" ("cholelithiasis") or self-reported gallstones at study enrolment. The UKB captures several other informative variables including OPCS codes, other ICD-9 and 10 codes, self-reported procedures (validated in medical records) relating to gallstones, and primary care data. By incorporating these additional data, a greater proportion of patients with gallstone disease can be classified as cases rather than misclassified as controls.
The aim of this study was to perform an updated GWAS using more richly phenotyped participants to identify pathways contributing to gallstone formation. The genetic associations identified were replicated in the Generation Scotland: Scottish Family Health Study (GS-SFHS) cohort [17] and meta-analyzed with FinnGen summary association statistics. [18] A polygenic risk score (PRS) was derived, and associations of the PRS with intermediate biomarkers and anthropometric traits were assessed.

MATERIALS AND METHODS
Methods for identification of participants, determination of gallstone status, and genotyping for the UKB, GS-SFHS, and the FinnGen cohort are provided in Supporting Materials 1. Diagnostic codes included ICD-9, ICD-10, OPCS3, OPCS4, Read codes (primary care), and UKB self-reported codes. Self-reported codes were validated retrospectively in medical notes. All individuals with diagnostic codes documenting gallstones and individuals who underwent procedures for treatment of gallstones were considered as cases. Individuals without a code specifically for gallstones who required cholecystectomy were also considered as cases unless an alternative pathology (e.g., gallbladder polyp) was also documented, in which case the individual was excluded. Controls were individuals with no gallstone-related diagnostic or treatment code. Relevant diagnostic codes are provided in Tables S1-S3.
The UKB received ethical approval (Research Ethics Committee [REC] reference number: 11/NW/0382). UKB data access was approved under projects 30439 (phenotype data) and 19655 (genotype data). Ethical approval for GS-SFHS was granted by NHS Tayside REC (reference number: 05/S1401/89). The FinnGen study protocol was approved by the Ethical Review Board of the Hospital District of Helsinki and Uusimaa (Nr HUS/990/2017).

Initial genome-wide analysis
Genotyped and imputed single nucleotide polymorphisms (SNPs) were analyzed. Participants of self-reported European ancestry were included, and outliers for heterozygosity and unexpected runs of homozygosity were excluded. One randomly selected participant from each pair of related individuals in the UKB (kinship > 0.0884 [19]) was excluded.
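The relatedness filter described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name, the input layout (tuples of two participant IDs and a kinship coefficient), and the rule of skipping a pair when one member has already been removed are all assumptions made for the example. Only the 0.0884 kinship threshold comes from the text.

```python
import random

def exclude_one_per_related_pair(pairs, threshold=0.0884, seed=0):
    """Return the set of IDs to exclude so that no related pair is kept whole.

    pairs: iterable of (id_a, id_b, kinship) tuples (hypothetical layout).
    A pair counts as related when kinship > threshold.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    excluded = set()
    for id_a, id_b, kinship in pairs:
        if kinship <= threshold:
            continue  # not related at this threshold
        if id_a in excluded or id_b in excluded:
            continue  # assumption: pair already broken by an earlier exclusion
        excluded.add(rng.choice([id_a, id_b]))
    return excluded

example_pairs = [("p1", "p2", 0.25), ("p2", "p3", 0.10), ("p4", "p5", 0.01)]
to_drop = exclude_one_per_related_pair(example_pairs)
```

In this sketch, unrelated pairs such as ("p4", "p5") are never touched, and every related pair loses at least one member.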
Association with gallstones was analyzed using logistic regression adjusted for age, sex, the first 20 genetic principal components, and batch, with batch included as a random effect: gallstones ∼ age + sex + (1|batch) + pc1 + pc2 + ... + pc20 + genotype. Imputation dosage was used for imputed SNPs.
SNPs were excluded based on minor allele frequency (MAF < 0.001), imputation quality (information quality metric for imputation of genotype [INFO] < 0.3), and departure from Hardy-Weinberg equilibrium (p < 5 × 10−6). The MAF threshold was chosen to ensure 25 or more minor alleles in the cases (MAF = 25/(2 × n_cases)), which corresponded to a MAF of 0.0004; this was rounded up to 0.001.
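The MAF arithmetic and the three exclusion rules can be checked with a short sketch. The function names and argument layout are hypothetical; the thresholds (MAF 0.001, INFO 0.3, Hardy-Weinberg p of 5 × 10−6) and the case count (28,627) are taken from the text.

```python
def min_maf_for_allele_count(min_minor_alleles, n_cases):
    """MAF at which n_cases diploid cases are expected to carry
    min_minor_alleles copies of the minor allele: k / (2 * n_cases)."""
    return min_minor_alleles / (2 * n_cases)

def passes_qc(maf, info, hwe_p, maf_min=0.001, info_min=0.3, hwe_min=5e-6):
    """Keep a SNP only if it is not excluded by any of the three filters."""
    return maf >= maf_min and info >= info_min and hwe_p >= hwe_min

# 25 / (2 * 28,627) is roughly 0.0004, below the 0.001 cut-off actually used.
maf_floor = min_maf_for_allele_count(25, 28_627)
print(f"{maf_floor:.4f}")
```

Because the applied cut-off (0.001) is higher than the computed floor, every retained SNP is expected to have well over 25 minor alleles among the cases.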
Genome-wide significance was determined as p < 5 × 10−8 in the UKB. In the replication cohorts, significance was determined using a Bonferroni-corrected threshold of Pc = 0.05 (equivalent to 0.05 divided by the number of loci tested). Replication was undertaken by analyzing the lead SNP within each locus or a linkage disequilibrium (LD) proxy.

Genome-wide association meta-analysis
A genome-wide association meta-analysis was undertaken using UKB data and FinnGen data in an inverse variance-weighted fixed effects model. SNPs shared between the data sets (10,044,759) were analyzed. RsID was converted to the format "rs1234:A:T" to avoid ambiguity at multi-allelic SNPs.
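For a single variant, an inverse variance-weighted fixed-effects estimate can be sketched as below. This illustrates the standard formula only and is not the code or configuration used in the study; the function names and the example effect sizes are hypothetical. The second helper shows the "rs1234:A:T" identifier format described above.

```python
import math

def ivw_fixed_effects(betas, ses):
    """Combine per-cohort effect estimates with weights w_i = 1 / se_i^2.

    Returns the pooled beta, its standard error, and a two-sided p value
    from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, p

def variant_key(rsid, allele1, allele2):
    """Build an allele-aware identifier so multi-allelic SNPs stay distinct."""
    return f"{rsid}:{allele1}:{allele2}"

# Hypothetical log-odds ratios and standard errors from two cohorts.
pooled_beta, pooled_se, pooled_p = ivw_fixed_effects([0.10, 0.14], [0.02, 0.04])
key = variant_key("rs1234", "A", "T")
```

The more precise cohort (smaller standard error) dominates the pooled estimate, which is why the pooled beta in this example lies closer to 0.10 than to 0.14.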

LD clumping and conditional analyses
LD clumping to select independent (non-correlated) SNPs was performed using the Functional Mapping and Annotation of GWAS (FUMA) web application version 1.3.5d. [20] Loci were established for each lead SNP with a minimum distance of 250 Kb between loci and using an r2 < 0.25 to indicate separate independent SNPs within the same locus. Details of the FUMA parameters are available in Supporting Materials 2A. Five loci with rare lead SNPs not present in the 1000 Genomes Phase 3 reference panel (MAF in the UKB approximately 0.001 to 0.01) were generated manually by creating a 500-Kb window centered on the lead SNP.
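The manual locus definition for the five rare lead SNPs amounts to selecting every SNP within 250 Kb either side of the lead SNP. A minimal sketch follows, assuming each SNP is held as a dictionary with hypothetical "chrom" and "pos" keys; it does not reproduce FUMA's LD-based clumping.

```python
def snps_in_window(snps, lead_chrom, lead_pos, half_window_bp=250_000):
    """Return SNPs on the lead SNP's chromosome within +/- half_window_bp,
    i.e. a 500-Kb window centered on the lead SNP."""
    return [
        snp for snp in snps
        if snp["chrom"] == lead_chrom
        and abs(snp["pos"] - lead_pos) <= half_window_bp
    ]

# Hypothetical positions, for illustration only.
example_snps = [
    {"id": "snp_a", "chrom": 2, "pos": 1_000_000},
    {"id": "snp_b", "chrom": 2, "pos": 1_200_000},
    {"id": "snp_c", "chrom": 2, "pos": 1_300_000},
    {"id": "snp_d", "chrom": 7, "pos": 1_000_000},
]
locus = snps_in_window(example_snps, lead_chrom=2, lead_pos=1_000_000)
```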
Each locus was re-analyzed, while conditioning on the lead SNP and further signals with genome-wide significance were identified. This process was repeated with additional lead SNPs until no remaining SNPs reached genome-wide significance.

Sensitivity analyses
Sensitivity analyses, assessment of population stratification, and assessment for gene-metabolite interaction are provided in Supporting Materials 1.

Polygenic risk score
A PRS was derived using a random 80%/20% split of the UKB cohort. A separate GWAS was conducted in the 80% split, and the summary association statistics were used to derive the PRS, which was validated in the 20% split using PRSice. [21] This was conducted as an additive model with default PRSice parameters (independent variants selected when at least 250 Kb apart with an r2 < 0.1) on directly genotyped SNPs. Associations of the PRS with serum biochemistry values were assessed in an age-adjusted and sex-adjusted linear regression model. Total cholesterol (TC), HDL cholesterol, LDL cholesterol, triglycerides, apolipoprotein A and B (ApoA and ApoB), lipoprotein A (LipoA), alkaline phosphatase (ALP), alanine aminotransferase, aspartate aminotransferase, gamma-glutamyl transferase, serum bilirubin (conjugated, unconjugated, and total), hemoglobin, red blood cells, reticulocyte percentage, glucose, glycated hemoglobin A1c (HbA1c), C-reactive protein (CRP), and cystatin C were assessed. Associations with body mass index (BMI), hip circumference, and waist circumference were also assessed. Each trait was assessed visually via histogram and log-transformed in the case of skewed distribution. Separately, to identify locus-specific effects on intermediate traits, each lithogenic allele from the UKB GWAS was investigated for relationship with each of these traits in an age-adjusted and sex-adjusted linear regression.
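An additive PRS is, in essence, a weighted sum of risk-allele dosages, which can then be ranked into deciles. The sketch below illustrates that idea only; it is not PRSice's implementation, and the function names, the dictionary layout, and the choice to treat a missing genotype as a dosage of 0 are assumptions made for the example.

```python
def polygenic_score(dosages, weights):
    """Additive score: sum over SNPs of (GWAS effect size x allele dosage).

    dosages: {snp_id: dosage in [0, 2]} for one individual.
    weights: {snp_id: effect size (log odds ratio) from the discovery GWAS}.
    Assumption: a SNP missing from `dosages` contributes 0.
    """
    return sum(beta * dosages.get(snp, 0.0) for snp, beta in weights.items())

def assign_deciles(scores):
    """Rank-based deciles (1 = lowest scores, 10 = highest scores)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    deciles = [0] * n
    for rank, idx in enumerate(order):
        deciles[idx] = rank * 10 // n + 1
    return deciles

# Hypothetical weights and genotypes, for illustration only.
weights = {"snp_a": 0.10, "snp_b": -0.20}
score = polygenic_score({"snp_a": 2, "snp_b": 1}, weights)
```

Deriving the weights in the 80% split and computing scores only in the 20% split, as described above, keeps the validation sample independent of the discovery GWAS.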

Pathway enrichment
We conducted a gene-based analysis in MAGMA to identify genes and gene sets showing significant association with gallstones. Methods for gene-based and gene-set analysis have been described previously [22] ; full details of the methods are given in Supporting Materials 1.

Plotting and statistical analysis of results
Genomic analyses were performed using SNPtest (version 2.5.4 for gene-metabolite interaction model; version 2.5.2 for all other analyses) and QCTOOLS (version 2.0.6) [23] within a Linux high-performance compute cluster. GWA meta-analysis was undertaken in METAL. [24] Post-GWAS analysis, regression analyses, and plotting were performed using R version 3.6.3. [25] Methods and results are reported in accordance with the STREGA guidelines [26] (Supporting Materials 3).

RESULTS

Gallstones cohort
A total of 502,616 participants entered the UKB study, of whom 377,000 were taken forward for GWAS. A history of gallstones was present in 28,627 with 348,373 controls. The median follow-up available was 9.0 years (range 7.3-11.9). Baseline demographics of the UKB cohort are given in Table S4. A total of 24,096 participants entered GS-SFHS, of whom 6317 unrelated individuals were taken forward for GWAS. A history of gallstones was present in 1089 with 5228 controls. The mean follow-up available was 10.7 years (range 9.6-14.7) (Figure 1).

Initial UKB analysis
A total of 4065 SNPs were identified with genome-wide significant p values after removal of low-quality SNPs (see Supporting Materials 4). An additional 11,539 SNPs were significant when specifying a less stringent threshold for significance (p < 5 × 10−5). LD clumping revealed 50 loci with at least one significant signal (see Supporting Materials 2B). Five additional loci contained variants with a significant association with gallstones for which the lead SNP was not present in the 1000 Genomes reference panel (MAF 0.001 to 0.01) and were therefore not annotated by the FUMA application. LD clumping for those loci was performed using boundaries of 250 Kb from the lead SNP.
In the UKB, 27 of 55 loci are newly identified variants, 7 of 55 are known from previous GWAS using other data sets, and 21 of 55 variants were previously identified only in studies using UKB data. [10,12] Of the new variants, 11 of 27 were replicated in FinnGen, 10 of 27 were not replicated on analysis in either FinnGen or GS-SFHS, and 5 of 27 rare variants and 1 of 27 indel did not have a suitable LD proxy available in either FinnGen or GS-SFHS.
The D19H variant of ATP-binding cassette subfamily G member 8 (ABCG8; rs11887534) demonstrated a highly significant result (p < 4.94 × 10−324). Twenty-three adjacent variants demonstrated identical p values and almost identical β coefficients. As rs11887534 is a known protein-altering variant (D19H) and previously identified as the lead SNP by other GWAS, this was considered as the lead SNP.

GWA meta-analysis
GWA meta-analysis with FinnGen data revealed a total of 63 loci associated with gallstone disease (p < 5 × 10−8). Of these, 43 of 63 were identified in the UKB GWAS, 1 of 63 was previously identified with UKB data, [10] and 19 of 63 were novel loci identified through the meta-analysis.
The total number of novel gallstone-susceptibility loci identified by either the UKB GWAS or the meta-analysis is 46, and the total number of gallstone-susceptibility loci is 75. Of the novel loci, 11 of 46 were replicated, 19 of 46 were identified through meta-analysis but require external replication, and 16 of 46 were identified only in the UKB and also require replication. Of the 29 previously reported loci, all 29 were identified in the meta-analysis. All loci are provided in Table 1. RsIDs, allele frequency, imputation quality, and chromosome position for all three cohorts and the meta-analysis are provided in Supporting Materials 4.
Manhattan plots for the UKB GWAS and the meta-analysis are shown in Figure S1 and Figure 2.

Conditional analyses
Conditional analyses, adjusting for the lead SNP, were undertaken for the 55 loci identified in the UKB GWAS.
After conditional analyses, a total of six loci were found to have additional independent signals, suggesting that more than one variant within these loci has a causal relationship with gallstones. The remaining 49 loci did not have any additional SNPs reaching genome-wide significance (p < 5 × 10−8).
The ABCG8 locus (rs11887534) contained 10 variants with independent and significant association with gallstones within ABCG8, ABCG5, and dynein cytoplasmic 2 light intermediate chain 1 (DYNC2LI1). The transmembrane 4 L six family member 4 (TM4SF4) locus (rs4681515) contained two variants; the ABCB4 locus (rs7802555) contained three variants; the solute carrier family 10 member 2 (SLC10A2) locus (rs144846334) contained three variants; the forkhead box A3 (FOXA3) locus (rs34255979) contained two variants; and the trinucleotide repeat containing adaptor 6B (TNRC6B) locus (rs11089985) contained two variants with independent and significant association with gallstones. Full results of the conditional analyses are provided in Supporting Materials 4. This study did not detect a second independent signal previously identified at the locus containing tetratricopeptide repeat domain 39B (TTC39B), [9] although a more stringent genome-wide significance threshold was used in this study (p = 5 × 10−8 versus p = 6.49 × 10−4).
Table footnote a: APOE has previously been analyzed in candidate gene studies and a meta-analysis, which found no significant association (PMID 31200656).

Population stratification
After adjusting for the first 20 genetic principal components, there was evidence of test statistic inflation (λgc = 1.103). Test statistic inflation is seen in traits caused by multiple genes (polygenicity) but can also be a feature of unmeasured population substructure, in which genetically similar individuals also share similar environmental exposures. [27] LD score regression estimates the contribution of polygenicity to test statistic inflation. The LD score regression intercept was 1.042, and the proportion of test statistic inflation ascribed to causes other than polygenicity was estimated to be 10.9%-17.2%, confirming that polygenicity is the main driver of test statistic inflation. The value of λgc was similar to the GBE GWAS. [12] The quantile-quantile plot for the UKB is shown in Figure S2.
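The genomic inflation factor is conventionally computed as the median observed association χ² statistic divided by the median of a χ² distribution with 1 degree of freedom (approximately 0.455). The sketch below applies that definition to a list of per-SNP z scores; the function name and the example values are hypothetical, and this is not the code used to obtain the reported λgc of 1.103.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(z_scores):
    """lambda_gc = median(z^2) / median of chi-square(1 df)."""
    chi2_stats = [z * z for z in z_scores]
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN

# Hypothetical z scores; a value above 1 indicates inflated test statistics.
lambda_gc = genomic_inflation([0.2, -0.5, 0.7, -0.9, 1.4, 2.1, -0.3])
```

A value close to 1 indicates test statistics consistent with the null distribution, whereas values above 1 reflect inflation from polygenicity, population substructure, or both.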

Sensitivity analysis
A sensitivity analysis was conducted in the UKB for each of the 55 lead SNPs using 23,422 individuals undergoing surgery for gallstones, with 348,373 controls with no record of gallstones. Forty-four loci retained significance at the genome-wide threshold. Ten retained a significant association when using the higher threshold for significance (p < 5 × 10 −5 ; alpha 1-3-N-acetyl-galactos-aminyltransferase and alpha

Gene-metabolite interaction
Gene-metabolite interaction of SNPs with serum unconjugated bilirubin was undertaken at 54 of the lead

PRS
The PRS was significantly associated with gallstone disease in the validation cohort (maximum variance explained on the liability scale R2 = 2.5%, p = 2.62 × 10−181, when using a GWAS p-value threshold of 0.0001). There was a 6-fold increase in the odds of gallstones in those with the highest decile of genetic risk compared with those in the lowest (OR 5.97, 95% CI 5.60-6.36) and a 2.77-fold increased odds for those in the top decile compared with the remainder of the cohort (OR 2.77, 95% CI 2.69-2.86). There was an approximately linear increase in the odds of gallstones with each increasing PRS decile (Figure 3). The PRS was strongly associated with increased BMI, waist circumference, hip circumference, liver enzymes (other than ALP), cystatin C, and CRP (greater genetically predicted odds of gallstones resulted in higher values for those traits). The PRS was associated with reductions in cholesterol, HDL, LDL, ApoB, and ALP. Generally, for all traits with which the PRS was significantly associated, there was an approximately linear increase (or decrease) with each increase in PRS decile and no apparent threshold effect. There were no obvious differences in levels of triglycerides, ApoA, LipoA, blood count markers, HbA1c, or glucose (Figures S4-S6).
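To illustrate how a decile comparison such as "highest versus lowest decile" yields an odds ratio with a 95% confidence interval, the sketch below uses the unadjusted 2 × 2 table formula with a Wald interval on the log scale. The counts are invented for illustration, the function name is hypothetical, and the reported estimates (for example OR 5.97) may have come from a regression model rather than this calculation.

```python
import math

def odds_ratio_ci(cases_top, controls_top, cases_ref, controls_ref, z=1.96):
    """Unadjusted odds ratio (top group vs reference group) with a Wald CI."""
    odds_ratio = (cases_top * controls_ref) / (controls_top * cases_ref)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / cases_top + 1 / controls_top + 1 / cases_ref + 1 / controls_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Invented counts: 20 cases / 80 controls in the top decile,
# 10 cases / 90 controls in the reference decile.
or_top, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```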
Analyses at each of the 55 lithogenic variants from the UKB were undertaken in all 377,000 individuals taken forward for GWAS. The lithogenic alleles were heterogeneous in their influence on serum lipids, with some lithogenic alleles promoting hyperlipidemia, while others demonstrated a protective effect. The lithogenic variants of ABCG8 and HNF4A (rs11887534-C and rs1800961-T, respectively) both resulted in reductions of cholesterol of 0.

Pathway enrichment
A total of 172 genes showed significant association with gallstones (p < 2.59 × 10−6; Supporting Materials 4) and were highly expressed in liver (p = 4.09 × 10−6; gallbladder expression was not available in MAGMA). Thirteen gene sets were identified as significantly associated with gallstones (Table 2). Many of these gene sets relate to glucuronidation, lipid metabolism, and phospholipid transport.
Literature search identified two genes governing gastrointestinal motility (anoctamin 1 [ANO1] and transmembrane protein 147 [TMEM147]). rs7599 lies within the 3′UTR region, and the lithogenic allele (A) is associated with significantly reduced expression of TMEM147 in the liver (p = 3.0 × 10−47; GTEx). Data on effects of rs56363382 and all other significant SNPs in the ANO1 locus were not available from GTEx. ANO1 and TMEM147 are highly expressed in gallbladder smooth muscle, fibroblasts, and glandular cells, with 113.7 and 71.4 transcripts per million protein coding genes (Human Protein Atlas). Five identified genes were involved in primary cilia function: DYNC2LI1, TBC1D32, ADAMTS20, polycystin 2 like 1 (PKD2L1), and proteome of centriole protein 1B (POC1B).

DISCUSSION
We report the largest and most detailed GWAS of gallstones conducted worldwide, with functional annotation of variants and gene-set analysis. We analyzed 377,000 individuals (28,627 cases and 348,373 controls) from the UKB and replicated findings in a Scottish cohort (GS-SFHS) and a Finnish cohort (FinnGen).

FIGURE 3 Odds ratio plot demonstrating odds of gallstones by decile of the polygenic risk score. Each decile of the risk score is compared with the lowest risk group (decile 1).

Our analysis identified 46 new lithogenic loci and replicated all 29 previously identified loci, bringing the total number of identified gallstone-susceptibility loci to 75. Through pathway enrichment we identified 13 gene sets that are strongly associated with gallstones. Through FUMA annotation we identified two pathways that may influence gallstone development: gastrointestinal motility and ciliogenesis. Our results highlight important pathways that may be targeted by novel therapies and form the basis for further research into gallstones and other cholestatic disease.
Two newly identified and replicated loci are highly expressed in the gallbladder and govern gastrointestinal motility. ANO1 (also known as TMEM16A) is a calcium-activated chloride channel that is widely expressed in the gastrointestinal tract, with roles in automaticity and contraction of the smooth muscle cells via the interstitial cells of Cajal. [28] Administration of ANO1 inhibitors (TMinh-23) in murine models impedes gastric emptying with improved oral but not parenteral glucose tolerance, supporting the role of ANO1 as a gastric motility promoter. [28,29] ANO1 is also expressed in the biliary epithelium, where it has another role in secretion of fluid from the apical cholangiocyte membrane in response to bile acids, suggesting that it may promote increased bile flow. [30] ANO1 may influence gallstone formation via gallbladder contractility and biliary flow. Given the possible impact on gallbladder motility and biliary composition, we suggest that any trials of TMinh-23 assess for development of gallstones as an important safety consideration. TMEM147 is known to inhibit transfer of M3 muscarinic receptors to the cell surface. [31] M3 receptors are highly expressed in smooth muscle cells of the gallbladder and may influence gallbladder contractility and biliary stasis. [32] The variants may also influence gallstone formation through increased colonic transit time, which is also associated with gallstones and believed to alter enterohepatic bile acid circulation. [14]
Five loci relate to function of primary cilia. TBC1D32 controls ciliogenesis within the neural tube and interacts with the sonic hedgehog protein, which has wider roles in organogenesis and axonal guidance. [33] The metalloproteinase encoded by ADAMTS20 also plays a pleiotropic role in ciliogenesis and hedgehog signaling, with mutant variants causing ciliopathies. [34] POC1B is essential for formation of the centriole and ciliogenesis during cell division. [35] PKD2L1 is involved in regulating calcium concentration within the primary cilia, [36] and disorders of calcium signaling of the primary cilia are thought to result in cholestasis. [37] Three independent signals within the ABCG8 locus were intronic or 3′-UTR variants of DYNC2LI1. Although these independent signals may represent LD with additional causal variants within ABCG8, DYNC2LI1 has a role in ciliogenesis with mutations causing severe ciliopathies. [38] It is possible that both ABCG8 and DYNC2LI1 play separate roles in gallstone formation from within the same locus. Variants within the sonic hedgehog gene and DYNC2LI1 have previously been associated with gallbladder cancer and gallstone formation, [39] whereas TBC1D32 and ADAMTS20 (novel loci, not replicated) and PKD2L1 and POC1B (identified in the meta-analysis) are novel loci. We suggest that abnormalities in cholangiocyte ciliogenesis and ciliary function may contribute to biliary stasis with gallstone formation. Abnormal expression or function of cholangiocyte cilia causes other cholestatic disease and malignancies. [40]
TABLE 2 MAGMA pathway analysis: significant gene sets
MAGMA analysis identified 13 gene sets. Most were related to lipid homeostasis and glucuronidation. Gene sets for Waldenström's macroglobulinemia and multidrug-resistance pathways were also enriched, driven by ABCB4 and ABCB1. ABCB4 encodes a phosphatidylcholine flippase, which effluxes phosphatidylcholine into bile. [41] Phosphatidylcholine acts as a solvent for biliary cholesterol. ABCB4 mutations are a cause of early-onset gallstones and low-phospholipid-associated cholelithiasis. [42] ABCB1 is a neighboring gene involved in phospholipid and drug excretion with broad specificity. [41] The pathway enrichments may be a feature of pleiotropy, as these genes have several distinct functions.
However, serum IgG and IgM extracted from patients with Waldenström's macroglobulinemia result in nucleation of biliary cholesterol, possibly representing an alternate mechanism directly linking gallstones with Waldenström's macroglobulinemia. [43]
Based on our results, we suggest the following priorities for further research: (1) direct genotyping at all 75 lead SNPs in individuals with corresponding cholecystectomy specimens for biochemical analysis of biliary content, gallbladder epithelial protein expression, and cholangiocyte protein expression; (2) assessment of gallbladder emptying with hepatobiliary scintigraphy in patients genotyped at ANO1 and TMEM147, to assess the influence of the lithogenic variants on gallbladder motility; (3) assessment of gallstone risk in murine models with pharmacological modification of gallbladder motility (TMinh-23) or ciliotherapy; and (4) validation of our findings in a large cohort in which radiological evaluation of gallstone disease has been undertaken.
The strengths of this study include a more complete definition of gallstones than in previous studies, which reduces the rate of case misclassification. Notably, previous studies have identified up to 18,417 cases, whereas we identified 28,627 from the UKB. We use three large cohorts with replication of most novel findings. Many of the loci we identified have plausible relationships to gallstones, which we explore in pathway analysis, tissue expression analyses, and through assessing the impact of genetically elevated gallstone risk on serum biomarkers. The limitations of this study include misclassification of individuals with asymptomatic gallstones that have not been detected clinically. Relying on clinical disease as the definition of gallstone disease has been common to all earlier GWAS of gallstones, although our selection of relevant codes has improved our sensitivity relative to other studies. The phenotypic characterization of participants in the UKB is limited to less-invasive procedures such as venepuncture, and no data on biliary composition or pathological records of gallstone composition following cholecystectomy were available. This impedes the interpretation of cholesterol-influencing SNPs, as SNPs may lead to reduced serum cholesterol through biliary excretion, increased serum cholesterol through intestinal absorption, or a balanced combination of both with no net change in serum cholesterol. All three scenarios may lead to similar changes in biliary cholesterol yet demonstrate opposing effects on serum cholesterol. Our study has focused entirely on individuals of European ancestry, and our findings require replication in cohorts of other ethnicities. Although our findings and research implications may be applicable to other ethnicities, it is possible that some risk loci may interact with differing environmental exposures or genetic exposures in other populations. 
Finally, the definition of gallstones in the FinnGen cohort was not identical to the UKB, as we used the published summary association statistics from the FinnGen GWAS. However, most UKB cases would have met the definition used in FinnGen.
In summary, we have performed a GWAS of gallstone disease using UKB data, performed a meta-analysis with the FinnGen cohort, and identified 46 novel variants associated with gallstone disease. We identify the gallstone-susceptibility variants that may influence gastrointestinal motility, and we identify key priorities for research into gallstone formation and preventative treatments.

CONFLICT OF INTEREST
Nothing to report.