Somatically mutated genes in fatty liver disease have minimal influence on germline risk

Background: Understanding the genetics of liver disease has the potential to facilitate clinical risk stratification. We recently identified six genes and one lncRNA enriched for acquired somatic mutations in patients with NAFLD and alcohol-related liver disease. We hypothesised that germline variation in these genes would be associated with risk of liver disease development and contribute to prognostication. Methods: Genome-wide association study (GWAS) summary statistics were extracted from seven studies (>1.7 million participants) for variants near ACVR2A, ALB, CIDEB, FOXO1, GPAM, NEAT1 and TNRC6B for: aminotransferases, liver fat, HbA1c, diagnosis of NAFLD, ARLD, and cirrhosis. Findings were replicated using GWAS data from multiple independent cohorts. A phenome-wide association study was performed to examine for related metabolic traits, using both common and rare variants, including gene-burden testing. Results: There was no evidence of association between rare germline variants or SNPs near five genes (ACVR2A, ALB, CIDEB, FOXO1, and TNRC6B) and risk or severity of liver disease. Variants in GPAM were associated with liver fat (p=3.6x10-13), ALT (p=2.8x10-39), and serum lipid concentrations. Variants in NEAT1 demonstrated borderline significant associations with ALT (p=1.9x10-11) and HbA1c, but not with liver fat, as well as influencing waist-to-hip ratio, adjusted for BMI. Conclusions: Despite strong selective advantage to acquire somatic mutations at these loci, there was no evidence of an association between germline variation and markers of liver disease, except in GPAM. Polygenic risk scores based on germline variation alone will not capture prognostic data from genes affected by somatic mutations.


Introduction
Non-alcoholic fatty liver disease affects around 25% of the worldwide population and is emerging as the fastest growing form of liver disease in developed countries [1,2]. It encompasses a spectrum of disease from simple hepatic steatosis, through non-alcoholic steatohepatitis (NASH) to cirrhosis, with the attendant risks of liver failure and hepatocellular carcinoma (HCC). Given the number of patients at risk of development of NAFLD and subsequent health problems, there is an urgent need to stratify patients based upon long-term risks, so that even countries with wellresourced healthcare schemes can cope with the patient numbers predicted to develop end-stage liver disease over the next few decades.
At present, there are a number of clinically-deployed non-invasive tests, including elastography and serum biomarkers that can predict levels of hepatic fibrosis and long-term risk in NAFLD.
Significant interest has been generated in the potential of germline genetic variation in the pathogenesis and prognostication of NAFLD [3]. Single-nucleotide polymorphisms (SNPs) near several genes, including PNPLA3 [4], HSD17B13 [5], and TM6SF2 [6] have been shown to be associated with long-term risk of several disease-related outcomes in NAFLD [7], as well as in alcohol-related liver disease (ARLD) [8], through genome-wide association studies (GWAS).
These variants are common, but associated with small effect sizes, when studied in isolation. This has suggested the potential for combining multiple genotypes into a polygenic risk score: the analysis of several SNPs in patients at baseline as a predictive tool for long-term outcomes in both NAFLD [9] and ARLD [8].
Similarly, other studies have focussed on rare pathogenic germline variants, such as in APOB the gene that encodes apolipoprotein B, that lead to NAFLD with high penetrance [10]. Although these cases can inform us about underlying disease pathogenesis, their low frequency suggests that . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted October 19, 2021. ; https://doi.org/10.1101/2021. 10.18.21265159 doi: medRxiv preprint they are unlikely to be informative in the stratification of most patients within the general NAFLD population.
We have recently identified recurrent somatic mutations within the liver of patients with NAFLD and ARLD [11]. Recurrent non-synonymous mutations were seen in 6 coding genes: ACVR2A, ALB, CIDEB, FOXO1, GPAM, TNRC6B, plus the long non-coding (lnc)RNA NEAT1. These mutations were found in multiple hepatocyte clones throughout the liver, with evidence of convergent evolution: the identification of independent clones with mutations in the same genes, suggesting strong selection pressure to acquire these variants in the context of fatty liver disease.
Amongst these genes, 3 are involved in lipid metabolism suggesting a potential functional role in disease pathogenesis: CIDEB, mediates fusion and cargo transfer of cytoplasmic lipid droplets [12]; FOXO1, the main transcription factor downstream of insulin [13], but also a major regulator of lipid metabolism [14] and; GPAM, encoding GPAT1, the enzyme catalyzing the initial step in triglyceride synthesis [15]. Through functional validation of the hot-spot mutations in FOXO1, we identified that this impacted upon the response to insulin, glycolysis and lipid metabolism. Although promising for improving the mechanistic understanding of disease pathogenesis, the prognostic or therapeutic role that these acquired mutations could play in NAFLD and ARLD remains unknown.
As there was strong selective pressure to acquire these mutations, we were interested to explore whether common or rare germline variation near these recurrently mutated genes might be associated with liver-related outcomes in NAFLD. We hypothesised that germline variation providing weaker modulation of disease phenotypes would exist and might improve the performance of polygenic risk scores being developed for prognostication in NAFLD and ARLD.
6 Amongst these genes, germline coding variants GPAM have been associated with serum ALT levels in an exome-wide association study of the UK Biobank cohort [16]. Further study found that these variants were also associated with hepatic fat content and histological markers of liver damage in independent NAFLD cohorts. Other studies using GWAS have replicated these findings in further cohorts [17][18][19], where rs10787429 in GPAM was associated with elevated ALT in both NAFLD and ARLD. Here we use data from multiple GWAS to investigate whether similar associations are observed for the other 5 recurrently mutated genes and one recurrently mutated locus.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

Identification of genes enriched for somatic mutations in ARLD and NAFLD
As described in detail elsewhere [11], whole genome sequencing was performed on diseased, non-malignant hepatic tissue from patients with NAFLD and ARLD undergoing tumour resection or liver transplant. The dN/dScv method [20] was used to identify six coding genes with higher numbers of nonsynonymous mutations relative than expected.

Annotation of somatic mutants and genomic regions
After removal of duplicates, somatic mutations at ACVR2A, ALB, CIDEB, FOXO1, GPAM, and TNRC6B, plus one non-coding lncRNA NEAT1 [11] were annotated with predicted consequence, impact on transcript, and prevalence within the 1000 Genomes dataset [21] (n=2,504), Exome Sequencing Project [22] (n=6,503), and gnomAD [23] (n=141,456) using the Ensembl Variant Effect Predictor [24]. For the variants that had previously been identified in any of the above population genomic sequencing datasets, we searched Phenoscanner [25] and the Common Metabolic Disease Portal [26] for any evidence of association with metabolic traits. No data was available on Phenoscanner for these ultra-rare variants. The Common Metabolic Disease Portal search yielded 96 variant-trait associations therefore, rather than apply genome-wide association significance cut-off, the critical p-value for significance for this analysis only was p<5.2x10 -4 (i.e. 0.05 / 96).
In addition, we annotated each of the six genes with their expected and observed number of predicted loss of function (pLoF) and missense mutations from gnomAD [23]. In brief, the ratio between observed and expected pLoF mutants is an indicator of each gene's tolerance to haploinsufficiency. Such that a gene with a low observed / expected ratio (defined as the upper . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) In addition, for comparison, we extracted summary statistics for four well-validated genome-wide significance risk variants for fatty liver disease: rs738409C>G in PNPLA3 [4], rs58542926C>T in TM6SF2 [6], rs2642438A>G in MTARC1 [27], and rs72613567TA>T in HSD17B13 [5]. However, for some studies data on rs72613567TA>T was not available therefore we used rs13125522A>G as a proxy, which is in strong linkage disequilibrium (R 2 =0.97) with rs72613567 [28].
We also obtained summary statistics for diagnosis of NAFLD from Anstee et al. [30] and MRI liver fat from Liu et al. [31], which utilises UK BioBank data. These data were accessed through GWAS . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) Catalogue [32]. For replication of findings, we first obtained categorical data on liver-related diagnoses from the FinnGen study: NAFLD, ARLD, hepatocellular carcinoma and intrahepatic cholangiocarcinoma, and cirrhosis. For replication of observations for ALT, we obtained data from BioBank Japan [33] and Pazoki et al. [34]. For replication of findings for HbA1c, we obtained summary statistics on HbA1c from the MAGIC consortium (trans-ancestry meta-analysis by Chen et al. [35]) and BioBank Japan, plus diagnosis of type 2 diabetes mellitus (T2DM) in East Asian individuals from Spracklen et al. [36]. The total number of unique participants included from these GWAS summary statistics was 1,628,945.
All variants from the above regions were extracted, and coordinates from FinnGen were carried over from GRCh38 to GRCh37 using the Ensembl Assembly Converter. Manhattan plots were produced for each trait, illustrating only variants within the regions of interest. Significance was defined as p<5x10 -8 . Within regions that had variants with significant association, we used FIVEx to look for expression quantitative trait loci (eQTL), which extracts data from the European Bioinformatic Institute eQTL Catalogue [37].

Gene-based phenome-wide association study for common variants
We sought to explore associations between germline variation in the coding genes and lncRNA of interest and metabolic traits using a phenome-wide association study approach.
Phenoscanner [25] and the Common Metabolic Diseases Knowledge Portal [26] were searched for each of the six coding genes plus NEAT1. Rare variants (mean allele frequency (MAF) <0.01) were excluded and results were filtered for traits relevant to ARLD and NAFLD. Significance was defined as p<5x10 -8 . Data on eQTLs was obtained using the Qtlizer package for R [38] for all significant associations.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) We also searched for any metabolite-wide associations within the regions of interest using data from Lotta et al. [39] (https://omicscience.org/apps/crossplatform/); however, we did not identify any significant (p<4.9x10 -10 , as defined by the authors) associations.
Association between rare coding variants and traits related to NAFLD or ARLD We next investigated whether rare coding variants individually, or combined using gene-burden analyses, were associated with markers of liver disease or related metabolic traits. We used data from https://genebass.org/ [40] and https://azphewas.com/ [41], which derive data from the UK BioBank 300k Exomes. These analyses were not available for the lncRNA NEAT1. Extracted associations for: individual variants and gene-burden tests for predicted loss of function (pLoF), missense, and synonymous variants. Significance, as defined by the original studies, was p<2.5x10 -8 for GeneBass (using SKAT-O test) and p<2.0x10 -9 (-log10(8.7)) for AZPheWAS.

Analyses
Data were analysed using R 4.0.2 [42] and the code used in analyses is available from https://doi.org/10.5281/zenodo.4656979.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted October 19, 2021.

Results
We have recently identified recurrent somatic mutations in non-malignant liver tissue from individuals with ARLD and NAFLD through laser capture microdissection and whole genome sequencing [11]. Through this approach we identified six coding genes (ACVR2A, ALB, CIDEB, FOXO1, GPAM and TNRC6B) and one lncRNA (NEAT1) significantly enriched for acquired somatic variants. We hypothesised that given the selective advantage these variants must endow, rare pathogenic variants of these regions, associated with features of liver disease, might be identified.
Specific somatic mutations in recurrently mutated genes are not found in the germline We previously identified 129 unique variants across these six coding genes and one lncRNA (Supplementary Table 2), the majority of which were either missense (49/129, 38%) or noncoding exonic variants in NEAT1 (49/129, 38%), and predicted to have a moderate or high impact upon protein structure (61/129, 47%, Table 1). These variants are extremely rare in the germline, with 118/129 (92%) having never been identified across 150,463 individuals from gnomAD, 1000G, or the Exome Sequencing Project. The most common of these eleven previously reported variants was rs368997599 G>A (p.Arg45Trp) in CIDEB, which was identified in 10 individuals from gnomAD in the heterozygous state, 7 of whom are of Ashkenazi Jewish ancestry. Even in this genetic ancestry p.Arg45Trp in CIDEB remained an ultra-rare mutation with allele frequency = 7.0x10 -4 . There was no evidence of associations between any of the previously identified variants and metabolic traits in the Common Metabolic Disease Portal (Supplementary Table 3).
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted October 19, 2021. ; https://doi.org/10.1101/2021.10.18.21265159 doi: medRxiv preprint Whilst all of the 6 coding genes had the expected number of missense mutations, four of the six (ACVR2A, ALB, FOXO1, and TNRC6B) are under selection pressure to prevent against haploinsufficiency; in gnomAD there were significantly fewer predicted loss of function (pLoF) variants observed than expected ( Table 1). This implies that there are a reduced number of germline variants in these genes that will cause pLoF, compared to other coding genes and potential haploinsufficiency in these genes is deleterious. The exception to this was CIDEB, where there was an expected number of missense and pLoF mutations, but no reports of these rare variants associated with liver phenotypes. Overall, the specific somatic mutations that we had previously identified are very rare in the germline, due to negative selection pressure and not known to be associated with the development of liver disease.

Among recurrently mutated genes only germline exonic variation in GPAM
is associated with liver phenotypes Given the rarity (or absence) of these specific 129 variants in the germline, we investigated whether other rare variants at these loci were associated with liver disease (or related metabolic traits). We used data from https://genebass.org/ and https://azphewas.com/ (n=281,852 from UK BioBank), which tests whether exonic germline variants either individually, or cumulatively using a gene-based burden method, demonstrated an association with phenotypes.
Analysis of individual coding variants found that p.Ile43Val in GPAM was associated with differences in serum lipids and the synonymous mutation 10-112157327-T-A (p.Pro681Pro) in GPAM was associated with ALT ( Table 2). Exonic variants in other genes were associated with related metabolic traits, but not with liver disease phenotypes ( Table 2 & Supplementary Table   4). Using gene-based burden testing, which adds together all variants within a single category (e.g. pLoF or missense), variants in ALB demonstrated associations with serum lipids, but no other markers of liver disease. Burden testing for pLoF variants in TNRC6B demonstrated a . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted October 19, 2021. ; https://doi.org/10.1101/2021.10.18.21265159 doi: medRxiv preprint significant association with alcohol consumption habits, but no other markers of alcohol-related disease (Supplementary Table 4). Therefore, amongst the 6 recurrently mutated genes only rare germline coding variants in GPAM have been identified to be associated with serum lipid levels, rather than liver phenotypes.
Germline variation at GPAM, but not other somatically mutated genes, is associated with liver phenotypes We next explored whether more common germline variants might be associated with liver phenotypes in previously published datasets. We utilised the summary statistics from the UK BioBank cohort of 420,000 subjects [29] and searched for variants within these seven regions (Figures 1-2, Table 3, & Supplementary Figure 1). As previously described [16,17], we observed genome-wide significant variants within GPAM, associated with elevated serum levels of ALT (Table 2, e.g. rs10787429 C>T beta=.006, p=2.8x10 -39 ), AST, and liver fat by MR imaging (e.g. rs11446981 T>TA beta=-.003, p=3.6x10 -13 ). There were no significant associations between SNPs near any of the other recurrently mutated coding genes and disease correlates in NAFLD or ARLD.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted October 19, 2021. ; https://doi.org/10.1101/2021. 10.18.21265159 doi: medRxiv preprint Germline variation at the long non-coding RNA NEAT1 is associated with liver phenotypes and glycaemic control In the UK BioBank cohort we observed genome-wide significant variants within the lncRNA NEAT1 for elevated serum ALT (Figure 1). This was replicated in the European ancestry ALT meta-analysis, though not in the BioBank Japan ALT results (Figure 2). The lead variant for NEAT1 (rs595366 T>A) was also found to have an eQTL for NEAT1 in both adipose and liver tissue.
However, no associations were identified between variants at NEAT1 and categorical definitions of liver disease (e.g. diagnosis of NAFLD or ARLD, Figures 1-2) or severity of liver disease (e.g. cirrhosis). This was consistent across data from the UK BioBank and FinnGen (Supplementary Given the causal associations between insulin resistance and hepatic steatosis, we investigated whether variants were associated with HbA1c or a diagnosis of type 2 diabetes mellitus (T2DM).
Variants in NEAT1 were associated with HbA1c levels in the UK BioBank (e.g. rs34743766 C>CA, beta=0.1, p=8.5x10 -9 ) and had an associated eQTL for NEAT1 in adipose tissue, but this was not replicated for diagnosis of T2DM or other analyses of HbA1c (Supplementary Figure 1). These associations are interesting as expression levels of NEAT1 have previously been associated with both the presence of diabetes [43] and the progression of diabetic complications [43,44].

Germline variation at recurrently mutated genes is associated with metabolic phenotypes
We then explored whether common variants in or near these regions were associated with other related metabolic traits that affect patients with NAFLD, using a gene-(or region-) based . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)  Table 7) . In addition to its association with ALT, variants in or near GPAM influenced the concentration of many serum lipids (Table 4), including LDL cholesterol and triglyceride levels. We also identified variants near NEAT1 that were associated with a diagnosis of T2DM, waist-to-hip ratio and had a significant eQTL for NEAT1 in adipose tissue. In addition, variants in or near TNRC6B were associated with BMI and creatinine, as were those in ACVR2A. Therefore, germline variation in these genes may impact on related phenotypes that accompany the metabolic syndrome.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

Discussion
Investigating the genetics of fatty liver disease has the potential to inform our understanding of disease biology and facilitate clinical risk stratification for affected patients [3]. We recently identified six protein-coding genes and one lncRNA (NEAT1) that are enriched for somatic mutations in patients with NAFLD or ARLD [11]. In this study we find that germline variation in only one, GPAM, was robustly associated with markers of liver disease. Therefore, genetic risk scores capturing only germline variation will not include the pathogenicity conveyed by these regions and additional strategies may be needed to understand the prognostic implications of these somatic mutations.
For five (ACVR2A, ALB, CIDEB, FOXO1, TNRC6B) of the six protein-coding genes under investigation in this study, we found no evidence for association between germline variation and liver disease. These observations were consistent across multiple data sources, traits, genetic ancestries, and analysis methodologies, for example both common variant analyses and rarevariant gene burden testing. This was an unexpected observation, given the genetic evidence for their selective advantage in hepatocyte clones in diseased non-malignant NAFLD and ARLD [11].
Moreover, CIDEB and FOXO1 have well-established functions as a lipid droplet-associated protein [12] and a component of the insulin signalling cascade [14], respectively. Conversely, our previous study did not identify acquired somatic mutations in genes with strong germline associations with liver disease (e.g. PNPLA3, TM6SF2). Collectively, this suggests that the influence of germline and acquired variants on parenchymal liver disease occurs through independent mechanisms (and genes).
It should be noted that the methodology employed in this study does not exclude the possibility that individuals with rare loss of function mutations in these genes may have liver-related phenotypes. Identification and studying human knock-outs for these genes is an alternative . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted October 19, 2021. ; https://doi.org/10.1101/2021.10.18.21265159 doi: medRxiv preprint strategy for investigating whether germline variation plays a role in liver disease, as has been illustrated for other conditions [45,46]. However, our data suggest that these individuals will be very rare.
Our analysis has replicated the known associations between variants in or near GPAM and liver fat, as well as ALT and AST. Whilst we did not show these variants to influence diagnosis of NAFLD, this has been demonstrated by others [18] with larger sample sizes, variants in GPAM (particularly p.Ile43Val) are associated with radiological and histological diagnosis of NAFLD [16][17][18][19]. However, It is important to note that variants in GPAM have a comparatively small effect size compared to variants in PNPLA3 and TM6SF2, but similar to those in HSD17B13 and MTARC1.
We identified borderline associations between variants in NEAT1, a long non-coding RNA (lncRNA), with ALT and HbA1c. Our more broad analyses implicated NEAT1 in influencing multiple metabolic traits (e.g. serum triglycerides, diagnosis of T2DM, and coronary artery disease), that are of potential relevance to patients with NAFLD. These results point towards a primary role on insulin resistance, potentially through modulation of adipose tissue biology, as several variants in this lncRNA also had significant eQTLs in adipose tissue. The biology of NEAT1 is poorly understood, but there is some in vitro evidence for its role in adipogenesis [47].
We suggest that the subtle effects of germline variants in NEAT1 on ALT are likely indirect, via perturbation of insulin resistance and/or development of T2DM [48], however further work is required to establish this. This data also underlines the principle that the genomic regions enriched for somatic mutations in our original analysis are principally those involved in metabolism.
Clinically, one aim of human genetics is to stratify patients into high-and low-risk groups for disease progression using polygenic gene scores. Such an approach can identify individuals with . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted October 19, 2021. ; https://doi.org/10.1101/2021.10.18.21265159 doi: medRxiv preprint a five-fold increased risk of coronary artery disease [49]. This would be of particular use for NAFLD and ARLD, both common conditions where only a minority of individuals progress to liver-related clinical events. To date there have been four PGS published for liver disease [8,9,50,51], all derived using genome-wide significant hits, and therefore none of our seven genomic regions of interest were included. If a genome-wide PGS were derived [52], which included weighting from sub-genome wide-significant variants, then variants in or near GPAM would contribute. However, they would still receive comparatively minimal weighting compared to variants in PNPLA3 and TM6SF2. More broadly, it is not clear how the magnitude of prognostic implication would compare for germline variation risk scores compared to somatic mutations, as the prognostic implication of these remains unknown. Our results illustrate that integration of somatic mutants into prognostic tools will be a complex process and separate from existing methods for polygenic gene scores.
To understand the role that somatically mutated genes could play in germline PGS there is a need for more independent cohorts with liver-directed exomic or targeted sequencing. This will facilitate PGS derivation and validation against hard clinical outcomes (e.g. liver transplantation) to formally test whether variants in or near ACVR2A, ALB, CIDEB, FOXO1, and TCNR6B are prognostic.

Conclusion
Out of seven genomic regions with strong evidence for playing a role in NAFLD and ARLD through somatic mutations, only germline variation in GPAM is associated with liver disease. Other genes analysed, including the lncRNA NEAT1, have a role in metabolic pathways Therefore, genes with pathogenic somatic mutations in NAFLD and ARLD are distinct to those that confer germline risk and would not be captured by polygenic risk scores. Novel approaches will be required to integrate somatic and germline variation with clinical variables for risk prediction algorithms.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted October 19, 2021

Table 4: Associations between common variants in or near regions of interest and related metabolic traits. Gene-based PheWAS performed using Phenoscanner and Common Metabolic
Disease Portal to identify common variants significantly associated with the six protein-coding genes and one lncRNA (NEAT1) of interest. eQTL data were obtained using the Qtlizer package for R.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.       Table 4: Associations between common variants in or near regions of interest and related metabolic traits. Gene-based PheWAS performed using Phenoscanner and Common Metabolic Disease Portal to identify common variants significantly associated with the six protein-coding genes and one lncRNA (NEAT1) of interest. eQTL column provides the gene that have significant eQTLs for that variant, the tissues, and the direction (+, positive eQTL; -, negative eQTL). eQTL data were obtained using the Qtlizer package for R.