Introduction

Angiotensin-converting enzyme inhibitor drugs (ACEi) are frequently used in clinical medicine for the treatment of elevated blood pressure, heart failure and renal protection in chronic kidney disease.1 Although this class of medications is generally well tolerated, adverse reactions may prevent their use in some individuals. The most common side effect is a persistent, nonproductive cough that can start within days to months after initiating therapy and requires cessation of ACEi use.2 Women are 1.5–2 times more likely to develop a cough than men.3 Some epidemiological features of ACEi-induced cough suggest a genetic predisposition to this adverse effect. In particular, there is racial variation in the incidence of cough, with the highest rates observed in east Asian populations where rates are 20–45%,4 as compared with European American populations where rates are approximately 10%.5

The mechanism of ACEi-induced cough is not certain. ACEis block the proteolytic enzyme angiotensin I-converting enzyme (ACE), which cleaves a number of target proteins, including angiotensin I, the primary mediator of the blood pressure-lowering effects, and pro-inflammatory kinins.6 One suspected mechanism of cough is the accumulation of these inflammatory kinins, which may sensitize vagal afferent fibers leading to a neurogenically mediated cough.7 Multiple candidate gene studies have tested for associations between ACEi-induced cough and genetic variation in ACE and bradykinin pathway members. There has been intense focus on the association of cough with a common 287-bp insertion–deletion in the ACE gene.3, 8, 9, 10, 11 A meta-analysis of 11 of these studies found a consistent association only in a subgroup analysis of subjects aged >60 years.12 Significant single-nucleotide polymorphism (SNP) associations have also been reported for mediators in the bradykinin pathway, including the bradykinin B2 receptor (BDKRB2), membrane metallo-endopeptidase (MME), prostaglandin receptor E (PTGER3), neurokinin 2 receptor (NK2R) and ACE.9, 13, 14, 15, 16 However, these findings have not been consistently observed across studies.3, 10, 17, 18 Additional studies also suggest a role for SNPs located within the ABO gene, which have been shown to regulate plasma ACE levels and have also been associated with ACEi cough.10, 14, 19, 20

In the present study, we used a genome-wide association study (GWAS) approach to carry out a comprehensive analysis of genetic determinants of ACEi-induced cough. Subjects were identified through the Electronic Medical Records and Genomics (eMERGE) network, a consortium of medical centers that utilize electronic medical records (EMR) as a tool for genomic research.21 Candidate SNPs were first identified using a multi-racial discovery cohort. The most significant candidate SNPs were then evaluated in two independent replication cohorts. We report here a significant signal at the KCNIP4 locus indicating that variant potassium channel function or neuronal signaling contributes to the risk of a cough.

Materials and methods

Study population

The discovery study population comprised 7080 adult subjects collected from sites participating in phases I and II of the eMERGE Network (Phase I: Vanderbilt University (VUMC), Marshfield Clinic, Northwestern University, Mayo Clinic and Group Health Research Institute; and Phase II: Geisinger Health System and Mount Sinai).21, 22 Subjects used for the discovery population were selected to maximize the representation across sites while minimizing the number of genotyping platforms represented. In addition, cases and controls who were genotyped on OMNI-QUAD platforms as part of the Vanderbilt Electronic Systems for Pharmacogenomic Assessment (VESPA) study, which examined the genomics of drug response phenotypes, were also included.23 For subgroup analyses, genetic ancestry assignment was determined using STRUCTURE24 in conjunction with 1917 ancestry informative markers (from the Illumina Test Panel25), with European ancestry defined as >90% probability of being in the CEU cluster and African ancestry defined as >70% probability of being in the YRI cluster, using a HapMap population as the reference (Supplementary Figures S1–S3). Site-specific subject counts are shown in Supplementary Table S1.
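The ancestry thresholds described above can be illustrated with a minimal sketch. It assumes that per-subject cluster membership probabilities have already been estimated (for example by STRUCTURE); the function name, the dictionary layout and the "Other" label are hypothetical and are not part of the study's pipeline.

```python
def assign_ancestry(cluster_probs, ceu_cutoff=0.90, yri_cutoff=0.70):
    """Label a subject from cluster membership probabilities.

    cluster_probs: dict mapping a reference cluster name ("CEU", "YRI", ...)
    to the subject's estimated probability of belonging to that cluster.
    The cutoffs follow the text: >90% CEU for European ancestry and
    >70% YRI for African ancestry. The "Other" label is an assumption.
    """
    if cluster_probs.get("CEU", 0.0) > ceu_cutoff:
        return "European"
    if cluster_probs.get("YRI", 0.0) > yri_cutoff:
        return "African"
    return "Other"


# Hypothetical subjects, for illustration only.
labels = [
    assign_ancestry({"CEU": 0.97, "YRI": 0.01}),
    assign_ancestry({"CEU": 0.15, "YRI": 0.80}),
    assign_ancestry({"CEU": 0.55, "YRI": 0.40}),
]
```

Both comparisons are strict, so a subject with exactly 90% CEU probability would not be assigned to the European ancestry subgroup in this sketch.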

Phenotype data

The phenotype evaluated was cough attributable to the use of an ACEi. The algorithm identifying cases and controls was developed at VUMC and subsequently validated and deployed at the other eMERGE sites. The phenotype definition incorporated an iterative process whereby automated case and control assignment algorithms were validated against assignments made by manual review of the EMR.26 Phenotyping algorithms were refined until the positive predictive value reached the predesignated target of 95%. In the final algorithms, cases were defined as subjects whose records contained either an ACEi drug name or ACEi class designator and ‘cough’ on the same line within the structured ‘Allergy’ section of the medical record. Hence, cases represent ACEi-induced cough recorded by a health-care provider. Controls were subjects who had an ACEi drug name or ACEi class on two medication listings with dates separated by at least 6 months and did not have a documented cough associated with ACEi use in the Allergy section. Complete details of the algorithm are available from PheKB (http://phekb.org/phenotypes). The mean positive predictive value for cases and controls in the VUMC and replication sites was 100% and 97.5%, respectively (Supplementary Table S2).
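The case and control rules described above can be sketched as follows. This is an illustration under stated assumptions, not the validated PheKB algorithm: the drug-name list is a small illustrative subset, matching is a plain substring search, the record layout is hypothetical, and "6 months" is approximated as 182 days.

```python
from datetime import date

# Illustrative subset of ACEi names and class designators (assumption);
# the complete list belongs to the published algorithm.
ACEI_TERMS = ("ace inhibitor", "acei", "lisinopril", "enalapril",
              "captopril", "ramipril")


def mentions_acei(text):
    """True if the text contains an ACEi drug name or class designator."""
    lowered = text.lower()
    return any(term in lowered for term in ACEI_TERMS)


def classify_subject(allergy_lines, medication_entries, min_gap_days=182):
    """Return "case", "control" or None (subject not used).

    allergy_lines: lines of the structured Allergy section (strings).
    medication_entries: (drug_text, date) pairs from medication listings.
    """
    # Case: an ACEi term and 'cough' appear on the same Allergy line.
    for line in allergy_lines:
        if mentions_acei(line) and "cough" in line.lower():
            return "case"
    # Control: ACEi listed on two dates at least ~6 months apart,
    # with no ACEi cough documented in the Allergy section.
    acei_dates = sorted(d for drug, d in medication_entries
                        if mentions_acei(drug))
    if len(acei_dates) >= 2 and \
            (acei_dates[-1] - acei_dates[0]).days >= min_gap_days:
        return "control"
    return None


example = classify_subject(
    ["Penicillin - rash"],
    [("lisinopril 10 mg", date(2010, 1, 1)),
     ("lisinopril 10 mg", date(2010, 9, 1))],
)
```

Subjects who meet neither rule are returned as None in this sketch, reflecting that only subjects with a recorded reaction or with documented sustained exposure contribute to the analysis.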

Birth decade and sex were also extracted, in addition to diagnosis codes (International Classification of Diseases (ICD)-9 codes27) related to comorbidities potentially contributing to the risk of cough, including asthma, postnasal drip, gastroesophageal reflux disease, bronchitis, emphysema, bronchiectasis, allergic alveolitis and chronic obstructive pulmonary disease. Lists of the ICD-9 codes used to define each of these conditions are shown in Supplementary Table S3. Subjects with at least one ICD-9 code for any of the conditions were considered to have the comorbidity. Smoking status, categorized as ‘Ever’, ‘Never’ or ‘Missing’, was also extracted from either structured EMR data or validated algorithms.28

Genotyping data

SNP genotype data were acquired on the Illumina HumanOmni1-Quad (Vanderbilt), HumanOmni5-Quad (VUMC), Human1M-Duov3_B (VUMC), HumanOmniExpress-12v1.0 (Geisinger), HumanOmniExpress (Mount Sinai) and Human660W-Quadv1_A (all other sites and VUMC) (Supplementary Table S1). Genotyping data for each platform were individually cleaned. Quality control (QC) steps included identifying sex mismatches, SNPs failing concordance with HapMap, Mendelian errors and duplicate removal. After QC, a merged data set was created that contained 267 485 SNPs present on all platforms and with a call rate >98%. Cryptic relatedness was assessed on the merged platforms by identical-by-descent analysis, and one of a pair of subjects (n=343 total) more closely related than half-siblings was randomly excluded. Imputation was performed on the merged intersection data set using IMPUTE229 in conjunction with the 1000 Genomes phase 3 reference panel for all populations. Prior to imputation, strand alignment between study and reference genotypes and prephasing of study genotypes into haplotypes was performed using SHAPEIT.30 Only those imputed SNPs with a genotype probability >90% were analyzed. SNPs with an Info score29 (measuring the average probabilities for a given SNP) <0.7 were excluded from analyses. The final analyses were restricted to 1 931 830 SNPs with a call rate >98%, a Hardy–Weinberg P>1 × 10−6 and minor allele frequency (MAF)>0.01. Principal components (PCs) fit to the preimputed SNP data set were computed using EIGENSTRAT31 to adjust for population structure.
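The SNP-level filters listed above (call rate >98%, Hardy–Weinberg P>1 × 10−6, MAF>0.01 and Info score of at least 0.7) can be sketched from genotype counts. The function names are hypothetical, and the Hardy–Weinberg test below is a simple 1-degree-of-freedom chi-square approximation chosen for brevity; it is an assumption and not necessarily the test implemented in the software used for the study.

```python
import math


def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from counts of the three genotypes."""
    n = n_aa + n_ab + n_bb
    freq_b = (2 * n_bb + n_ab) / (2 * n)
    return min(freq_b, 1.0 - freq_b)


def hwe_chisq_p(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chisq / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_missing, info=None):
    """Apply the call-rate, MAF, HWE and (for imputed SNPs) Info filters."""
    called = n_aa + n_ab + n_bb
    if called == 0 or called / (called + n_missing) <= 0.98:
        return False
    if maf(n_aa, n_ab, n_bb) <= 0.01:
        return False
    if hwe_chisq_p(n_aa, n_ab, n_bb) <= 1e-6:
        return False
    if info is not None and info < 0.7:
        return False
    return True


# Hypothetical genotype counts, for illustration only.
keep = passes_qc(490, 420, 90, n_missing=5, info=0.95)
```

The MAF check runs before the Hardy–Weinberg test so that monomorphic SNPs are rejected without evaluating an uninformative test.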

The primary single SNP tests of association were performed using logistic regression assuming an additive genetic model, adjusting for 10 PCs, birth year and sex. A second model was also evaluated that additionally incorporated binary covariates for smoking status (captured by two binary variables: ‘Ever smoked’ and ‘Never smoked’) and each of the comorbidities of asthma, postnasal drip, gastroesophageal reflux disease and lung disease (bronchitis, emphysema, bronchiectasis, allergic alveolitis and chronic obstructive pulmonary disease).
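To make the additive model concrete, the following minimal sketch fits a logistic regression of case status on allele dosage (0, 1 or 2 copies of the tested allele) by Newton–Raphson. It is a simplification: the PC, birth year, sex and comorbidity covariates used in the study are omitted, and the function name and data are hypothetical.

```python
import math


def fit_additive_logistic(dosages, is_case, iterations=25):
    """Fit logit(P(case)) = b0 + b1 * dosage by Newton-Raphson.

    Returns (per-allele odds ratio, standard error of b1).
    Covariates are omitted for brevity (assumption of this sketch).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of the information matrix
        for x, y in zip(dosages, is_case):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) * gradient (2x2 case).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Variance of b1 is the matching diagonal entry of the inverse
    # information matrix, evaluated at the (converged) estimates.
    se_b1 = math.sqrt(h00 / det)
    return math.exp(b1), se_b1


# Hypothetical data: 10 cases / 30 controls with dosage 0,
# 20 cases / 20 controls with dosage 1.
dosages = [0] * 40 + [1] * 40
status = [1] * 10 + [0] * 30 + [1] * 20 + [0] * 20
odds_ratio, se = fit_additive_logistic(dosages, status)
```

Under the additive model the odds ratio is interpreted per copy of the allele, so a subject carrying two copies has the squared odds ratio relative to a subject carrying none.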

Replication cohorts

SNPs with an association P-value <5 × 10−6 were evaluated in two replication populations. The first replication set comprised additional subjects available through the eMERGE network who had SNP genotyping as part of eMERGE Phase II in addition to 53 additional cases and controls from VUMC that were not included in the original analyses. Covariate and phenotype extraction from the EMR was as described above. Subjects were genotyped on the Affymetrix Human SNP Array 6.0 (Mount Sinai, Marshfield), Human660W-Quad (Marshfield), Human610-Quad (Mayo) and HumanOmniExpress (Group Health, Northwestern) platforms (Supplementary Table S4). QC steps, as described above, were performed per the QC protocol established by the eMERGE Genomics Working Group.32 Each data set was then individually imputed by platform using IMPUTE229 and the 1000 Genomes cosmopolitan reference panel.33 Imputed data were extracted and merged using the same QC filters as described above. The replication analyses were conducted in 157 cases and 769 controls that had >80% probability of being in the CEU cluster by STRUCTURE analyses. Multivariable analyses adjusting for PCs and covariates were performed, as described above. A replication P-value <0.05 was considered statistically significant.

The second replication set was derived from the GoDARTS (Genetics of Diabetes Audit and Research in Tayside, Scotland) 2011 cohort and contained prescription information between 1990 and 2011 for 17 601 self-reported Caucasian diabetics and non-diabetics from the Tayside area. Cases were defined as subjects who switched from an ACEi to an angiotensin receptor blocker, a drug that targets a downstream receptor in the angiotensin–renin–aldosterone pathway. The positive predictive values of a possible and probable ACEi adverse drug reaction using this case definition were 90.5% and 68.3%, respectively. The most frequent adverse reactions associated with this case definition were cough and the rare reaction of angioedema. Controls were defined as those who had filled ACEi prescriptions within 9 months of the study’s censor date or their date of death. Subjects who were ever concurrently on ACEis and angiotensin receptor blockers were also categorized as controls. A total of 710 cases and 3599 controls were available for analysis. Subjects were genotyped on the Affymetrix 6.0 (Affymetrix, Santa Clara, CA, USA) (381 cases and 1924 controls) or Illumina HumanOmniExpress (Illumina, San Diego, CA, USA) (329 cases and 1675 controls) platforms. Both platforms were imputed using IMPUTE229 and the 1000 Genomes reference panel. SNPs deviating from Hardy–Weinberg equilibrium (P<1 × 10−6) or with an Info Score<0.4 were excluded. An additive genetic model adjusting for age and sex was first computed separately for each genotyping platform and the results were meta-analyzed using GWAMA.34
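Combining per-platform results can be illustrated with a fixed-effect, inverse-variance weighted meta-analysis of log odds ratios. This sketch is an assumption about the general approach and does not reproduce GWAMA's implementation or options; the function names and input values are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.959964):
    """Standard error of a log odds ratio recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def fixed_effect_meta(estimates):
    """Inverse-variance weighted meta-analysis.

    estimates: list of (odds_ratio, standard_error_of_log_odds_ratio).
    Returns the pooled odds ratio, its standard error on the log scale,
    a two-sided P-value and Cochran's Q heterogeneity statistic.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * math.log(or_)
                 for w, (or_, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    z = pooled / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P
    q = sum(w * (math.log(or_) - pooled) ** 2
            for w, (or_, _) in zip(weights, estimates))
    return {"odds_ratio": math.exp(pooled), "se": se,
            "p_value": p_value, "q": q}


# Hypothetical per-platform results, for illustration only.
platform_a = (1.20, se_from_ci(1.00, 1.44))
platform_b = (1.10, se_from_ci(0.92, 1.32))
combined = fixed_effect_meta([platform_a, platform_b])
```

Each platform contributes in proportion to the precision of its estimate, so the larger or less noisy data set dominates the pooled odds ratio.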

Data analysis

All QC analyses and GWA analyses were performed using PLINK v1.07.35 GoDARTS data were also analyzed using SNPTEST v2.536 and GWAMA v2.1.34 All other analyses were performed using SAS v9.3 (SAS Institute, Cary, NC, USA). SNP data around the KCNIP4 gene were visualized using LocusZoom37. The forest plot was generated using the Metafor package.38 The publicly available GTEx39, HaploReg v340 and the NCBI expression quantitative trait locus (eQTL) databases were used to identify eQTLs and functional motifs associated with the most significant candidate SNPs.

Ethics statement

The eMERGE study was approved by the Institutional Review Board at each site.21, 41 The GoDARTS study was approved by the Tayside Medical Ethics Committee and informed consent was obtained for all participants.

Results

A total of 1595 cases of ACEi-induced cough and 5485 controls were analyzed in the discovery cohort (Table 1). The majority of subjects were of European ancestry, with about 15% of subjects belonging to other racial/ethnic groups. As compared with controls, the cases had a higher proportion of females (P<0.001), were younger, as measured by birth decade (P<0.001), were more likely to have been smokers (P<0.001) and differed with respect to frequencies of diagnoses of gastroesophageal reflux disease (P<0.001), postnasal drip (P<0.001) and structural lung disease (P<0.001), all of which may contribute to a chronic cough. However, the directions of associations for these variables were not consistent, as smoking and lung disease were associated with decreased rates of ACEi cough, whereas the other variables were associated with increased rates.

Table 1 Population characteristics of the discovery cohort

In multivariable regression analyses adjusting for PCs, sex and birth year, two SNPs located on chromosome 4 reached genome-wide significance (Figure 1, Supplementary Figure S4). All of the SNPs with an association P-value <5 × 10−6 were located in intron 4 of the gene ‘Kv Channel Interacting Protein 4’ (KCNIP4) (Figure 2 and Supplementary Figure S5). The strongest association was with the imputed SNP rs145489027 (MAF=0.30, odds ratio (OR)=1.3 (95% confidence interval (CI): 1.2–1.4), P=1.0 × 10−8; Table 2). When results were stratified by site, there was a consistent direction of association across sites, with ORs ranging from 1.1 to 1.5 (Supplementary Table S5). When the analyses were additionally adjusted for cough risk factors, the association at the SNP remained significant (OR=1.3 (95% CI: 1.2–1.4), P=1.0 × 10−8; Supplementary Figure S6). Similar results were observed when subjects with lung disease were excluded (OR=1.3 (95% CI: 1.2–1.5), P=3.0 × 10−8; Supplementary Figure S7). A regression model adjusting for the rs145489027 SNP eliminated the associations (P>0.05) for the other most significant SNPs, indicating that the SNPs represent a common signal (the r2 between this SNP and other most significant SNPs was 0.68 for rs7661530 and rs6838116 and 0.98 for rs7675300, rs16870989 and rs1495509 in a 1000G European ancestry population). In a subset analysis of African Americans and European Americans, this SNP was most strongly associated with cough in European Americans (MAF=0.33, OR=1.3 (95% CI: 1.2–1.4), P=2.5 × 10−7; Table 3). Although the direction of the effect was similar in African Americans, these associations were not significant in this smaller subset of subjects (n=879). ACEi-induced cough has been observed to occur more frequently in women.3 In analyses stratified by sex, ORs were similar for men (OR=1.3 (95% CI: 1.2–1.5), P=1.4 × 10−4) and women (OR=1.3 (95% CI: 1.1–1.5), P=1.1 × 10−5).
A search of publicly available data sets did not identify eQTLs or functional motifs associated with the most significant SNPs. We specifically examined the associations using all SNPs located in 16 previously reported candidate genes (see Supplementary Materials). No significant associations (P<0.05) were observed after applying a Bonferroni correction adjusting for multiple testing or in a subset analysis of older subjects born before 1960.
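The Bonferroni correction referred to above divides the nominal significance level by the number of tests performed. A minimal sketch follows, with hypothetical function names and P-values; the number of SNPs tested in the candidate genes is not stated here and is therefore left as an input.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each P-value that remains significant after correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


# The conventional genome-wide threshold of 5e-8 corresponds to a
# Bonferroni correction of alpha = 0.05 for one million tests.
genome_wide = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical candidate-gene P-values, for illustration only.
flags = significant_after_bonferroni([0.001, 0.04, 0.2])
```

With three tests the corrected threshold is 0.05/3, so a nominal P-value of 0.04 is no longer declared significant.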

Figure 1

Manhattan plot of genotyped single-nucleotide polymorphisms associated with angiotensin-converting enzyme inhibitor-induced cough using an additive model adjusted for 10 principal components, age, sex, history of asthma, reflux, postnasal drip and lung disease. The red line indicates the genome-wide significance threshold of alpha=5 × 10−8.


Figure 2

LocusZoom plot of the most strongly associated single-nucleotide polymorphisms (SNPs) from the genome-wide association study located in the region of KCNIP4 (chr4:20330238–22350374), centered around SNP rs145489027 (shown in purple). Linkage disequilibrium values (r2) with respect to rs145489027 are based on the CEU reference population. Imputed SNPs are denoted by squares and genotyped SNPs by circles.


Table 2 Most significantly associated SNPs
Table 3 Association analyses by racial groups for SNP rs145489027

We sought replication for the SNPs in KCNIP4 with an association P-value <5 × 10−6 in two independent data sets (Supplementary Tables S6 and S7). Of the six SNPs identified in the discovery cohort, five were available in each of the final QC’ed eMERGE and GoDARTS replication sets. In both replication sets, the MAFs and association statistics were similar to those observed in the European ancestry subjects in the discovery population (Table 2). The most significantly associated SNP from the discovery analysis (rs145489027) did not reach statistical significance in either replication population (Table 2). One SNP (rs7675300) was significantly associated with ACEi cough in the eMERGE set (OR=1.32 (1.01–1.70), P=0.04). Two SNPs were significantly associated in the GoDARTS set: rs16870989 (OR=1.15 (1.01–1.30), P=0.03) and rs1495509 (OR=1.15 (1.01–1.30), P=0.03). In a meta-analysis performed using both replication sets, three of the four SNPs present in both sets had an association at P<0.05 (rs16870989: OR=1.17 (1.03–1.34), P=0.009; rs145489027: OR=1.15 (1.01–1.31), P=0.02; and rs1495509: OR=1.17 (1.03–1.34), P=0.009; Supplementary Table S8). In a meta-analysis across all discovery and replication populations, four of the most significant SNPs reached genome-wide significance (Table 2), with the non-imputed SNP rs1495509 (highlighted in Figure 2) showing the strongest association (Figure 3). The combined association statistic across the replication sets at this SNP was OR=1.23 (1.15–1.32), P=1.9 × 10−9, and did not demonstrate significant heterogeneity across studies (P=0.42).

Figure 3

Forest plot for the association between angiotensin-converting enzyme inhibitor cough and KCNIP4 intronic single-nucleotide polymorphism rs1495509 for the discovery (eMERGE (Electronic Medical Records and Genomics) Network) and replication (eMERGE and GoDARTS (Genetics of Diabetes Audit and Research in Tayside, Scotland)) data sets. There was no heterogeneity across studies (P=0.43). CI, confidence interval.


Discussion

We describe the first large GWAS investigating SNP variants associated with clinically diagnosed ACEi-associated cough. All phenotype and genotype data for this study were derived from clinical research settings that incorporate an EMR data system. We found significant associations in a set of intronic SNPs located within the gene KCNIP4. Several of these significant associations were independently replicated in two European ancestry populations. In summary, these analyses identify a novel candidate gene that may have a role in this common adverse reaction.

The SNPs showing the strongest association with the cough phenotype are located exclusively in intron 4 of KCNIP4. KCNIP4 is a member of the KChIP family of EF hand (helix-loop-helix)-containing calcium-binding proteins. A major function attributed to KCNIP4 is the regulation of Kv4 potassium channels, which are significant contributors to action potential activity in neurons and cardiac myocytes.42 The amino terminus of the KCNIP4 gene product undergoes extensive alternative splicing and at least six splice variants have been described.43 Of note, several of these splice variants remove the flanking exons surrounding the intronic region identified in these analyses. Alternative splicing has been shown to have functional significance and can result in modulation of Kv4 channel functions and their subcellular location.42, 44 KCNIP4 isoforms are predominantly expressed in neuronal structures in the brain and spinal cord, though some isoforms are found in the kidney, stomach and small intestine.43 Little-to-no expression of the gene has been observed in lung extracts.45, 46

The most strongly associated SNPs are located within a single intron of KCNIP4. These SNPs are not in linkage disequilibrium with an amino-acid changing variant that would alter the primary structure of the protein. Hence, it is more likely that these SNPs have a regulatory role likely related to mRNA splicing or expression. A leading hypothesized mechanism of the ACEi-induced cough is stimulation of sensory nerve afferents within the lung resulting from the accumulation of inflammatory mediators that are normally cleaved by the ACE enzyme.47 This hypothesis has served as the basis for candidate gene studies that have focused on variation in inflammatory pathways within the lung. Our results would suggest that the important source of this variation may be directly related to the sensory nerves themselves, as KCNIP4 has been found to be in both central and peripheral neuronal structures. Indeed, if KCNIP4 expression in the lung were restricted to sensory nerves, its protein and mRNA levels would be expected to be low in samples derived from lung whole-cell homogenates, especially if samples were taken from the lung periphery. This restricted expression pattern could account for why KCNIP4 has not been detected in the lung. In further support of a role for KCNIP4 in lung physiology, a GWAS in mice identified an association between KCNIP4 and airway hyper-responsiveness, which was confirmed in studies of human asthma and airway hyper-reactivity.48

Epidemiological studies have identified several factors associated with an increased risk of ACEi-induced cough, including sex and race. In particular, the prevalence of cough is higher among women and east Asian populations.3, 4 In our replication sets, the allele frequencies and effect sizes in subjects of European ancestry were generally comparable to those observed in the discovery set. The ORs were weaker in the GoDARTS replication set, which is likely attributable to the lower positive predictive value and specificity of the case definition (switching from an ACEi medication to an angiotensin receptor blocker) used in this cohort.49, 50 Among African Americans, the associated alleles had lower MAFs. Although the association statistics trended in the same direction, they did not reach statistical significance. However, there was only 5% power to replicate the association in this group owing to small numbers of subjects and lower allele frequencies. Analyses across races suggest that this genomic region around KCNIP4 might be a trans-population risk factor and that allele frequencies may contribute to prevalence differences in ACEi-induced cough. However, a comparison of the MAFs for the most significantly associated SNPs among HapMap European and Asian populations shows that the minor allele frequencies are very similar in these groups. Hence, racial differences in the frequencies of these SNPs would not account for the prevalence differences between these populations, which suggests that other independent or modifying genetic factors may be contributing to racial differences.

There are several limitations to this study. Cases and controls were identified using EMR data sources. A limitation of EMR data is that data collection is not systematic and can be incomplete. These limitations can lead to both differential and nondifferential misclassification, which can skew or weaken associations. For instance, subjects who had been switched from ACEi therapy due to cough prior to the time period captured by the EMR could be inappropriately assigned to the control group. In addition, information pertaining to ACEi dosage, treatment duration and indication could not be systematically extracted for analysis. Hence, the contributions of these factors to ACEi cough could not be evaluated. We were also unable to incorporate elements of ACEi cough phenotype definitions that have been used in some studies, such as evaluating the effect of cessation of an ACEi on cough at a fixed time interval,18 as these protocols are not standard in clinical practice. The study also had limited power to detect associations for SNPs with an MAF<10%. Hence, the contribution of low frequency variants to the phenotype could not be quantified. Genotyping was performed using a number of commercial SNP-genotyping platforms. Imputation across multiple genotyping platforms can give rise to systematic frequency differences, which can lead to inflated type I error in GWAS.51 This problem is exacerbated when there are case and control imbalances across platforms. To attenuate this bias, the genotype data in the discovery set were imputed from an intersection of SNPs across each platform being evaluated, an approach that has been shown to decrease type I error rates.52 We did not demonstrate a functional role for KCNIP4 in the cough phenotype. Hence, it is possible that the SNPs we identified are involved in the regulation of a nearby gene unrelated to KCNIP4, as has been reported for a number of other SNPs located within or in close proximity to a gene.53, 54 Epidemiological studies have observed higher rates of ACEi-induced cough in some Asian populations. This study did not assess the risks of the candidate SNPs in this population.

In conclusion, we used a GWAS to identify SNP variants associated with ACEi-induced cough using data derived from eMERGE, a network of medical centers that utilize electronic medical records as a tool for genomic research. We identified SNPs in the intron of the gene KCNIP4 as potential candidates for this adverse reaction. The mechanisms by which KCNIP4 may contribute to the cough are not known, and functional studies are needed to further elucidate the pathophysiological mechanisms.