Abstract
Background Kidney renal clear cell carcinoma (KIRC) has the highest invasion, mortality and metastasis of the renal cell carcinomas and seriously affects patients’ quality of life. However, the composition of the immune microenvironment and regulatory mechanisms at transcriptomic level such as ceRNA of KIRC are still unclear.
Methods We constructed a ceRNA network associated with KIRC by analyzing the long noncoding RNA (lncRNA), miRNA and mRNA expression data of 506 tumor tissue samples and 71 normal adjacent tissue samples downloaded from the Cancer Genome Atlas (TCGA) database. In addition, we estimated the proportion of 22 immune cell types in these samples through “The Cell Type Identification by Estimating Relative Subsets of RNA Transcripts”. Based on the ceRNA network and immune cells screened by univariate Cox analysis and Lasso regression, two nomograms were constructed to predict the prognosis of patients with KIRC. Receiver operating characteristic curves (ROC) and calibration curves were employed to assess the discrimination and accuracy of the nomograms. Consequently, co-expression analysis was carried out to explore the relationship between each prognostic gene in a Cox proportional hazards regression model of ceRNA and each survival-related immune cell in a Cox proportional hazards regression model of immune cell types to reveal the potential regulatory mechanism.
Results We established a ceRNA network consisting of 12 lncRNAs, 25 miRNAs and 136 mRNAs. Two nomograms containing seven prognostic genes and two immune cells, respectively, were successfully constructed. Both ROC [Area Under Curves (AUCs) of 1, 3 and 5-year survival in the nomogram based on ceRNA network: 0.779, 0.747 and 0.772; AUCs of 1, 3 and 5-year survivals in nomogram based on immune cells: 0.603, 0.642 and 0.607] and calibration curves indicated good accuracy and clinical application value of both models. Through co-correlation analysis between ceRNA and immune cells, we found both LINC00894 and KIAA1324 were positively correlated with follicular helper T (Tfh) cells and negatively correlated with resting mast cells.
Conclusions Based on the ceRNA network and tumor-infiltrating immune cells, we constructed two nomograms to predict the survival of KIRC patients and demonstrated their value in improving the personalized management of KIRC.
Introduction
Renal cell carcinoma is a malignant kidney tumor that the American Cancer Society reports was diagnosed in 65,340 patients, and caused 14,970 deaths, in the United States during 2018 (1). It is a serious health problem worldwide, and the economic burden of the disease has steadily increased during the last century (2). Kidney renal clear cell carcinoma (KIRC) is the subtype of renal cell carcinoma with the highest invasiveness, mortality and metastasis rates (3); it is typically screened and diagnosed through computed tomography (CT), magnetic resonance imaging (MRI) and pathological section testing in the clinic (4). Several difficulties continue to make KIRC a challenge to treat, including its stealth in early stages, a lack of effective biomarkers, radiation risks and the high cost of diagnostic imaging required. Underlying biomarkers and molecular mechanisms for the prevention, diagnosis and prognosis of KIRC require further studies.
Long noncoding RNAs (lncRNAs) contain more than 200 nucleotides without the function of encoding proteins and have achieved attention recently because of their potential for exploring novel biomarkers in diseases and for elucidating mechanisms of biological processes (5, 6). Current research reveals that lncRNA plays an important role in the genesis and development of tumors through interactions with competing endogenous RNA (ceRNA) or by acting as microRNA sponges (7). Tumor-infiltrating immune cells are a major component of the tumor microenvironment and affect the clinical outcomes and pathological staging of multiple tumor types (8, 9). Some studies suggest lncRNA can affect tumor development and tumor immune cell microenvironment through the ceRNA network (10-12). Therefore, exploring interactions between the ceRNA network and various immune cells in the development of KIRC is essential.
In this study, we construct a ceRNA network related to the occurrence and development of KIRC using transcriptome and clinical data from the Cancer Genome Atlas (TCGA) to explore potential molecular mechanisms. In addition, we used “The Cell Type Identification by Estimating Relative Subsets of RNA Transcripts algorithm” (CIBERSORT) algorithm to assess differences in the composition of immune cells within tumor tissues and normal tissues. Furthermore, predictive nomograms were constructed based on the ceRNA network and on significant immune cells. Finally, we evaluated the relationship between tumor-infiltrating immune cells and the ceRNA network to identify the underlying regulatory mechanisms.
Materials and methods
Data source and selection
A flowchart illustrating the research process of our study is provided in Figure 1. Data on lncRNA, miRNA and mRNA expression of KIRC patients were downloaded from the Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/) through the GDC data transfer tool. Relevant clinical information including age, gender, survival data (vital status, days to death and days to last follow-up), grade and TNM stage of tumor tissues was also obtained for further analysis. We excluded samples with incomplete survival or pathological staging data, as well as duplicate samples. Only samples with both mRNA and miRNA expression profiles were included in this study. Correspondingly, the adjacent and paired normal tissues collected from the included patients was setted as the control group.
Differential expression analysis
Based on the HTSeq-Counts data of included samples, we used the “DESeq2” package (13) (http://bioconductor.org/packages/DESeq2/) of Bioconductor to screen for differentially expressed lncRNAs (DElncRNAs), miRNAs (DEmiRNAs) and mRNAs (DEmRNAs) between KIRC tissue and normal adjacent tissue. The log fold change (FC) criterion for differential expression was set as log2(FC) > 1 and the false discovery rate (FDR) at < 0.05. For the differential expression results of each type of RNA, volcano plots and heatmaps were generated through R packages “ggpolt2” and “pheatmap.”
Construction of lncRNA-miRNA-mRNA ceRNA network and survival analysis
Using the differential expression results, we predicted the miRNAs-lncRNAs interactions and the miRNAs-mRNAs interactions based on the starBase v2.0 database (14, 15) (http://starbase.sysu.edu.cn/starbase2/index.php), which contained the miRNA-target interactions overlapped with CLIP-Seq data. We conducted hypergeometric tests and correlation analysis to filter the predicted interactions with FDR < 0.05 as the screening criteria. In addition, the lncRNA-miRNA-mRNA regulatory network of KIRC was visualized using Cytoscape v.3.8.1 (http://cytoscape.org/) (16). Besides, we conducted a Kaplan-Meier (K-M) survival curve analysis to assess the prognostic value of genes in the KIRC ceRNA network.
ceRNA network: Cox proportional hazards regression model
Firstly, significant variables were screened for integration into the initial Cox model using univariate Cox analysis. Next, a least absolute shrinkage and selection operator (Lasso) regression was employed to further evaluate the fitness of multifaceted models and filter variables. Finally, we generated a nomogram of from multivariate Cox proportional hazards regression models to predict the prognosis of KIRC patients. To confirm the discrimination and precision of the nomogram, receiver operating characteristic curves (ROC) and calibration curves were employed. In addition, we calculated the risk score of each patient and then divided patients into high and low risk groups based on the median to conduct a K-M survival curve analysis.
Tumor-infiltrating immune cells: CIBERSORT estimation and survival analysis
To analyze differences between the two groups at the level of immune cells, and explore the connection between key biomarkers in the ceRNA network and immune cells, we estimated the proportion of 22 kinds of immune cell types in 506 tumor tissues and 71 normal adjacent tissues by CIBERSORT (17) (http://cibersort.stanford.edu/), an algorithm for characterizing cell composition of complex tissues from their gene expression profiles. Only samples with CIBERSORT output of P < 0.05 were included in further analysis. After screening samples, we searched for immune cells with significant differences between cancer tissues and normal adjacent tissues, as indicated by a Wilcoxon rank-sum test. A K-M survival curve analysis was performed to evaluate the relationship between the content of different immune cells and the overall survival of KIRC patients.
Tumor-infiltrating immune cells: Cox proportional hazards regression model
Immune cells with significant effects in univariate Cox analysis were integrated into the initial Cox model and the Lasso regression again used to evaluate the fitness of multifaceted models and filter variables further. As above, we constructed a nomogram based on the multivariate model of immune cells, and used ROC and calibration curves to confirm the discrimination and precision of the nomogram. Similarly, we conducted a K-M survival curve analysis by dividing patients into high and low risk groups according to the median of risk score. Finally, a Pearson correlation analysis was used to explore the relationship between genes and immune cells.
Statistical Analysis
Only two-sided P-values < 0.05 was considered statistically significant. All statistical analyses were conducted in R version 4.0.2 software (Institute for Statistics and Mathematics, Vienna, Austria; www.r-project.org) (packages: corrplot, DESeq2, GDCRNATools, ggplot2, glmnet, pheatmap, rms, survival, survminer, timeROC).
Results
Differential expression in lncRNAs, miRNAs and mRNAs
After screening the samples of TCGA-KIRC, we constructed an experimental group and a control group containing of 506 tumor tissues and 71 normal adjacent tissues, respectively. Compared to the control group, we identified 357 DElncRNAs (287 up-regulated and 70 down-regulated), 132 DEmiRNAs (61 up-regulated and 71 down-regulated) and 3092 DEmRNAs (2032 up-regulated and 1060 down-regulated) using the cutoffs of the |log2(FC)| > 1 and FDR < 0.05 (Figure 2a-g).
Construction of lncRNA-miRNA-mRNA ceRNA network and survival analysis
Using the starBase v2.0 database and ceRNA theory, a lncRNA-miRNA-mRNA ceRNA network consisting of 173 genes (12 lncRNAs, 25 miRNAs and 136 mRNAs) was established (Figure 3a and Table 1). A K-M survival curve analysis to assess the prognostic value of nodes in the ceRNA network revealed 83 genes (10 lncRNAs, 7 miRNAs and 66 mRNAs) related to the overall survival of KIRC patients. Survival curves of the top three (according to P-value) lncRNAs (PVT1, AC005154.1, AC015813.1), miRNAs (hsa-miR-21-5p, hsa-miR-130b-3p, hsa-miR-204-5p), and mRNAs (MXD3, VPS13D, KCNN4) are shown in Figures 3b-d, e-g and h-j, respectively.
ceRNA network: Cox proportional hazards regression model
Univariate Cox analysis applied to filter genes in the ceRNA network retained 97 genes for incorporation into the initial model (Table S1). Lasso regression analysis indicated that 15 genes were statistically significant in this model. Subsequently, a Cox proportional hazards regression model with seven genes was established through further screening of key genes based on the Akaike Information Criterion (AIC). Finally, we constructed a nomogram to predict the 1, 3 and 5-year overall survival probability of KIRC patients. ROC [Area Under Curves (AUCs)] for the model gave 1, 3 and 5-year survivals of 0.779, 0.747 and 0.772, respectively, which reflected the discrimination and precision of the nomogram with calibration curves (Figure 4). Overall survival (based on K-M curves) of the high-risk group is lower than that of the low-risk group (Figure S1). Survival curves of the genes in the model are also provided in Figure S1.
Tumor-infiltrating immune cells: CIBERSORT estimation and survival analysis
The analysis by CIBERSORT identified five normal tissue samples and 206 tumor tissue samples for inclusion in subsequent analysis (P < 0.05) and the proportions of 22 immune cells in each sample are displayed in the histogram and heat map of Figure 5. A Wilcoxon rank-sum test showed that naive B cells (P < 0.001), plasma cells (P < 0.001), CD8 T cells (P = 0.014), CD4 naive T cells (P < 0.001), CD4 memory resting T cells (P = 0.013), regulatory (Tregs) T cells (P = 0.015), gamma delta T cells (P = 0.021), resting dendritic cells (P = 0.016) and resting mast cells (P = 0.048) all varied between the two tissue sample types (Figure 5c). Based on the results of K-M survival curve analysis, five immune cells, including memory B cells, plasma cells, follicular helper T (Tfh) cells, T regulatory (Tregs) cells and resting mast cells were related to overall survival of KIRC patients (Figure S2).The clinical correlation of 22 immune cells is displayed in Figure S3.
Tumor-infiltrating immune cells: Cox proportional hazards regression model
After filtering based on univariate Cox analyses (Table S2), two of 22 immune cells (Tfh and resting mast cells) were incorporated into the initial regression model. Lasso regression analysis and AIC were then employed for further optimization of the model, and both cells were kept in the final Cox proportional hazards regression model. A nomogram was developed to predict the 1, 3 and 5-year overall survival probability of KIRC patients based on the model. Both the ROC (AUCs of 1, 3 and 5-year survivals: 0.603, 0.642 and 0.607) and the calibration curves suggested that the nomogram had good accuracy (Figure 6). The K-M survival curves suggested that significant differences existed in the overall survival of the high versus low-risk group (Figure S4).
Co-expression analysis of key genes and immune cells
Pearson correlation analysis was used to explore the co-expression pattern between 22 types of immune cells (Figure 7a). Similarly, the co-expression relationship between the key biomarkers and immune cells of the two models is illustrated in Figure 7b. Tfh cells are positively correlated with LINC00894 (Figure 7c, R = 0.36, P < 0.001) and KIAA1324 (Figure 7d, R = 0.39, P < 0.001), while resting mast cells are negatively correlated with LINC00894 (Figure 7e, R = −0.32, P < 0.001) and KIAA1324 (Figure 7f, R = −0.30, P < 0.001).
Validation of the correlation between key genes and immune cells
Multiple databases were searched to verify the correlation between the key genes (LINC00894 and KIAA1324) and immune cells (Tfh cells and resting mast cells) identified. B cell lymphoma 6 (BCL6), CXCL13, CD10 (MME), ICOS and PD-1 (PDCD1) were the most common surface markers of Tfh cells in the CellMarker database (Figure S5). In the Oncomine(tm) platform, we explored the expression of KIAA1324 and Tfh markers among various cancers. KIAA1324, BCL6, CXCL13, ICOS and PDCD1 were highly expressed and MME was downregulated in KIRC compared to normal kidney tissue (Figure S6); median rank and P-values are listed in Table S3.
Using the Gene Expression Profiling Interactive Analysis (GEPIA), KIAA1324 is positively correlated with CXCL13, ICOS and PDCD1 in KIRC (Figure S7), and LINC00894 is positively correlated with BCL6, ICOS and PDCD1 in KIRC (Figure S8). Similarly, the results of the LinkedOmics database showed that KIAA1324 was positively correlated with BCL6, CXCL13, ICOS and PDCD1 in KIRC. In addition, the expression of KIAA1324 was significantly different among tumor stages and related to tumor purity in KIRC (Figure S9).
Discussion
KIRC is one of the most common and malignant subtypes of kidney tumors (3). However, because of limitations in early detection and a lack of sensitive biomarkers, patients with KIRC typically have a poor prognosis. Recently, ceRNA was identified as playing a potentially important role in the molecular regulatory function of tumorigenesis and growth, and in affecting differences in immune-infiltrating cells within multiple tumors. In this study, we investigated the role and connection of the ceRNA network and immune infiltration in KIRC to explore key prognostic biomarkers and potential regulatory mechanisms.
We established a ceRNA network consisting of 12 lncRNAs, 25 miRNAs and 136 mRNAs and compared the composition differences of immune cells related to KIRC. Based on the results, two prediction nomograms consisting of seven genes and two immune cells, respectively, were constructed to evaluate overall survival. A correlation analysis of the key factors in both nomograms indicated that both LINC00894 (lncRNA) and KIAA1324 (protein-coding RNA) were positively correlated with Tfh cells and negatively correlated with resting mast cells. Consequently, we infer that ceRNA regulatory function, in which both LINC00894 and KIAA1324 participate, and Tfh cells and resting mast cells may play a crucial role in the development and treatment of KIRC.
LINC00894 is lncRNA derived from a locus on the X chromosome that is differentially expressed in various cancers (18). Similar to our work, Liang (19) reported that LINC00894 was a tumor-special lncRNA in KIRC that was upregulated in tumor tissues and correlated with overall survival time. Epithelial-to-mesenchymal transition (EMT) regulating miRNAs regulate drug resistance and cancer progression and metastasis (20, 21). LINC00894 can inhibit the TGF-β2/ZEB1 signaling pathway by acting as the sponge of EMT-regulating miR-200, or can be upregulated by ERα activation and positively regulate the expression of miR-200a-3p and miR-200b-3p, which results in induction of the TGF-β2/ZEB1 signaling pathway to reduce the occurrence of drug resistance in breast cancers (18, 22, 23). In our study, LINC00894 might suppress the expression of hsa-miR-342-3p, which has an important role in progression, staging and metastasis of various cancers (24-27). In view of the upregulation of hsa-miR-342-3p contributing to gefitinib resistance via targeting CPA4 in non-small cell lung cancer (28), we speculate that LINC00894 is involved in the progression of cancer and may be a promising treatment target to eliminate drug resistance in patients with KIRC.
KIAA1324, also known as EIG121, encodes a 1013 amino acid transmembrane protein that shows high sequence conservation among different species. In type I endometrial cancer (estrogen-related), the expression of KIAA1324 is upregulated, but it is downregulated in type II endometrial cancer (not estrogen-related), which is more malignant than type I (29). KIAA1324 may be a novel suppressor, being epigenetically downregulated in gastric cancer (30); in addition, its high expression in endometrial and pancreatic carcinoma predicts improved survival (31, 32). However, increased expression of KIAA1324 in ovarian cancer is associated with a poor prognosis (33). These studies indicate that KIAA1324 may exhibit different biological functions in various cancers. Furthermore, recent mechanistic studies reveal that KIAA1324 is localized to endosome-lysosome compartments and associated with autophagy, which helps protect cells from cell death by upregulating the autophagy pathway under unfavorable conditions, such as starvation or chemotherapy (34). Therefore, we infer that overexpressed KIAA1324 helps cancer cells proliferate and resist chemotherapy in KIRC.
In this study, we explored five among 22 types of immune cells related to patient prognosis, and screened out the Tfh cells and resting mast cells to construct a Cox proportional hazards regression model, which evaluation of its AUC suggests has clear clinical value. Gou et al. (35) also explored the relationship between immune cells and renal cell carcinoma, and similar to our research, found Tfh cells were associated with poor prognosis and resting mast cells were positively associated with long term survival.
Under the action of myeloid professional antigen presenting cells (APC), naive CD4+ T cells could differentiate into Tfh cells and non-Tfh effector cells (such as Th1, Th2, or Th17 cells) (36). Therefore, Tfh cells are a specialized subset of CD4+ T cells that help B cells respond to antigens (37), which is, in turn, depends on the expression of transcriptional factor B cell lymphoma 6 (BCL6) (38). In the past decade, several studies have revealed a potential connection between Tfh cells and infiltrated tumors. Amé-Thomas et al. (39) demonstrated that Tfh cells protect follicular lymphoma malignant B cells from spontaneous and rituximab-induced apoptosis by upregulating CD40L and IL-4, which might lead to a worse prognosis. In non-small cell lung cancer (NSCLC) tumor tissues, the proportion of Tfh cells is increased, suggesting that Tfh cells might play an important role in estimating the poor survival of NSCLC patients (40). In addition, Tfh cell abundance is known to improve prognosis of patients with breast cancer or colorectal cancer (41, 42), which indicates the value of Tfh cells against tumors through immune responses.
Mast cells, both resting and activated, are large granulated innate immune cells found predominantly in sites between the host and its external environment, such as skin, respiratory mucosa, and the gastrointestinal tract (43). The activation of mast cells is related to various stimuli received by numerous receptors on the cell surface, such as pathogens, neuropeptides, cytokines, growth factors, toxins, basic compounds, complement, immune complexes, etc. (44). Recent studies have shown that the association between mast cells and cancer is two-sided, including both involvement in, and protection against, tumor progression. Xu et al. (45) constructed a prognosis-associated microRNA–mast cell network in lung adenocarcinoma, and found resting mast cell numbers were closely related to better overall survival and disease-free survival (DFS), while activated mast cells were related to poor survival. miR-30a and miR-550a are also involved in the regulation of immune response by acting as the promoter and suppressor of resting mast cells, respectively. In contrast, mast cells in hepatocellular carcinoma are mostly resting, inactivation of mast cells can promote immune escape, which is beneficial to tumor growth (46).
Our research inevitably has some limitations that should be recognized. First, our research data was obtained from public databases. The limited data means that the clinical pathological parameters are incomplete, which can lead to potential errors and biases. Secondly, we have not taken the heterogeneity of the immune microenvironment associated with the location of immune infiltration into consideration. In other words, the accuracy and applicability of the prediction models will be affected by the heterogeneity of tissue subtypes. In addition, the data series used in this study are all from Western countries, so applying the research conclusions to Asian countries has certain bias and may be inappropriate. As well, the lack of cell markers for resting mast cells leads to incomplete validation in our study. Lastly, this study is not a biological mechanism research, as it lacks experiments on the interaction mechanisms between ceRNA and immune cells. Despite limitations of the study, we did combine tumor infiltrating immune cells with a ceRNA network for analysis in KIRC, which not only constructed two normograms with clinical predictive value, but also identified a correlation between two key prognostic molecules and two prognosis-related immune cells, providing novel directions for future research.
Conclusions
Based on tumor-infiltrating immune cells and ceRNA networks, we constructed two nomograms to predict survival of KIRC patients. The high AUC values we obtained reflect the utility of this approach, which provides clinicians with more comprehensive clinical information through the models to improve individual management of KIRC patients. In addition, our study suggests that LINC00894, KIAA1324, Tfh cells and resting mast cells are significantly related to the prognosis of KIRC patients and we infer that LINC00894 and KIAA1324 might interact with Tfh cells and resting mast cells to affect the progression of KIRC. This study also provides new ideas for the pathogenesis and clinical treatment of KIRC.
Data Availability
The dataset (TCGA-KIRC) analyzed in this study is publicly available in TCGA (https://tcga-data.nci.nih.gov/tcga/).The other data used to support the findings of this study are available from the corresponding author upon request.
Author contribution
ZQ and NL conceived the idea and designed the study. LD conducted the data analysis and visualization. PW performed the data interpretation and literature collection. LD and NL wrote the manuscript. ZQ and NL contributed to the revision of the manuscript draft. All authors read and approved the final manuscript.
Data availability statement
The dataset (TCGA-KIRC) analyzed in this study is publicly available in TCGA (https://tcga-data.nci.nih.gov/tcga/).
Declaration of competing interest
We declare that no potential competing interests exist.
Acknowledgement
This work was supported by National Natural Science Foundation of China (No. 81872584) and (No. 81472941), National 863 Young Scientist Program (No. 2015AA020940), Key Scientific Research Project Plan of Henan Province (No. 21A330001), Natural Science Foundation of Guangdong Province (No. 2016A030313138), Key Projects of Guangzhou Science and Technology Program (No. 201704020056), Interdisciplinary Research for First-class Discipline Construction Project of Henan University (No. 2019YLXKJC04) and Scientific Research Project for University of Education Bureau of Guangzhou (No. 201831841).