Abstract
Background Diffuse large B-cell lymphoma (DLBCL) is the predominant type of malignant B-cell lymphoma. Although various treatments have been developed, their limited efficacy calls for further exploration of its characteristics.
Methods Datasets from the Gene Expression Omnibus (GEO) database were used to estimate the tumor purity of DLBCL. Survival analysis was employed to assess the prognosis of DLBCL patients. Immunohistochemistry was conducted to detect the key factors that influenced prognosis. Drug sensitivity prediction was performed to evaluate the value of the constructed model.
Results VCAN, CD3G and C1QB were identified as three key genes that impacted the outcome of DLBCL patients, both in the GEO datasets and in samples from our center. Among them, VCAN and CD3G+ T cells were correlated with favorable prognosis, and C1QB was correlated with worse prognosis. The ratios of CD68+ macrophages and of CD8+ T cells were each associated with better prognosis. In addition, the CD3G+ T cell ratio was significantly correlated with the CD68+ macrophage, CD4+ T cell and CD8+ T cell ratios, indicating that it could play an important role in anti-tumor immunity in DLBCL. The riskScore model constructed from the gene expression data of VCAN, C1QB and CD3G worked well in predicting prognosis and drug sensitivity.
Conclusion VCAN, CD3G and C1QB were three key genes that influenced the tumor purity of DLBCL, and could also exert certain impact on drug sensitivity and prognosis of DLBCL patients.
Introduction
The latest refined classification by the World Health Organization (WHO) categorizes large B-cell lymphoma as a heterogeneous group of B-cell lymphomas[1]. Diffuse large B-cell lymphoma (DLBCL) is the most prevalent type among them, accounting for around 30% of all non-Hodgkin lymphomas. DLBCL can be classified into three subtypes based on its immunohistochemical expression patterns: germinal center B-cell-like (GCB), activated B-cell-like (ABC), and unclassified[2]. After R-CHOP chemotherapy, about 60% of patients achieve long-term remission; however, approximately 30% of patients experience relapse, resulting in poor prognosis and a considerable number of deaths from refractory lymphoma[3]. Consequently, a detailed exploration of the characteristics of DLBCL is urgently needed to develop more effective therapies.
Solid tumor tissue comprises tumor cells and the surrounding stroma, which encompasses diverse types of matrix cells, immune cells, endothelial cells[4], etc. The tumor microenvironment (TME) is a complex and dynamic system that consists of the extracellular matrix and a variety of cellular components. Recent studies have unveiled multiple subgroups of immune cells within the microenvironment of DLBCL, including T cells, B cells, NK cells, monocytes/macrophages, dendritic cells, as well as the distribution of stromal cell components like fibroblasts and endothelial cells[5, 6]. Despite the relatively limited composition of the TME in DLBCL, its role in tumor proliferation and evasion of the immune system should not be disregarded. The interaction between tumors and the microenvironment is a vital factor that impacts the development and prognosis of B-cell lymphoma[7]. Nevertheless, the existing research on the influence of the TME on the prognosis of DLBCL patients is limited and lacks a consensus.
Moreover, the comprehensive investigation of non-immune cell components in the TME is still lacking. Previous research on stroma in DLBCL has predominantly indicated that a higher quantity of extracellular matrix is associated with a more favorable prognosis, while increased vascular density is associated with poorer prognosis[8]. Furthermore, higher stromal scores have been associated with an improved prognosis in DLBCL patients[9]. Additionally, a fibrotic tumor microenvironment has been correlated with a better prognosis after DLBCL chemotherapy and immunotherapy[10]. These research findings stem from computational analysis of stromal and immune scoring in gene databases and have not been experimentally validated as of yet.
Tumor purity quantifies the relative ratio of tumor cells to the surrounding stromal components in solid tumors, elucidating the dynamics between tumor cells and their microenvironment[11]. It can partly reflect the characteristics of TME, namely, a higher tumor purity indicates a lower abundance of stromal components in TME. Tumor purity is associated with patient prognosis, and the strength of this association varies across different tumor types[12-14]. Therefore, when investigating the influence of TME on the prognosis of DLBCL, it is crucial to analyze not only the immune cell components but also the significance of non-immune cell components.
This study utilized bioinformatic analysis to establish the relationship between immune and stromal components and the prognostic outcomes of DLBCL patients. We developed a novel immunohistochemical panel to assess prognostic outcomes and treatment sensitivity by detecting the expression of VCAN, CD3G, C1QB, CD68, CD4 and CD8 in both the TME and tumor cells of 190 DLBCL patients. We then explored their relationship with DLBCL clinicopathological features as well as overall survival (OS).
Materials and Methods
Data collection and tumor purity-related genes (TPGs) selection
The gene expression and clinical data of the GSE53786 and GSE32918 datasets were downloaded from the Gene Expression Omnibus (GEO) database. When one probe detected multiple genes, the first gene symbol was retained for the GSE53786 dataset. When one gene was detected by multiple probes, the average expression value of those probes in each dataset was calculated and used. Tumor purity was assessed with the ESTIMATE (Estimation of Stromal and Immune cells in Malignant Tumor tissues using Expression data) algorithm, and its correlation with gene expression was then analyzed. Genes with | r | ≥ 0.5 and p value < 0.05 were defined as the TPGs.
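The probe-level preprocessing and TPG screening described above can be sketched as follows. This is a minimal illustration in Python, not the R workflow used in the study. The probe names, gene names, expression values and purity values are invented toy data, the `///` separator for multi-gene probes is an assumption, the Spearman coefficient is assumed to be the correlation measure (as named in the statistical analysis section), and only the | r | threshold is applied; the p value filter is left out for brevity.

```python
from collections import defaultdict
from statistics import mean


def first_symbol(annotation, sep="///"):
    """Keep the first gene symbol when one probe maps to several genes."""
    return annotation.split(sep)[0].strip()


def collapse_probes(probe_expr, probe_to_gene):
    """Average, sample by sample, all probes that map to the same gene."""
    grouped = defaultdict(list)
    for probe, values in probe_expr.items():
        grouped[first_symbol(probe_to_gene[probe])].append(values)
    return {gene: [mean(col) for col in zip(*rows)] for gene, rows in grouped.items()}


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_tpgs(gene_expr, purity, r_cut=0.5):
    """Return {gene: r} for genes whose |r| with purity reaches r_cut."""
    selected = {}
    for gene, values in gene_expr.items():
        r = spearman(values, purity)
        if abs(r) >= r_cut:
            selected[gene] = r
    return selected


# Toy data (hypothetical): three probes, four samples.
probe_expr = {
    "p1": [1.0, 2.0, 3.0, 4.0],
    "p2": [3.0, 4.0, 5.0, 6.0],
    "p3": [5.0, 1.0, 4.0, 2.0],
}
probe_to_gene = {"p1": "GENE_A /// GENE_X", "p2": "GENE_A", "p3": "GENE_B"}
purity = [0.6, 0.5, 0.4, 0.3]

gene_expr = collapse_probes(probe_expr, probe_to_gene)
tpgs = select_tpgs(gene_expr, purity)
```

In this toy example GENE_A is kept because its averaged expression falls monotonically as purity falls in reverse order, while GENE_B is dropped because its correlation is below the threshold.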
TPGs function analysis
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed to analyze the biological processes, cellular components, molecular functions and pathways related to the TPGs. Statistical significance was defined as p.adjust < 0.05. Protein-protein interaction (PPI) analysis was used to investigate the interactions among TPGs, and those with an interaction confidence score greater than 0.90 on the STRING platform (version 11.5) were selected to establish an interaction network with Cytoscape software (version 3.8.2).
Prognostic model
The prognostic model was constructed with the “survival” package in R (version 4.1.3). The genes included in the model were selected from the TPGs that were both prognostic and PPI hub genes, using the “step” function to optimize the model. The prognostic model was represented by
Clinical specimens and follow-up
A total of 190 patients from the Cancer Hospital, Chinese Academy of Medical Sciences (the CHCAMS cohort) were enrolled in this study (Supplementary table 1). All patients underwent surgery or biopsy between September 2010 and September 2020, and standard follow-up was then carried out until March 2023. Overall survival (OS) was defined as the interval between the operation and death or the last follow-up. The specimens from the CHCAMS cohort were used for the immunohistochemistry assay. The study was designed according to the Declaration of Helsinki and approved by the institutional ethics committee of the Cancer Hospital, Chinese Academy of Medical Sciences. Informed consent was obtained from all patients.
IHC
Paraffin-embedded DLBCL tissues from the CHCAMS cohort were used for immunohistochemistry (IHC). After deparaffinization and hydration, antigen retrieval was performed with the heat-induced method. Sections were incubated at 4°C overnight with primary antibodies against VCAN (AB177480, 1:100, Abcam, USA), CD3G (AB134096, 1:1000, Abcam, USA), C1QB (AB92508, 1:50, Abcam, USA), CD68 (303565, 1:1000, Abcam, USA), CD4 (ZM-0418, ZSGB-BIO, China) and CD8 (ZA-0508, ZSGB-BIO, China). Sections were then washed with TBS-T buffer, incubated with secondary antibody, and finally stained with DAB. Quantitative analysis of the slides was conducted with QuPath (version 0.4.3). VCAN and C1QB were assessed by H-score, and CD3G, CD68, CD4 and CD8 were assessed as the ratio of the corresponding positive cells among all cells.
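The two IHC readouts described above can be illustrated with a short Python sketch. It assumes the conventional H-score definition (1 × % weakly stained cells + 2 × % moderately stained cells + 3 × % strongly stained cells, giving a range of 0 to 300); the exact intensity thresholds configured in QuPath are not stated in the text, and the function names and input values here are hypothetical.

```python
def h_score(pct_weak, pct_moderate, pct_strong):
    """Conventional H-score: intensity-weighted sum of percentages (0-300)."""
    if min(pct_weak, pct_moderate, pct_strong) < 0:
        raise ValueError("percentages must be non-negative")
    if pct_weak + pct_moderate + pct_strong > 100:
        raise ValueError("percentages of stained cells cannot exceed 100 in total")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


def positive_ratio(n_positive, n_total):
    """Percentage of marker-positive cells among all detected cells."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_positive / n_total


# Hypothetical slide: 10% weak, 20% moderate, 30% strong staining.
example_h = h_score(10, 20, 30)

# Hypothetical slide: 25 marker-positive cells among 1000 detected cells.
example_ratio = positive_ratio(25, 1000)
```

A continuous H-score suits markers such as VCAN and C1QB, where staining intensity varies, whereas a positive-cell ratio suits cell-type markers such as CD3G, CD68, CD4 and CD8, where the question is how many cells of that type are present.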
Drug sensitivity prediction
Drug sensitivity prediction was conducted with the “oncoPredict” package in R (version 4.1.3). The drug sensitivity data were collected from Genomics of Drug Sensitivity in Cancer (GDSC). The drugs analyzed in this study were selected according to clinical practice or clinical trials retrieved from PubMed.
Statistical analysis
Data in this study are shown as mean ± SEM. Correlation between two variables was determined with Spearman analysis. Kaplan–Meier (K–M) curves and the log-rank test were used for survival analysis. The cut-offs for survival analysis were provided by X-tile. Independent risk factors were analyzed with Cox regression. Receiver operating characteristic (ROC) curves were used to test the efficacy of the prognostic model. Differences in clinicopathological characteristics were analyzed with the χ2 test, Fisher’s exact test or the Wilcoxon rank sum test. Drug sensitivity scores were compared with the Wilcoxon rank sum test. In this study, p < 0.05 was considered statistically significant.
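As an illustration of the survival analysis named above, the following Python sketch implements the Kaplan–Meier product-limit estimator on invented follow-up data. It is a minimal teaching example under stated assumptions, not the software used in the study: times are in arbitrary units, an event flag of 1 means death was observed and 0 means the patient was censored, and the log-rank test and confidence intervals are not included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up of five patients.
toy_times = [5, 8, 8, 12, 20]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

For the toy data the estimate steps down to about 0.8, 0.6 and 0.3 at times 5, 8 and 12; the patient censored at time 8 contributes to the risk set at that time but not afterwards.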
Results
Tumor purity related genes were correlated with extracellular matrix organization and immune response
Based on the GSE53786 dataset, we first assessed the tumor purity of DLBCL, which ranged from 17.2% to 67.4% (Figure 1A). To screen out the TPGs, we then analyzed the correlation between gene expression and tumor purity. According to the thresholds mentioned above, 642 genes were identified as TPGs, among which 31 genes were positively correlated with tumor purity, while 611 genes were negatively correlated with it (Figure 1B). In addition, tumor purity influenced the prognosis of DLBCL patients: patients with high tumor purity had a lower OS rate than those with low tumor purity (Figure 1C, p = 0.025).
Next, we performed GO and KEGG enrichment analyses to explore the functions and signaling pathways in which these TPGs were involved. The TPGs were mainly associated with extracellular matrix organization and immune response (Figure 1C, 1D). The enrichment results not only supported the relationship between these genes and tumor purity, but also laid a solid foundation for the subsequent analyses.
A prognostic model was constructed with three TPGs
With the 642 TPGs, we performed PPI analysis to investigate their interactions and identify the hub genes (Figure 2A). The TPGs that had five or more interacting genes are shown in Figure 2B and were defined as hub genes. Then, we performed univariate Cox regression analysis to identify the TPGs that were associated with the prognosis of DLBCL patients, and 103 genes were identified (Figure 2C). Interestingly, most of these TPGs were correlated with good outcome (HR < 1), and only six genes were associated with poor outcome (HR > 1). By intersecting the hub genes with the prognostic genes, we found nine genes (LUM, VCAN, YAP1, COL5A2, SDC2, TWIST1, CD3G, C1QB and C3), indicating that they played an active role in modulating tumor purity as well as influencing the prognosis of DLBCL patients.
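The hub-gene rule (five or more interacting genes) and the intersection with prognostic genes can be sketched in Python as below. The edge list and gene labels are placeholders invented for illustration; in the study the edges came from STRING and the prognostic genes from univariate Cox regression.

```python
from collections import defaultdict


def hub_genes(edges, min_partners=5):
    """Genes with at least `min_partners` distinct interaction partners."""
    partners = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            partners[a].add(b)
            partners[b].add(a)
    return {gene for gene, linked in partners.items() if len(linked) >= min_partners}


# Hypothetical PPI edges: G1 interacts with G2..G6, plus one extra edge.
toy_edges = [("G1", "G%d" % i) for i in range(2, 7)] + [("G2", "G3")]
# Hypothetical set of genes found prognostic by univariate analysis.
toy_prognostic = {"G1", "G4"}

toy_hubs = hub_genes(toy_edges)
toy_candidates = toy_hubs & toy_prognostic
```

Using sets of partners rather than raw edge counts means that a duplicated edge does not inflate a gene's degree.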
After ascertaining the key genes, we constructed a prognostic model with them. The model was built by Cox regression, and the three selected genes (VCAN, CD3G, C1QB) and their parameters, including the coefficient, HR and 95%CI of the HR, are shown in Figure 3A. VCAN and CD3G were correlated with good prognosis and C1QB was correlated with poor prognosis. All patients were divided into high- and low-risk groups according to the median riskScore (Figure 3B). As expected, the high-risk group had a worse prognosis than the low-risk group (Figure 3C). In addition, the three genes were differentially expressed between the two groups, with VCAN and CD3G highly expressed in the low-risk group and C1QB highly expressed in the high-risk group, which was consistent with their coefficients (Figure 3D). To appraise the efficacy of this prognostic model, we conducted survival analysis and ROC analysis. The high-risk group had a lower OS rate than the low-risk group (Figure 3E, p < 0.001), and the areas under the curve (AUC) for the 1-year, 3-year and 5-year ROC curves were 0.73, 0.77 and 0.77, respectively (Figure 3F). As with tumor purity, a high riskScore indicated a poor outcome, which was consistent with the positive correlation between tumor purity and riskScore (Figure 3G). This TPG signature model thus showed satisfactory prognostic efficacy.
When we applied this model to the GSE32918 dataset, it still performed well, and the results were in accordance with those in the GSE53786 dataset (Figure 4A, 4B; Supplementary Figure 2). Next, we analyzed the relationship between riskScore and the clinicopathological characteristics provided in the GSE53786 dataset. The high-risk group had more ABC type DLBCL, while the low-risk group had more GCB type DLBCL (Figure 4C). In addition, the high-risk group displayed a higher lactic dehydrogenase (LDH) ratio (Figure 4F). However, Eastern Cooperative Oncology Group (ECOG) performance status and stage were not associated with the riskScore (Figure 4D, 4E). Nevertheless, the high-risk group had more Stage III and Stage IV patients, and fewer Stage I patients, than the low-risk group. Finally, we employed univariate and multivariate analyses to explore whether riskScore was an independent prognostic factor for DLBCL patients. As expected, the riskScore was associated with poor prognosis (Figure 4G, p < 0.001, HR = 1.545, 95%CI 1.284–1.861) and was an independent prognostic factor (Figure 4H, p = 0.002, HR = 1.474, 95%CI 1.156–1.879).
The prognostic value of VCAN, CD3G and C1QB was validated by IHC assay
To further validate the prognostic value of VCAN, CD3G and C1QB, we detected the expression of these genes in the CHCAMS cohort by IHC. For VCAN, the patients were divided into high and low groups according to the H-score cut-off (275.42) provided by X-tile. The survival analysis showed that patients with high expression of VCAN had a higher OS rate (Figure 5A, p = 0.003). For CD3G, a previous study revealed that it was a component of the T cell receptor complex, so it could be regarded as a marker of T cells. Therefore, we assessed the expression level of CD3G as the CD3G+ T cell ratio and divided patients by the cut-off (2.5%). The survival analysis revealed that patients with high CD3G+ T cell infiltration showed favorable prognosis (Figure 5B, p < 0.001). For C1QB, the patients in the high expression group (cut-off = 82.41) showed adverse prognosis (Figure 5C, p = 0.015). Although protein-level detection was not suitable for building a prognostic model, because of the different assessment methods and the lack of coefficients, these results were in accordance with those from the GEO datasets and supported the prognostic value of VCAN, CD3G and C1QB.
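The riskScore calculation and median split described above can be sketched as follows. This assumes the usual Cox linear predictor, that is, the sum of each gene's coefficient multiplied by its expression value. The coefficients below are hypothetical placeholders chosen only to have the signs reported in the text (negative for VCAN and CD3G, positive for C1QB); the fitted values are those in Figure 3A and are not reproduced here. Patient identifiers and expression values are invented, and assigning scores equal to the median to the low-risk group is also an assumption.

```python
from statistics import median

# Hypothetical coefficients: signs follow the text, magnitudes are made up.
TOY_COEFS = {"VCAN": -0.3, "CD3G": -0.4, "C1QB": 0.5}


def risk_score(expression, coefs):
    """Cox-style linear predictor: sum of coefficient x expression."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median."""
    cut = median(scores.values())
    return {pid: ("high" if s > cut else "low") for pid, s in scores.items()}


# Hypothetical expression values for four patients.
toy_patients = {
    "P1": {"VCAN": 8.0, "CD3G": 7.0, "C1QB": 5.0},
    "P2": {"VCAN": 5.0, "CD3G": 4.0, "C1QB": 9.0},
    "P3": {"VCAN": 6.0, "CD3G": 6.0, "C1QB": 6.0},
    "P4": {"VCAN": 4.0, "CD3G": 5.0, "C1QB": 8.0},
}

toy_scores = {pid: risk_score(expr, TOY_COEFS) for pid, expr in toy_patients.items()}
toy_groups = split_by_median(toy_scores)
```

With these placeholder values, the patients with low VCAN and CD3G and high C1QB (P2 and P4) fall in the high-risk group, mirroring the expression pattern reported in Figure 3D.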
Given that these genes could potentially influence the tumor purity of DLBCL, we then analyzed the relationship between them and CD68+ macrophages, CD4+ T cells and CD8+ T cells. As shown in Supplementary figure 3A, CD68+ macrophages [(17.75±1.05) %] accounted for a larger proportion of cells than CD4+ T cells [(0.68±0.20) %] and CD8+ T cells [(6.69±0.56) %] (p < 0.001, Kruskal-Wallis test and Dunn’s test). In the survival analysis of these three types of immune cells, CD68+ macrophages and CD8+ T cells were associated with better prognosis (Figure 5D–E, p = 0.029 and p = 0.002), while the association for CD4+ T cells did not reach the significance threshold (Figure 5F, p = 0.053). The XCELL and QUANTISEQ algorithms revealed that M2 macrophages accounted for a larger proportion than M1 macrophages in GSE53786 and GSE32918 (Supplementary figure 3L-O). In addition, the ratio of CD3G+ T cells was positively correlated with those of CD68+ macrophages, CD8+ T cells and CD4+ T cells, the C1QB expression level was positively correlated with the CD8+ T cell ratio, and the VCAN expression level was positively correlated with the CD8+ T cell ratio (Figure 5G).
In addition to the above analyses, we also explored the relationship between these three genes and the proliferation and location of DLBCL. The CD3G+ T cell ratio was higher in DLBCL originating from the groin and testis, and VCAN showed higher expression in DLBCL originating from lymph nodes (Figure 5H–I, p < 0.05, p < 0.01, Supplementary figure 3B–K).
The TPGs signature model could also predict the drug sensitivity of DLBCL patients
To assess the ability of the model described above to predict drug sensitivity, we performed the prediction with the “oncoPredict” package in R. Fifteen drugs (Supplementary table 2) that are included in the GDSC and are used in clinical practice or under clinical trials (searched on PubMed) were included in this prediction analysis.
As shown in Figure 6A (prediction for GSE53786), patients in the high-risk group could be sensitive to Carmustine, Cytarabine, Oxaliplatin, Vincristine, Vorinostat and Bortezomib, but no drug was predicted to work better in the low-risk group. In GSE32918 (Figure 6B), Carmustine, Cytarabine, Oxaliplatin, Vorinostat, Afuresertib, Bortezomib, Ibrutinib and Tamoxifen could work better in the high-risk group, and Vincristine (sensitivity score: low-risk vs high-risk = 0.219±0.026 vs 0.223±0.031) could work better in the low-risk group. The discrepancy between the predictions in the two datasets might be due to differences in samples and sequencing platforms. However, the intersection of the drugs to which high-risk patients in both datasets could be sensitive revealed that Carmustine, Cytarabine, Oxaliplatin, Vorinostat and Bortezomib could be reliable candidates for treating high-risk patients based on the three-TPG signature prognostic model (Supplementary table 2).
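The intersection step can be reproduced directly from the drug lists given in the paragraph above. The short Python sketch below uses only those lists; the variable names are illustrative.

```python
# Drugs predicted to work better in the high-risk group, as listed in the text.
gse53786_high_risk = [
    "Carmustine", "Cytarabine", "Oxaliplatin",
    "Vincristine", "Vorinostat", "Bortezomib",
]
gse32918_high_risk = [
    "Carmustine", "Cytarabine", "Oxaliplatin", "Vorinostat",
    "Afuresertib", "Bortezomib", "Ibrutinib", "Tamoxifen",
]

# Keep the order of the first list while testing membership in the second.
second = set(gse32918_high_risk)
shared_candidates = [drug for drug in gse53786_high_risk if drug in second]

# Drugs supported by only one of the two datasets.
dataset_specific = sorted(set(gse53786_high_risk) ^ second)
```

Vincristine drops out of the shared list because it was predicted to favor the high-risk group in GSE53786 but the low-risk group in GSE32918.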
Discussion
In this study, bioinformatics techniques were employed to identify three genes (VCAN, CD3G, C1QB) that exhibit associations with prognosis in both immune and stromal environments, thereby revealing their relationship with the prognosis of DLBCL patients. The findings indicate that higher expression of VCAN, increased infiltration of CD3G+ T cells, and decreased expression of C1QB are correlated with favorable prognostic outcomes. Conversely, a lower infiltration of CD68+ macrophages and lower infiltration of CD8+ T cells are associated with poorer prognosis. Furthermore, we investigated the relationship between risk genes related to tumor purity and treatment sensitivity and established a list of possible drugs that might be helpful for enhancing outcomes.
Previous studies have extensively investigated the VCAN gene in relation to tumorigenesis and metastasis[15]. VCAN, also known as versican, is a crucial component of the extracellular matrix[16] and exists in several isoforms[17]. Research has shown that VCAN plays a multifaceted role in the TME depending on the cell type expressing it. When expressed by myeloid cells, VCAN induces an anti-inflammatory and immunosuppressive microenvironment. Conversely, its expression by stromal cells typically leads to a pro-inflammatory response[18]. In gastric cancer, high expression of VCAN has been associated with increased infiltration of fibroblasts, significant enrichment of stromal-associated signaling pathways and poor prognosis[19]. In hepatocellular carcinoma, VCAN exhibits a strong association with immune checkpoint gene expression[20]. Despite these findings in other tumor types, the role of VCAN in DLBCL has not yet been explored. Our study reveals that high expression of VCAN is associated with a more favorable prognosis in DLBCL, which suggests that VCAN may have different functions in different tumor types. One possible mechanism is that VCAN overexpression in DLBCL also affects tumor cell proliferation. A study has shown that overexpression of VCAN V1 has an inhibitory effect on cell proliferation, partly due to its promotion of activation-induced cell death in lymphoid cell lines[17]. Hence, high expression of VCAN in DLBCL could affect not only the TME but also tumor cell proliferation, suggesting a potential mechanism for the observed favorable prognosis.
C1q is synthesized in the tumor microenvironment and functions as an extracellular matrix protein, and C1QB is a component of C1q[21]. Previous studies have provided insights into the diverse roles of C1q in cancer progression. The majority of these results, as observed in non-small cell lung carcinoma and gastric cancer, indicate that high C1q expression in the TME is associated with a poor prognosis[22-24]. Additionally, C1QB has been found to affect the TME and is positively associated with the infiltration levels of CD8+ T cells, as well as M1 and M2 macrophages, in osteosarcoma[25]. Moreover, C1QB expression shows a positive correlation with predictive biomarkers for immunotherapy, such as PD-L1 expression and CD8+ T cell infiltration[24]. Furthermore, in malignant melanoma, C1QB promotes proliferation, migration and invasion while inhibiting cell apoptosis[26], and the high-expression group exhibits significant enrichment of genes related to immunity and apoptosis[21]. In our study, we found that high expression of C1QB in DLBCL was associated with a worse prognosis and positively correlated with CD8+ T cell infiltration. Based on these findings, we propose that the function of C1QB in DLBCL might resemble its functions in other tumor types, particularly the recruitment and subsequent deactivation of CD8+ T cells within the TME through the induction of immune checkpoint effects. These results shed light on the intricate role of C1QB in the TME and its potential significance as a prognostic marker in DLBCL.
CD3G is a member of the TCR/CD3 complex and is primarily expressed in lymphocyte subgroups. It plays a crucial role in initiating the activation of T cells[27] and is also involved in coupling antigen recognition[28]. It has been reported to be associated with long-term OS and good prognosis in breast invasive carcinoma[29] as well as in head and neck squamous cell carcinoma[30]. However, its role in DLBCL has not been fully explored. In our study, high infiltration of CD3G+ T cells was correlated with good prognosis, and the infiltration of CD3G+ T cells was positively related to the infiltration of CD8+, CD4+ and CD68+ cells. This indicates that CD3G+ T cells in DLBCL may enhance the tumor antigen recognition process and stimulate the infiltration of immune cells, leading to a greater abundance of immune cells in the TME. The presence of CD3G+ T cells in the TME may contribute to a favorable prognosis by facilitating the activation of immune responses against tumor cells.
Macrophages play a crucial role in the TME, and CD68 is a surface marker specific to macrophages. Macrophages can be roughly classified into two types based on their functional features: M1 or M2. M1 macrophages exert anti-tumor effects, whereas M2 macrophages promote tumor growth and progression in the TME[31]. A previous study found that low infiltration of CD68+ macrophages was associated with an inferior prognosis[32]. Our study yielded similar results, revealing a noteworthy correlation between a high proportion of CD68+ macrophages in the TME and improved prognosis. Additionally, by analyzing the datasets, we observed a higher proportion of M1 macrophages infiltration compared to M2 macrophages. This suggests that, within our DLBCL cohort, these macrophages may also exhibit the M1 phenotype and consequently play a protective role against tumor progression.
CD8 is widely recognized as a marker of CD8+ T cells, also known as cytotoxic T cells[33]. These cells are crucial for the immune response against tumors. However, in DLBCL, CD8+ T cells exhibit elevated levels of inhibitory molecules on their surface, such as PD-1, PD-L1 and TIM3. High expression of TIM3, an inhibitory immune checkpoint receptor, on CD8+ T cells has been associated with tumor progression and poor outcomes[34, 35]. These inhibitory molecules may impair the function of CD8+ T cells and hinder their anti-tumor activity. Surprisingly, our study demonstrates a correlation between the infiltration of CD8+ T cells and favorable prognosis in DLBCL. We hypothesize that the high expression of VCAN observed in our study might create a suppressive environment for PD-1+ CD8+ T cells[18, 36]. Intriguingly, our study revealed statistically significant correlations of both VCAN expression and C1QB expression with CD8+ T cell infiltration. VCAN has the potential to modulate immune infiltration by reducing the immunosuppressive phenotype of immune cells[37], thus enabling a more efficient anti-tumor response. This aspect is worthy of further consideration.
Taken together, our findings underscore the significant roles of VCAN, CD3G, C1QB, which influence both the TME and the behavior of tumor cells. The interaction between each component and the TME is rather complicated. To fully comprehend the underlying mechanisms and identify potential therapeutic targets in DLBCL, further investigation is required.
However, this study still has several limitations that should be addressed. Firstly, the patients included in this study were from a single center, which may introduce biases into the results. Although we made efforts to minimize these biases, it is inevitable that some may persist. Secondly, we hypothesized that VCAN, CD3G and C1QB could serve as continuous prognostic parameters, thereby eliminating the need for a cut-off. However, the methodology used in this study, which relied on IHC staining to assess protein expression levels, may have limitations. While IHC is a widely used technique, additional validation is needed to confirm the prognostic value of VCAN, CD3G and C1QB in DLBCL. Furthermore, due to the potential variability in interpreting IHC results across different centers, standardized coefficients and a formula have not been established to calculate a final prognostic index for patients with DLBCL. Developing a standardized approach would help ensure consistent and accurate interpretation of IHC results. To address these limitations and expand upon our findings, future studies should incorporate a diverse range of patients from multiple centers. Additionally, it is crucial to employ rigorous experimental techniques to confirm the prognostic significance of VCAN, CD3G and C1QB in DLBCL.
Data Availability
The GEO datasets can be obtained from https://www.ncbi.nlm.nih.gov/gds/?term=. Due to the nature of this study, the data of the CHCAMS cohort can be obtained from Dr. Wenting Huang upon reasonable request.
Author contributions
Conceptualization: Wenting Huang, Ning Huang, Zhenbang Ye; Data Curation: Zhenbang Ye, Ning Huang; Formal Analysis: Zhenbang Ye, Ning Huang; Investigation: Zhenbang Ye; Methodology: Zhenbang Ye, Ning Huang, Wenting Huang; Resources: Wenting Huang, Yongliang Fu; Supervision: Wenting Huang; Visualization: Zhenbang Ye, Ning Huang; Validation: Ning Huang; Writing-Original Draft: Zhenbang Ye, Ning Huang; Writing-Review & Editing: Wenting Huang, Yongliang Fu, Rongle Tian
Funding
This work was supported by the Shenzhen High-level Hospital Construction Fund. The funders had no role in study design, data collection and analysis, interpretation of data, or preparation of the manuscript.
Conflicts of interest
The authors made no disclosures.
Acknowledgement
Not applicable.