Introduction

Allogeneic hematopoietic stem cell transplantation (allo-HSCT) remains one of the most effective therapies for patients with high-risk hematologic malignancies such as acute myeloid leukemia (AML) and myelodysplastic syndromes (MDS). Allo-HSCT using a matched related donor (MRD) or an HLA-matched unrelated donor (MUD) is the treatment of choice for patients with available matched donors (MD). Umbilical cord blood (UCB) has emerged as an alternative source of hematopoietic stem cells for patients lacking an HLA-matched donor, and has a lower risk of disease relapse in high-risk leukemia patients.1 The curative effect of allo-HSCT is largely mediated by the graft-versus-leukemia (GVL) effect, but is partially offset by transplant-related mortality and loss of quality of life caused by graft-versus-host disease (GVHD).2 Previous studies indicated that patients with GVHD, especially chronic GVHD, are likely to have a lower risk of relapse, but increased non-relapse mortality and relatively poor overall survival.2, 3

Both GVL and GVHD are mediated by the engrafted T lymphocytes targeting leukemic cells and host tissues, respectively.2, 3, 4 Before transplantation, HSCT recipients undergo conditioning regimens to eradicate their malignant cells as well as host immune cells that may cause graft rejection. However, since these conditioning regimens include cytotoxic agents and radiation, they cause tissue injury which enhances the induction and secretion of proinflammatory cytokines, and subsequently trigger activation of donor-derived T-cells. These activated donor-derived T-cells that recognize Ag(s) presented by HLA molecules appear to mediate cytotoxicity against the leukemic cells as well as certain host cells.4, 5 Several studies established the role of activated donor-derived cytotoxic T-cells in the pathophysiology of acute GVHD.6, 7, 8 In addition, lower levels of regulatory T-cells have also been suggested to be associated with acute and chronic GVHD.9, 10

T-cell receptor (TCR), a heterodimer of TCR alpha and beta chains, or TCR gamma and delta chains, is present on the surface of T-cells. More than 95% of T-cells in our body express heterodimers of TCR alpha and beta chains and the remaining 5% of T-cells express TCR gamma and delta chains.11 The TCR alpha gene is reported to consist of 70 variable (V) exons, 61 joining (J) exons and one constant (C) exon, while the TCR beta gene possesses 60 V exons, 2 diversity (D) exons, 15 J exons and two C exons. In addition, to greatly increase the number of Ags that T-cells can recognize, TCR genes undergo somatic V(D)J recombination that increases the diversity of T-cell repertoire extensively. During the recombination, nucleotides are deleted from V, (D), J exons and/or inserted between a V–J junction for alpha, and V–D and D–J junctions for beta.11, 12, 13 Hence, the complementarity-determining region 3 (CDR3) of TCRs increases the diversity through the somatic recombination and prepares the immune system to recognize the huge variety of Ags we may be exposed to.11
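As a rough illustration of the scale involved, the exon counts quoted above can be multiplied to give the number of germline V(D)J combinations before any junctional insertion or deletion. This is a minimal sketch that assumes every listed exon can take part in recombination and that alpha and beta chains pair freely; neither assumption is stated in the text, and the function and variable names are illustrative only.

```python
from math import prod

# Exon counts as quoted in the text (not every exon is necessarily functional).
TCRA_SEGMENTS = {"V": 70, "J": 61}
TCRB_SEGMENTS = {"V": 60, "D": 2, "J": 15}


def germline_combinations(segments):
    """Number of ways to pick one exon from each segment class."""
    return prod(segments.values())


alpha = germline_combinations(TCRA_SEGMENTS)
beta = germline_combinations(TCRB_SEGMENTS)
# Assumption: any alpha chain may pair with any beta chain.
paired = alpha * beta
print(alpha, beta, paired)
```

Under these assumptions the germline combinations alone number in the millions, and junctional changes within CDR3 add diversity well beyond that.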

Although several studies have examined the TCR repertoire in patients undergoing allo-HSCT, the majority of these studies were limited by their approaches.14, 15 The use of DNA rather than RNA as the starting material in some studies limited their ability to assess the functional T-cell population. Furthermore, the TCR-specific multiplex PCR-based assays used in these reports restricted the analyses to known TCR variable exons and did not allow a comprehensive, unbiased assessment of the TCR repertoire or the identification of novel exons that have not been deposited in public genomic sequence databases.14, 15, 16 Therefore, in order to perform a comprehensive analysis of the TCR repertoire following allo-HSCT, and to examine the kinetics of the TCR profiles and how they relate to the development of GVL or GVHD, we applied an mRNA-based 5′ RACE (rapid amplification of cDNA ends) PCR method to amplify all possible combinations of TCRA and TCRB using a single set of primers for each of TCRA and TCRB. This was followed by next-generation sequencing (NGS), which enabled us to obtain millions of TCR sequences in one experiment and to unravel the complexity of T-cell diversity.15, 16 Here, we describe the detailed TCR repertoire recovery in allo-HSCT patients receiving MD or haploidentical cord (haplo-cord) transplants. We also examined how TCR repertoire and diversity vary with the source of donor stem cells and how they correlate with clinical variables (relapse or GVHD). TCR repertoire profiling of HSCT patients may lead to a better understanding of the immunological response after HSCT, which could help to modulate the GVL-GVHD balance and improve patient outcomes.

Materials and methods

Patients

A total of 21 transplant patients were examined in this study. Twenty patients had AML and one had MDS. The median age of the patients was 56 years (range: 26–73). The characteristics of the patients are summarized in Table 1. Twelve patients received a T-cell-depleted MD transplant, while nine patients received a haplo-cord transplant. The pre-transplant conditioning regimen consisted of either fludarabine/melphalan/campath or clofarabine/melphalan/campath in patients who received an MD transplant, and fludarabine/melphalan/anti-thymocyte globulin in patients who received a haplo-cord transplant. Haplo-cord transplant patients had high-risk AML and were enrolled on a clinical trial at the University of Chicago. Three patients had active disease at the time of transplant. The median UCB dose was 1.6 × 10⁷ total nucleated cells (TNC)/kg, with HLA cord matching of 4/6 in two patients and 5/6 or 6/6 in seven patients. Among the transplant patients, 11 were diagnosed with acute GVHD.17 Peripheral blood mononuclear cells (PBMCs) were obtained from the patients before transplant and at different time points after transplantation. The study was approved by the institutional review board and all patients gave written informed consent on University of Chicago Institutional Review Board approved treatment trials that included sequential correlative samples (IRB #14736, #14341B and #11300A).

Table 1 Characteristics of patients

RNA isolation and PCR amplification

Total RNA was isolated from PBMCs (~5 × 10⁶ cells) using the RNeasy mini kit (Qiagen, Valencia, CA, USA) and treated with DNase to remove genomic DNA contamination. cDNA was then synthesized using the SMART cDNA library construction kit (Clontech Laboratories, Mountain View, CA, USA), according to the manufacturer’s instructions. A common adaptor (SMART IV oligonucleotide) was ligated to the 5′-end of cDNA. Two PCRs were then performed to amplify all possible combinations of TCR alpha and beta from cDNA, using one common forward primer, which was designed based on the sequence of the SMART IV adaptor and included the Ion truncated P1 adaptor sequence (5'-CCTCTCTATGGGCAGTCGGTGATTATCAACGCAGAGTGGCCAT-3'), and two reverse primers, each specific to the constant region of TCR alpha (5'-CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGGGTCAGGGTTCTGGATA-3') or beta (5'-CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGATGGCTCAAACACAGC-3') and including the Ion A1 adaptor sequence.18 The PCR conditions were 94 °C for 3 min, followed by 40 cycles of 94 °C for 30 s, 65 °C for 30 s and 68 °C for 1 min.
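The layout of the two reverse primers can be checked with a short script. The sketch below treats the longest prefix shared by both primers as the adaptor portion and the remainder as the gene-specific portion; that split is an assumption made for illustration, and the helper names are hypothetical.

```python
from os.path import commonprefix

# Reverse primers exactly as given in the text (5' to 3').
TCRA_REVERSE = "CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGGGTCAGGGTTCTGGATA"
TCRB_REVERSE = "CCATCTCATCCCTGCGTGTCTCCGACTCAGTCTGATGGCTCAAACACAGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Assumption: the shared 5' prefix corresponds to the sequencing adaptor.
shared_adaptor = commonprefix([TCRA_REVERSE, TCRB_REVERSE])
alpha_specific = TCRA_REVERSE[len(shared_adaptor):]
beta_specific = TCRB_REVERSE[len(shared_adaptor):]

print(len(shared_adaptor), alpha_specific, beta_specific)
# Sense-strand sequence that the alpha-specific portion would anneal to.
print(reverse_complement(alpha_specific))
```

Because the primers are reverse primers, the reverse complement of each gene-specific portion is the sequence expected on the sense strand of the constant region.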

Template preparation and sequencing

The libraries were further amplified onto the proprietary Ion Sphere particles (ISP) by emulsion PCR using Ion OneTouch2 system and the Ion PGM Template OT2 400 kit (Life Technologies, Carlsbad, CA, USA), according to the manufacturer’s instructions. ISPs with templates were then enriched using the Ion OneTouch ES instrument (Life Technologies). Finally, the fully prepared products were subjected to sequencing using the Ion PGM Sequencer (Life Technologies), a 400 bp reading kit and Ion 318 Chip v2.

Sequence analysis

Sequencing reads in FASTQ files were mapped to the reference sequences derived from IMGT/GENE-DB (http://www.imgt.org)19, 20 using the Bowtie2 aligner (Version 2.1.0).21 The V, D and J genes were designated according to the nomenclature provided by the international ImMunoGeneTics information system (IMGT). A CDR3 region was defined by identifying the second conserved cysteine encoded in the 3′ portion of the V segment and the conserved phenylalanine encoded in the 5′ portion of the J segment, which together form the boundaries of the CDR3. The nucleotide sequence between the conserved TCR V cysteine and the TCR J phenylalanine was extracted to determine the amino acid sequence of the CDR3 region.
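The boundary rule described above can be sketched in a few lines. This is a simplified illustration, not the algorithm used in the study: it assumes the read is already in frame, locates the J-side phenylalanine through the F-G-X-G motif commonly found in TCR J segments, and takes the last cysteine upstream of that motif as the V-side boundary, whereas the study located both residues from the aligned V and J reference sequences. All function names are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, built in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(nt, frame=0):
    """Translate a nucleotide string; unknown codons become 'X'."""
    nt = nt.upper()
    codons = (nt[i:i + 3] for i in range(frame, len(nt) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def extract_cdr3(nt, frame=0):
    """Return the CDR3 amino acid sequence (Cys to Phe), or None if not found.

    Assumption: the J phenylalanine is the F of the first F-G-X-G motif and
    the V cysteine is the last C upstream of that motif.
    """
    protein = translate(nt, frame)
    j_motif = re.search(r"FG.G", protein)
    if j_motif is None:
        return None
    cys = protein.rfind("C", 0, j_motif.start())
    if cys == -1:
        return None
    return protein[cys:j_motif.start() + 1]
```

A production pipeline would also need to handle frame detection, J segments that carry tryptophan instead of phenylalanine, and out-of-frame or stop-codon-containing rearrangements.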

Statistical analysis

The inverse Simpson’s index (1/Ds) was calculated based on the following equation:

$$\frac{1}{D_s} = \left[ \sum_{i=1}^{K} \frac{n_i (n_i - 1)}{N (N - 1)} \right]^{-1}$$

where K is the total number of clonotypes, n_i is the number of sequences belonging to the i-th clonotype and N is the total number of sequences for which clonotypes are determined.
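A minimal sketch of this calculation follows. It assumes the finite-sample form of Simpson's index, in which each clonotype contributes n_i(n_i − 1)/(N(N − 1)) and the index is the reciprocal of the sum over all K clonotypes; the function name and the handling of edge cases are illustrative choices rather than details taken from the study.

```python
def inverse_simpson(counts):
    """Inverse Simpson's index from per-clonotype sequence counts.

    counts: iterable of n_i, the number of sequences for each clonotype.
    Assumes the finite-sample form sum(n_i * (n_i - 1)) / (N * (N - 1)).
    """
    counts = [n for n in counts if n > 0]
    total = sum(counts)  # N
    if total < 2:
        raise ValueError("at least two sequences are required")
    simpson = sum(n * (n - 1) for n in counts) / (total * (total - 1))
    if simpson == 0:
        # Every clonotype was seen exactly once: diversity is unbounded.
        return float("inf")
    return 1.0 / simpson
```

The index equals 1 when a single clonotype accounts for every sequence and grows as sequences are spread more evenly across more clonotypes, which is why clonal expansion lowers it.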

Unpaired Student’s t-test (two-tailed) was performed for comparison of total proportion of the most abundant ten CDR3 sequences or the diversity index between groups, using GraphPad Prism version 6.0 (La Jolla, CA, USA). A P-value of <0.05 was considered statistically significant.
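For readers without access to GraphPad Prism, the arithmetic behind an unpaired Student's t-test can be sketched with the standard library. The sketch assumes equal variances in the two groups (the classical pooled-variance form) and returns only the t statistic and degrees of freedom; converting these to a two-tailed P-value requires the t distribution from a statistics package and is not shown. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (sample variances, n - 1).
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df
```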

Results

V(D)J combination and CDR3 sequence analysis

We sequenced the TCR alpha (TCRA) and beta (TCRB) cDNAs synthesized from RNA isolated from PBMCs obtained at several time points from 21 patients who received HSCT. Using a 5′ RACE PCR approach, we obtained an average of 2 663 492 sequence reads for TCRA and 1 509 588 for TCRB. These reads allowed us to identify 45 TCRAV and 59 TCRBV exons, which represented the majority of the 47 functional V exons in TCRA and 60 functional V exons in TCRB (Supplementary Figures 1a and b), indicating good coverage of the TCR genes by our cDNA sequencing approach. Interestingly, recombination between TRDV1 and various J exons in TCRA was also observed in our samples, indicating that this exon, although defined as a V exon of the TCR delta gene, may also function as a V exon of TCRA (Supplementary Figure 2). We validated this finding by Sanger sequencing; the aligned sequences are shown in Supplementary Figure 3. After defining VJ combinations for TCRA and V(D)J combinations for TCRB, we further defined individual CDR3 sequences using our newly developed algorithm. On average, we identified 28 924 unique CDR3 sequences for TCRA and 26 714 unique CDR3 sequences for TCRB with open reading frames.

Cord-chimerism correlates with higher TCR repertoire diversity in haplo-cord transplant

We assessed the TCR diversity by calculating the inverse Simpson’s diversity index (1/Ds) (Supplementary Table S1). As expected, the diversity for TCRA and TCRB was significantly lower in samples obtained after transplant than in pre-transplant samples (P=0.0054 and 0.015 for TCRA and TCRB, respectively) (Figures 1a and b). To investigate the correlation between the source of donor cells in haplo-cord transplanted patients and TCR repertoire reconstruction after transplantation, we correlated the diversity of TCRs with the proportion of cord cells in the patients’ blood at different time points. Patients were dichotomized into two groups according to the median percentage of cord-derived cells at day 30 (⩽6% vs >6%) after haplo-cord transplantation (Figure 2a), and the diversity at day 100 was compared between the two groups. We found that TCRA and TCRB of patients in the >6% group at day 30 were significantly more diverse at day 100 than those of the ⩽6% group (TCRA; 116.1±17.25 vs 23.87±9.84 and TCRB; 89.59±15.49 vs 21.84±10.51) (P=0.0017 and P=0.0072 for TCRA and TCRB, respectively) (Figures 2b and c). The same correlation was observed when we examined the diversity on day 100 and the percentage of cord-derived cells on day 100 post transplant (data not shown).
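The median split used for this comparison can be sketched as follows. The patient labels and chimerism percentages in the example are invented for illustration and are not data from the study; the function name is hypothetical. Patients at or below the median go to the low group, mirroring the ⩽ vs > grouping described in the text.

```python
from statistics import median


def split_by_median(percent_by_patient):
    """Split patients into (<= median) and (> median) groups.

    percent_by_patient: mapping of patient label to % cord-derived cells.
    Returns the cutoff and two sorted lists of patient labels.
    """
    cutoff = median(percent_by_patient.values())
    low = sorted(p for p, v in percent_by_patient.items() if v <= cutoff)
    high = sorted(p for p, v in percent_by_patient.items() if v > cutoff)
    return cutoff, low, high


# Made-up values for illustration only; not data from the study.
example = {"P1": 0.5, "P2": 2.0, "P3": 6.0, "P4": 40.0, "P5": 85.0}
print(split_by_median(example))
```

With an odd number of patients the patient at the median falls into the low group, so the two groups differ in size by one, as in the n=5 vs n=4 split reported for the nine haplo-cord patients.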

Figure 1

Comparison of diversity index of samples. In all patients, the diversity for TCRA (a) and TCRB (b) before transplantation was compared with that after transplantation.

Figure 2

Reconstruction of TCR repertoires observed in the haplo-cord transplant patients at day 100. (a) Distribution of patients according to the median of percentage of cord-derived cells at day 30 or day 100 after haplo-cord transplant (n=9). The diversity of TCRA (b) and TCRB (c) of patients with >6% (n=4) cord-derived cells at day 30 were compared with that of patients with ⩽6% cord-derived cells (n=5). The diversity of TCRA (d) and TCRB (e) of MD patients were compared with that of haplo-cord patients.

However, when we compared the TCR diversity at day 50–100 between patients who underwent MD transplant and those who underwent haplo-cord transplant, we found no statistically significant difference in the diversity of TCRA or TCRB (P=0.27 and 0.72 for TCRA and TCRB, respectively; Figures 2d and e). Therefore, both groups could be combined for further correlative analysis.

Expansion of TCR clones in repertoire of patients with GVHD

To examine whether enrichment of certain T-cells may be involved in the development of GVHD, we analyzed changes over time, from baseline and sequentially through the development of acute GVHD. On the basis of their V(D)J combinations and defined CDR3 sequences (shaped by insertion and deletion of nucleotides at the rearranged junction), we ranked independent cDNA sequences by their number of occurrences in the sequence reads, from the most to the least abundant. We identified the 10 most abundant CDR3 sequences at the time point closest to the date when each patient was diagnosed with acute GVHD. We observed that certain T-cell clones were expanded in each of the GVHD patients at this time point and were present at very low levels at all other time points. We then compared the combined proportion of the top 10 CDR3 sequences at the time point closest to acute GVHD diagnosis with that at multiple time points after HSCT in non-GVHD patients. We found significantly stronger enrichment of the 10 most abundant T-cell clones of TCRB in GVHD patients than in non-GVHD patients (P=0.0029). In contrast, we observed only a very modest tendency toward a difference in TCRA (P=0.18) (Figures 3a and b).
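The combined proportion of the most abundant CDR3 sequences can be computed directly from a list of per-read CDR3 calls. The sketch below is illustrative: it assumes one CDR3 amino acid string per read as input, and the function name is hypothetical.

```python
from collections import Counter


def top_n_proportion(cdr3_reads, n=10):
    """Fraction of reads accounted for by the n most abundant CDR3 sequences."""
    counts = Counter(cdr3_reads)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no CDR3 reads supplied")
    # most_common returns (sequence, count) pairs sorted by count, descending.
    top_reads = sum(count for _, count in counts.most_common(n))
    return top_reads / total
```

A value close to 1 means that a handful of clones dominate the sample, whereas a low value indicates that reads are spread across many clonotypes.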

Figure 3

Comparison of the proportion of the top 10 CDR3 sequences and diversity between non-GVHD and GVHD patients. In all patients, the proportion of top 10 CDR3 of TCRA (a) and TCRB (b) of non-GVHD (n=10) were compared with that of GVHD patients (n=11). In the non-relapsed group, the proportion of top 10 CDR3 of TCRA (c) and TCRB (d) of GVHD patients (n=6) were compared with that of non-GVHD patients (n=7). In the relapsed group, the proportion of top 10 CDR3 of TCRA (e) and TCRB (f) of GVHD (n=5) were compared with that of non-GVHD (n=3). In the non-relapsed group, the diversity of TCRA (g) and TCRB (h) of non-GVHD patients (n=7) were compared with that of GVHD patients (n=6). In the relapsed patients, the diversity of TCRA (i) and TCRB (j) of GVHD (n=5) were compared with that of non-GVHD (n=3) patients.

To further investigate whether the observed changes in TCR repertoire and diversity were GVHD-related and GVL-independent, we compared the TCR repertoire between GVHD and non-GVHD patients according to their relapse status. Interestingly, we found that only in non-relapsed patients were the proportions of the 10 most abundant clones (indicating strong clonal expansion) for both TCRA and TCRB significantly higher in the GVHD group than in the non-GVHD group (P=0.0031 and 0.00013 for TCRA and TCRB, respectively; 16.17±2.09% vs 37.33±7.73% for TCRA, 14.58±2.29% vs 35.67±3.86% for TCRB, non-GVHD vs GVHD) (Figures 3c and d). However, in the relapsed patient group, we observed no significant difference in the proportion of the 10 most abundant clones between the GVHD and non-GVHD patients (Figures 3e and f). Similarly, we found that the diversity of TCRA and TCRB was significantly lower in GVHD patients than in non-GVHD patients in the non-relapsed group (P=0.011 and 0.0064 for TCRA and TCRB, respectively; Figures 3g and h), but not in the relapsed group (Figures 3i and j).

Expansion of similar V–J in the GVHD patients with shared HLA alleles

We analyzed the similarities and differences in the V(D)J combinations along with CDR3 sequences in the GVHD patients using principal component analysis (PCA) (data not shown). During the analysis, we found enrichment of the same V(D)J combinations in both alpha and beta chains in two GVHD patients (Pt# 1 and Pt# 14) who shared five of the six major HLA alleles (HLA-A*02, HLA-B*07, HLA-C*07, HLA-DR*04, HLA-DQ*03 and HLA-DQ*06). In these two patients, the TRAV-J combination TRAV38-2/DV8-TRAJ30 was found at frequencies of ~4.18% and 9.58%, and TRBV15-1-TRBJ2-1 was observed at frequencies of 3.08% and 5.95% (Supplementary Table S2a). Among all the CDR3s found in the TRAV38-2/DV8-TRAJ30 combination, the most abundant was CAYRVSRDDKIIF (3.33%) for Pt# 1 and CAWRDDKIIF (8.34%) for Pt# 14. In the case of TRBV15-1-TRBJ2-1, the most abundant CDR3 was CATSRAYNEQFF for Pt# 1 (2.28%) and CATSRDLYGKNEQFF for Pt# 14 (3.08%) (Supplementary Table S2b).

TCR repertoire diversity correlates with relapse

In order to assess the TCR repertoire in HSCT patients with higher likelihood to have the GVL effect, we examined the correlation of the diversity index of TCR repertoire of samples collected at day 50, 75–100 or 1 year with relapse status. The TCR diversity in relapsed patients was not significantly different from that of non-relapsed patients (P=0.82 and 0.70, for TCRA and TCRB, respectively, Figures 4a and b). However, in the non-GVHD group, we found that the diversity index was significantly higher in non-relapsed patients compared to that in relapsed patients (85.73±16.59 vs 10.95±4.06 for TCRA; 85.69±14.77 vs 13.25±3.84 for TCRB; P=0.042 for TCRA and 0.029 for TCRB; Figures 4c and d). Consistently, the proportions of the 10 most abundant clones for both TCRA and TCRB were significantly higher in the relapsed patients of the non-GVHD group than in the non-relapsed patients (P=0.015 and 0.017 for TCRA and TCRB, respectively; Supplementary Figures 4a and b).

Figure 4

Comparison of the TCR diversity between relapse and non-relapse patients. In all patients, the diversity of TCRA (a) and TCRB (b) of relapsed patients (n=7) were compared with that of non-relapsed patients (n=13). In non-GVHD patients, the diversity of TCRA (c) and TCRB (d) of non-relapsed patients (n=7) were compared with that of relapsed patients (n=3). In GVHD patients, the diversity of TCRA (e) and TCRB (f) of relapsed patients were compared with that of non-relapse patients.

On the other hand, relapsed patients in the GVHD group showed higher TCRA diversity (82.09±20.46) compared with the non-relapsed patients (29.28±8.10; P=0.012), while we only observed a similar trend in the case of TCRB (91.15±38.42 vs 33.77±8.51 for relapsed vs non-relapsed patients; P=0.08, Figures 4e and f). When we compared the proportion of the top ten CDR3 sequences between relapsed and non-relapsed patients in the GVHD group, we observed a modest but not statistically significant difference in both TCRA and TCRB (P=0.11 and 0.20 for TCRA and TCRB, respectively; Supplementary Figures 4c and d).

Discussion

Experimental models and clinical studies have suggested a link between GVHD and GVL. While GVHD remains the most challenging toxicity of allogeneic transplantation, efforts to mitigate the incidence of GVHD may substantially impact the development of GVL following the transplant. Understanding the balance between GVHD and GVL is particularly important in high-risk AML patients, where great efforts were made to modulate GVHD and enhance the GVL effect after the transplantation in order to reduce the relapse rates and improve clinical outcome for this group of patients. These approaches include manipulation of post-transplantation immune suppression, source of stem cells and cytokine use.22, 23

T-cell reconstitution has a major role in the recovery after the transplantation, in long-term resistance to infections, in the development of GVHD and also in the GVL effects. Therefore, to obtain better insight into the interplay between GVHD and GVL in transplanted patients, deeper understanding of the T-cell reconstitution and their function is essential. GVHD occurs when the donor-derived T-cells recognize some host Ag(s) presented on the HLA molecules at the cell surface as non-self and are activated to effector T-cells which further proliferate and cause damage to the host cells.3, 5 T-cells are important in driving the pathophysiology of GVHD, so assessment of T-cell repertoire should help in revealing the molecular mechanism associated with the development of this deleterious disease.

Several reports have examined the TCR repertoire after hematopoietic cell transplantation.14, 15, 24 Most of these studies analyzed TCRB only; here, we have examined both TCRA and TCRB, which together form the TCR heterodimer required for Ag recognition and whose pairing results in higher diversity of the TCR repertoire. Information on both TCRA and TCRB is useful for further functional analysis of TCRs. Additionally, unlike previous studies that used TCR-specific multiplex PCR to examine the TCR repertoire, we applied a 5′ RACE PCR method to amplify all TCRA and TCRB transcripts using a single set of primers for each of TCRA and TCRB. The use of a single primer set can minimize the amplification bias that arises from differences in the Tm of PCR primers when exon-specific multiplex primers are used.14, 15, 16 Furthermore, this approach permits the discovery of potentially novel exons that have not been deposited in public databases. Indeed, we found transcripts generated by recombination between TRDV1 (annotated as a V segment of TCR delta) and various J segments of TCRA in our samples, indicating that TRDV1 may function as one of the V segments of the TCR alpha gene. This observation was also recently reported by another study.25

In the present study, we applied NGS technology to comprehensively analyze TCR sequences at multiple time points in patients who received HSCT, and compared the changes in TCR repertoire in patients who received different sources of stem cells. The repertoire diversity became remarkably low in patients after the transplant. This reduced diversity after HSCT might be related to the conditioning regimen and/or post-transplant immune suppression. On the other hand, a higher percentage of cord-blood cells as early as 30 days after transplant was correlated with higher TCRA and TCRB diversity thereafter, suggesting that patients with better cord-blood reconstitution might have better and quicker immune reconstitution. The correlation of TCR diversity with a higher percentage of cord-blood chimerism highlights the critical role of early monitoring of the cord-blood cell percentage. This is consistent with previous studies showing that a rapid increase of cord-blood cells in haplo-cord transplant predicts later myeloid engraftment.26

Our data revealed that certain T-cells were clonally expanded in both GVHD and non-GVHD patients. However, the majority of GVHD patients, particularly those without relapse, showed a much higher degree of expansion of certain T-cells. In most of the GVHD cases, the diversity at the time point closest to the day of GVHD diagnosis was significantly lower than that at any time point in the non-GVHD patients. A recent study claimed that higher diversity was observed in severe GVHD cases, but this might reflect the difference in stem cell sources; the majority of severe GVHD cases in that study received double-unit umbilical cord-blood (DUCB) transplantation.27 Nevertheless, our observation was in accordance with two previous studies in which GVHD was associated with lower diversity of the T-cell repertoire.28, 29 It is reasonable to assume that the lower TCR diversity in the GVHD patients was caused by stronger expansion of certain T-cell populations that might have key roles in the development of GVHD.

On the other hand, interestingly, in the non-GVHD group, non-relapsed patients had a relatively lower extent of enrichment of the 10 most abundant TCR sequences and higher diversity than relapsed patients. Since higher diversity was correlated with lower relapse rates, we speculate that higher diversity in the TCR sequences may reflect a stronger GVL effect and a higher possibility for certain anti-leukemia T-cell clones to evolve over time. Therefore, characterization of the TCR repertoire at a later time point after transplantation may allow us to identify T-cell clones associated with the GVL effect. The lower diversity in the relapsed patients is also likely to reflect expansion of certain T-cell populations; as a previous study suggested that increased levels of regulatory T-cells might suppress the GVL effect,30 higher levels of regulatory T-cells may confer less GVHD, but could reduce the GVL effect and increase the risk of relapse. In contrast, among patients who developed GVHD, we observed lower diversity of the TCR repertoire in the non-relapsed patients than in the relapsed patients. This may indicate early evolution of the GVL effect in these patients, which is consistent with previous reports of a lower relapse rate in patients with GVHD.2, 3

More interestingly, we observed an expansion of TRAV38-2/DV8-TRAJ30 and TRBV15-1-TRBJ2-1 in two acute GVHD patients who shared all of three major HLA class I alleles (HLA-A*02, HLA-B*07 and HLA-C*07) and two of the three major class II loci (HLA-DR*04, HLA-DQ*03 and HLA-DQ*06), but the expansion of T-cells having the same V(D)J combination was not observed in the remaining patients. In general, interaction between TCR and a peptide/MHC complex is mediated by the three CDR regions in each of TCR alpha and beta chains.30 CDR1 and CDR2 sequences are considered to be important in the binding to MHC. On the other hand, CDR3 defined as peptide sequences around the V(D)J-rearranged region is considered to be critical to interact with the specific peptide on the MHC molecule. Therefore, Ags recognized by TCRs on T-cells in these two GVHD patients may be different because of different CDR3 sequences. However, several studies suggested that distinct TCRs with different CDR3 (even if the alpha and/or beta chains are different) recognized the same peptide/MHC complex;32, 33, 34 for example, Davis-Harrison et al.34 reported that two different TCRs (different CDR3 sequences in the same TCR beta and different TRAV) could recognize the same Tax/HLA-A2. Hence, it is still possible that although CDR3 sequences are different, T-cells in the two patients might recognize the same Ag on the common MHC molecule.

In summary, our novel quantitative approach using the NGS platform has highlighted that higher cord-chimerism is associated with higher TCR repertoire diversity and that higher diversity early after the transplant possibly implies lower risks of both GVHD and relapse following HSCT. Further studies should be performed to examine the role of different T-cell subsets, including regulatory T-cells, in the pathogenesis of GVHD and in relapsed patients. Detailed analysis of the functionally relevant T-cell repertoire should shed light on the significance of the GVL/GVHD interplay in T-cell response and reconstitution, which holds essential clinical implications.