Abstract
Type 1 diabetes (T1D) is a heterogeneous disease with a slower evolution in individuals diagnosed at older ages. There are no validated clinical or laboratory biomarkers to predict the rate of insulin secretion decline either before or after the clinical onset of the disease, or the rate of progression to chronic complications of the disease. This pilot study aimed to characterize the proteomic and phosphoproteomic landscape of circulating extracellular vesicles (EVs) across a range of obesity in carefully matched established T1D and control subjects. We used archived serum samples from 17 human subjects (N=10 with T1D and N=7 normal healthy volunteers) from the ACME study (NCT03379792). EVs were isolated using EVtrap® technology (Tymora). Mass spectrometry-based methods were used to detect the global circulating EV proteome and phosphoproteome. Differential expression, coexpression network (WGCNA), and pathway enrichment analyses were implemented. The detected proteins and phosphoproteins were highly enriched (75%) in exosomal proteins cataloged in the ExoCarta database. A total of 181 differentially expressed EV proteins and 15 differentially expressed EV phosphoproteins were identified, with 8 upregulated EV proteins (i.e., CD63, RAB14, VCP, BSG, FLNA, GNAI2, LAMP2, and EZR) and 1 downregulated EV phosphoprotein (i.e., TUBA1B) listed among the top 100 ExoCarta proteins. This suggests that T1D could indeed modulate EV biogenesis and secretion. Enrichment analyses of both differentially expressed EV proteins and EV phosphoproteins identified associations of upregulated features with neutrophil, platelet, and immune response functions, as well as prion disease and other neurodegenerative diseases, among others. On the other hand, downregulated EV proteins were involved in MHC class II signaling and the regulation of monocyte differentiation. Potential novel key roles in T1D for C1q, plasminogen, IL6ST, CD40, HLA-DQB1, and phosphorylated S100A9, are highlighted. Remarkably, WGCNA uncovered two protein modules significantly associated with pancreas size, which may be implicated in the pathogenesis of T1D. Similarly, these modules showed significant enrichment for membrane compartments, processes associated with inflammation and the immune response, and regulation of viral processes, among others. This study demonstrates the potential of EV proteomic and phosphoproteomic signatures to provide insight into the pathobiology of type 1 diabetes and its complications.
Introduction
Type 1 diabetes (T1D) is a chronic disease characterized by hyperglycemia due to dysfunction and loss of β-cells and the consequent lack of endogenous insulin secretion. The β-cell loss of function and mass is due to a targeted autoimmune process against these cells, but the precise cause and pathological mechanisms are still largely unknown. Even if this description the pathogenic mechanism and evolution seem straightforward, the disease is heterogeneous with a slower evolution in individuals diagnosed at older ages (>18 years of age) and there are no clinical or laboratory biomarkers means to predict the rate of insulin secretion decline either before or after the clinical onset of the disease (1-3).
Once established, T1D is treated with subcutaneous insulin substitution to achieve blood and tissue glucose levels sufficiently low to prevent or limit the onset of micro and macrovascular chronic complications of the disease (4). However, there is a heterogeneity in the progression to chronic complications that is not completely explained by glycemic control. This is particularly relevant when considering the risk of cardiovascular disease in patients with T1D (5). The cardiovascular risk should be interpreted in the context of the obesity epidemic that does not spare the T1D population and it is possibly exacerbated by insulin treatment itself (6). Additional biomarkers are therefore needed to improve the diagnostic capacity, predict progression and risk of chronic complications, and evaluate treatment efficacy.
In recent years, attention have been directed to the rapidly evolving research area addressing the role of extracellular vesicles (EVs) as vehicles of intercellular communications. EVs in their different forms (i.e., exosomes and microvesicles) originating from virtually all tissues can be found in the peripheral circulation and have been studied as non-invasive disease biomarkers, particularly in oncology (7). They can contribute to the activation and regulation of physiologic and pathologic responses to β-cell stress or may contribute to the activation of the autoimmune process. On the other hand, they may serve as biomarkers of metabolic derangement, insulin resistance, and chronic complications in diabetes (8).
The use of the EVs as a biomarker in T1D requires a proper understanding of the effect of different clinical components of the disease on the characteristics of circulating EVs, particularly on the EV proteome. The interest in the EV proteome resides in the possibility of identifying disease-specific signatures that, even if not directly pathogenetic, may be used as predictors of specific phenotypes or disease progression.
In this cross-sectional pilot study, we describe the proteome of serum-derived EVs obtained from individuals with T1D who have been deeply metabolically phenotyped and compare it with matched healthy controls. Our ultimate goal was the identification of biomarkers that may improve our understanding of overweight and obesity in T1D. In addition, to gain insight into EV proteomic signatures associated with either pathogenesis-specific or hyperglycemia-related clinical characteristics, we implemented a novel approach based on weighted gene coexpression network analysis (WGCNA) of the EV proteome and an extensive list of clinical metabolic features.
Materials and Methods
Study population
The study was conducted according to the principles of the Declaration of Helsinki and followed GCP guidelines. All procedures were approved by the AdventHealth Translational Research Institute Institutional (AH/TRI) Review Board. Informed consent was obtained from all volunteers before initiation of the study. The goal of this pilot study was to characterize EVs across a range of obesity in carefully matched established T1D and Control groups. Archived serum samples from 17 human subjects (N=10 with T1D and N=7 healthy volunteers with normal glucose tolerance (NGT) from the ACME study (NCT03379792) were used. Participants were specifically selected as a subgroup that was well balanced for most characteristics. This selection was automated using custom script based on the MatchIt package (9) in the R programing environment.
In-depth clinical and metabolic phenotype
Anthropometric measures were performed according to standardized protocols. Fasting blood samples were obtained. Plasma glucose concentrations were measured using the glucose oxidase method with a YSI 2300 STAT Plus Analyzer (YSI Life Sciences, Yellow Springs, OH). HbA1c levels were measured using a Cobas Integra 800 Analyzer (Roche, Basel, Switzerland). Body composition was measured using a GE Lunar iDEXA whole-body scanner (GE, Madison, WI). Whole body magnetic resonance imaging (MRI) was implemented using a Philips 3T Achieva MRI instrument (Philips Medical Systems MR Inc., Latham, NY). Metabolic assessment in whole room calorimeters at AH/TRI according to standard methodology (10) allowed the simultaneous measure of total energy expenditure and substrate oxidation in a free-living environment with multiple activities. Continuous glucose monitoring (CGM) Dexcom G4 (Dexcom Inc., San Diego, CA, USA) was used to monitor glucose levels that were summarized according to what was proposed by Battelino et al. (11). A list of 154 variables included in the study is presented in Supplementary Table ST1.
EV isolation using Tymora’s EVtrap technology
Blood samples were allowed to coagulate at room temperature for 30 minutes after collection, then centrifuged at 1500 ×g for 15 min at 4 °C to produce serum. Serum was stored at -80 °C until use. Frozen serum samples were thawed, then spun at 2,500 × g for 10 minutes. The pre-cleared samples were then diluted 20-fold in PBS and incubated with EVtrap beads for 30 min (12). After supernatant removal using a magnetic separator rack, the beads were washed with PBS, and the EVs were eluted by a 10 min incubation with 200 mM triethylamine (TEA, Millipore-Sigma) and the resulting EV samples fully dried in a vacuum centrifuge.
Nanoparticle Tracking Analysis (NTA)
The size distribution and concentration of particles in EV preparations were analyzed using dynamic light-scattering technology with a NanoSight NS300 instrument and NTA-3.4 software (Malvern Panalytical, Malvern). The instrument was equipped with a 488 nm blue laser module, flow-cell top plate, integrated temperature control, and a single-syringe pump module. Samples were diluted using cell culture grade water (Corning cat# 25-005-CI) to produce an optimal particle concentration for final measurement in the range of 107 to 109 particles/ml. Final quantification included 5 standard measurements of 1 minute of duration each, taken at a controlled temperature of 25°C and under constant automatic flow. Camera level for video capture was set to 12 and detection threshold to 5 for all sample measurements.
Mass spectrometry (LC-MS/MS)-based methods used to detect the global EV proteome and phosphoproteome
The isolated and dried EV samples were processed as described previously (https://pubmed.ncbi.nlm.nih.gov/32396726/). Briefly, EV samples were lysed to extract proteins using the phase-transfer surfactant (PTS) aided procedure (13) and the proteins digested with Lys-C (Wako) at 1:100 (wt/wt) enzyme-to-protein ratio for 3 h at 37°C. Trypsin was added to a final 1:50 (wt/wt) enzyme-to-protein ratio for overnight digestion at 37°C. After surfactant removal, the resulting peptides were desalted using Top-Tip C18 tips (Glygen) according to manufacturer’s instructions. Each sample was split into 99% and 1% aliquots for phosphoproteomic and proteomic experiments respectively. The samples were dried completely in a vacuum centrifuge and stored at -80°C. For phosphoproteome analysis, the 99% portion of each sample was subjected to phosphopeptide enrichment using PolyMAC Phosphopeptide Enrichment kit (Tymora Analytical) according to manufacturer’s instructions, and the eluted phosphopeptides dried completely in a vacuum centrifuge. For phosphoproteomics analysis the whole enriched sample was used, while for proteomics only 50% of the sample was loaded onto the LC-MS.
Each dried peptide or phosphopeptide sample was dissolved at 0.1 μg/μL in 0.05% trifluoroacetic acid with 3% (vol/vol) acetonitrile. 10 μL of each sample was injected into an Ultimate 3000 nano UHPLC system (Thermo Fisher Scientific). Peptides were captured on a 2-cm Acclaim PepMap trap column and separated on a heated 50-cm column packed with ReproSil Saphir 1.9 μm C18 beads (Dr. Maisch GmbH). The mobile phase buffer consisted of 0.1% formic acid in ultrapure water (buffer A) with an eluting buffer of 0.1% formic acid in 80% (vol/vol) acetonitrile (buffer B) run with a linear 60-min gradient of 6–30% buffer B at flow rate of 300 nL/min. The UHPLC was coupled online with a Q-Exactive HF-X mass spectrometer (Thermo Fisher Scientific). The mass spectrometer was operated in the data-dependent mode, in which a full-scan MS (from m/z 375 to 1,500 with the resolution of 60,000) was followed by MS/MS of the 15 most intense ions (30,000 resolution; normalized collision energy - 28%; automatic gain control target (AGC) - 2E4, maximum injection time - 200 ms; 60sec exclusion].
Bioinformatic analysis
The raw files were searched directly against the human Uniprot database with no redundant entries, using Byonic (Protein Metrics) and Sequest search engines loaded into Proteome Discoverer 2.3 software (Thermo Fisher Scientific). MS1 precursor mass tolerance was set at 10 ppm, and MS2 tolerance was set at 20ppm. Search criteria included a static carbamidomethylation of cysteines (+57.0214 Da), and variable modifications of oxidation (+15.9949 Da) on methionine residues, acetylation (+42.011 Da) at N terminus of proteins, and phosphorylation of S, T and Y residues (+79.996 Da) for the phosphoproteomics data. Search was performed with full trypsin/P digestion and allowed a maximum of two missed cleavages on the peptides analyzed from the sequence database. The false-discovery rates of proteins and peptides were set at 0.01. All protein and peptide identifications were grouped and any redundant entries were removed. Only unique peptides and unique master proteins were reported.
All data were quantified using the label-free quantitation node of Precursor Ions Quantifier through the Proteome Discoverer v2.3 (Thermo Fisher Scientific). For the quantification of proteomic or phosphoproteomic data, the intensities of peptides/phosphopeptides were extracted with initial precursor mass tolerance set at 10 ppm, minimum number of isotope peaks as 2, maximum ΔRT of isotope pattern multiplets – 0.2 min, PSM confidence FDR of 0.01, with hypothesis test of ANOVA, maximum RT shift of 5 min, pairwise ratio-based ratio calculation, and 100 as the maximum allowed fold change. The abundance levels of all peptides and proteins were normalized using the total peptide amount normalization node in the Proteome Discoverer. For calculations of fold-change between the groups of proteins, total protein abundance values were added together and the ratios of these sums were used to compare proteins within different samples.
The EV-proteomic and phosphoproteomic expression profiles were analyzed with a user-defined bioinformatic procedure that included raw data preprocessing, differential expression analysis, weighted-gene correlation network analysis (WGCNA), and enriched functional analyses. Briefly, missing values in the expression profiles were imputed, and the data were then log2-transformed, and scale-standardized. The imputation was performed with random forest, an advanced machine-learning algorithm (14) known to be adaptive to interactions and nonlinearity with capability to handle mixed types of missing data (15). The normalized data were then analyzed using pvca (16) and limma (17), two R/Bioconductor software packages for biomarker discovery. Specifically, pvca, a package for principal variance component analysis, was first applied to identify significant covariates by fitting all “sources” as random effects, then linear models were created incorporating those significant covariates. Once established, the linear models were fitted using weighted least squares for each protein, moderated t-statistics, moderated F-statistic, log-fold changes and p-values of differential expression were calculated by empirical Bayes moderation of the standard errors. Finally, the Benjamin-Hochberg (BH) method was used to adjust the p-values. The network module analysis was performed with WGCNA, an R package for weighted gene correlation network analysis (18). In short, it first performs a weighted protein co-expression network analysis to find clusters of highly correlated proteins (modules), and then relates modules to the clinical measurements. Subsequent module membership, gene-trait significance and intra-module connectivity analysis was applied to identify the key driver proteins in modules of interest. KEGG pathway and Gene Ontology enrichment analyses were performed on the sets of differentially expressed proteins and phosphoproteins with clusterProfiler, an R package for enriched function analysis (19). Given a set of highly significant proteins or phosphoproteins identified, clusterProfiler suggests KEGG pathways and GO Ontology functional groups significantly affected.
Statistical analysis
Data normality was tested using the Shapiro-Wilk test, and nonnormal data was log-transformed to approximate normality. Differences in baseline clinical characteristics were assessed using the Welch two-sample t test (for continuous variables) or the Fisher exact test (for categorical variables). For assessment of differential expression in EV proteins and phosphoproteins, linear models using the limma package were implemented. Linear models included sex as covariate. Calculated effects and correlations with two-tailed P<0.05 and FDR<0.1 (for proteomics) or FDR<0.2 (for phosphoproteomics) were considered significant. False discovery rates (FDR) correcting for multiple testing were calculated using the Benjamini-Hochberg correction as implemented for the p.adjust function in the stats package.
Results
Study design and clinical characteristics of the study cohort
A balanced subset of 17 participants from the ACME study was selected for this pilot proteomic and phosphoproteomic study. The subset was not confounded by differences in variables known to affect metabolic function (e.g., age, BMI, among others). Archived sera from these participants were available for EV purification and proteome/phosphoproteome analysis. The average age of the participants was 25.7 years (range 20.1-34.5), 11/17 were females, and the average BMI was 26.8 kg/m2 (range 19.5-38.6) for the whole population. The comparison of 154 anthropometric, clinical, and metabolic characteristics for the study cohort are presented in Supplementary Table ST1.
EV purification and characterization demonstrate enrichment in exosomal particles
The EVtrap efficiently isolated circulating particles with a distribution of sizes compatible with exosomes (Figure 1A). There was a trend (p = 0.07) to higher modal particle sizes in the T1D group (the mode of particle size was 151.5 nm in T1D compared to 137.7 nm in controls). However, the average particle size was comparable between the groups (mean size 209.5 nm in T1D and 201.3 nm in controls, p = 0.159). Similarly, the average particle concentration did not differ between the two groups (3.6×1010 particles/ml in T1D and 4.6×1010 particles/mL in controls, p = 0.346) (Figure 1A). LC-MS/MS of isolated EVs from participant sera identified 1950 EV proteins in 1 ml of serum, which included 1467 (75.2%) proteins reported in ExoCarta (a database collecting information on all exosomal proteins, RNA, and lipids (20)), and more specifically, 91 (91%) of the top 100 exosomal proteins reported in this database (Figure 1B). Similarly, LC-MS/MS identified 561 phosphoproteins in the circulating EVs, which included 421 (75%) of ExoCarta proteins (Figure 1B). These results indicate that the isolated proteome and phosphoproteome are enriched for small EVs (exosomes).
The serum EV proteome and phosphoproteome are different in people with T1D compared to controls
To further characterize the circulating EV cargo in our study population, we conducted additional bioinformatic analyses including important variable analysis with random forests (RF), differential expression analysis, and unsupervised clustering analysis. A total of 181 differentially expressed EV proteins were identified with, at least, a 2-fold change (p-value ≤ 0.01, FDR<0.1). Of these, 135 were upregulated and 46 were downregulated in T1D, as compared to controls (Table 1, Figure 1E). On the other hand, 15 EV phosphoproteins were more than 1.4-fold different between the groups (p-value < 0.05, FDR < 0.2), with 7 of them being upregulated and 8 being downregulated (Table 2, Figure 1F). Notably, of the top 100 exosomal proteins reported by ExoCarta, which are considered the best markers of exosomes, 8 (i.e., CD63, RAB14, VCP, BSG (a.k.a. CD147), FLNA, GNAI2, LAMP2, EZR), were significantly upregulated (absolute fold change FC ≥ 2, P < 0.01, FDR < 0.1), while 1 phosphoprotein (i.e., TUBA1B) was downregulated (FC = -1.75, P = 0.0027, FDR = 0.13) in the circulating EVs from the T1D group. Interestingly, the EV proteome and phosphoproteome showed a relatively small overlap, in general. Only a single protein (S100A9) was differentially expressed (i.e., upregulated) at both the total protein and the phosphorylated state (Figure 2G). All other differentially expressed proteins/phosphoproteins were only differentially expressed at either the total protein or the phosphorylated protein level. S100A9 displayed a positive trend (r = 0.42, P = 0.093) with the number of circulating neutrophils (Figure 1H).
As pathway analysis techniques can help in interpreting proteomics results, we applied various enrichment analysis tools to obtain functional information on the differentially expressed proteins. The Gene Ontology (GO) cellular compartment enrichment analysis confirmed that the differentially expressed proteins primarily belong to pathways associated with vesicles and granules: secretory granule lumen, cytoplasmic vesicle lumen, vesicle lumen, membrane raft, membrane microdomain, membrane region indirectly confirming the vesicular origin of the proteins (Figure 2A). GO biological process enrichment analysis on the differentially expressed EV proteins showed that the differentially upregulated proteins are involved in neutrophil degranulation and activation, immune response, platelet degranulation, blood coagulation, and hemostasis, among others (Figure 2B, Module Pr.1). On the other hand, downregulated EV proteins were enriched in proteins involved in the regulation of monocyte differentiation (Figure 2B, Module Pr.2). The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis also identified platelet activation as one of the most represented pathways together with leukocyte transendothelial migration, prion disease and neurodegenerative disorders (Figures 1C).
The differentially expressed EV phosphoproteins were similarly enriched for proteins found in vesicular compartments such as late endosomes and secretory granules apparently involved in insulin response and platelet function (Figure 3A). On the other hand, enriched GO biological processes suggest the upregulated EV phosphoproteins are involved in the detoxification of cellular oxidants (Figure 3B). The KEGG pathway analysis demonstrated enrichment in upregulated EV phosphoproteins involved in insulin resistance and diabetes (Figure 3C, Module Ph.1), while downregulated EV phosphoproteins were involved in phagosome function, prion disease and neurodegenerative disease pathways (Figure 3C, Module Ph.2).
Weighted gene co-expression network analysis identified key EV-associated protein modules that correlate with pancreas size
To further improved our understanding of the role of circulating EVs in T1D at the system level, we applied the weighted gene co-expression network analysis (WGCNA) methodology for the analysis of all proteins identified in circulating EVs. This method focuses on the identification of modules that include features (i.e., EV proteins in our case) with correlated expression patterns as opposed to the identification of individual genes based on group average comparisons (as is the case of differential expression analysis), therefore, alleviating the multiple statistical testing problem. A Pearson correlation matrix of the proteins is used to form a hierarchical dendrogram that is then cut into branches corresponding to modules. Each module includes genes with similar expression pattern and most likely specific biological functions. The module eigengenes are also correlated with external clinical traits, consequently generating a denser mechanistic overview of co-regulated EV proteins associated with the underlying T1D biology. The WGCNA methodology has been previously used for proteomic analysis (21, 22).
Figure 4A presents the co-expression dendrograms for all EV proteins and how they were grouped in 14 color-coded modules. Figure 4B presents the number of EV proteins included in each module. The analysis identified significant associations of specific EV protein modules with 9 of the 154 clinical characteristics included. Primarily, the association was with indicators of metabolic compensation and glucose levels, but other interesting traits were identified as an index of insulin sensitivity (eIS), pancreas size and resting energy expenditure. The strongest correlations were seen with clinical traits associated with blood glucose levels (metrics obtained from the continuous glucose monitoring, HbA1c) but also with other clinical characteristics indicative of T1D (pancreas size) or metabolic aspects known to be modified in T1D (Figure 4C).
Three co-expression modules (i.e., Pink, Tan, and Yellow modules) were significantly associated with elevated blood glucose levels, expressed as either percentage of continuous glucose monitoring (CGM) above 250 mg/dl or between 180 and 250 mg/dl, average glucose level during CGM, HbA1c. As this is the characteristic feature of diabetes it is hard to discriminate whether the EV proteins included in these modules are differentially abundant as a cause or a consequence of hyperglycemia (Figure 4C). The Tan module was of particular interest because, in addition to its association with blood glucose levels (correlation 0.9 with % CGM reading >250 mg/dl, p =2e-08), it is associated with pancreas size (correlation -0.71, p=0.002) and estimated insulin sensitivity calculated from non-fasting parameter by the Coronary Artery Calcification in Type 1 Diabetes (CACTI) study (eIS-nf) (23).
Four modules (i.e., Tan, Salmon, Pink, Purple, Figure 4C and Figure 5) showed significant association with pancreas size that may be a pathogenic clinical feature of T1D (24-27). Heatmaps of select module proteins (36 of 41 in the Tan module and 19 of 40 in the Salmon module) that displays strong correlations between module membership and gene significance demonstrate the specificity of the EV protein-pancreas size associations with T1D (Figure 5C,D), particularly for the Tan module EV proteins. The Pink module also showed a significant correlation between module membership and gene significance. However, the association was not significant for the purple module (Figure 5A). The GO cellular component enrichment analysis for the Tan module showed a significant enrichment for membrane raft, membrane microdomain, and membrane region compartments as the highest positively correlated components. Among the enriched GO biological processes, several processes associated with inflammation and the immune response were identified (Figure 5E). The EV proteins belonging to the Salmon module and associated with pancreas were enriched for humoral immune response, regulation of viral entry into host cells, modulation by symbiont of entry into host, regulation of viral life cycle, movement in host environment, interaction with host, regulation of viral processes (Figure 5F) that might be relevant in T1D pathogenesis or response to the environment.
Considering the proteins with the strongest correlation with pancreas size in the two modules, we noticed that they are continuously distributed rather than separated in two groups corresponding to T1D and controls (Figure 6). This could indicate that the correlation is with the organ size rather than being mediated by a confounder that would discriminate people with diabetes such as glycemic control. The Salmon module identified proteins strongly correlated with the clinical trait of interest (i.e., pancreas size) but not appearing among the significantly differentially expressed proteins. Two of these are HLA-DRB1 and CD74 (stabilizes the MHC Class II antigen processing and prevents loading antigen peptides into the MHC Class II complex). Both HLA-DRB1 and HLA-DQB1 in serum EVs are positively associated with pancreas size and are downregulated in T1D compared to controls. This supports the concept that more sophisticated analyses are needed to understand the complex picture of circulating EV proteins.
Discussion
This pilot study showed that the circulating EV proteome significantly correlates with key features of type 1 diabetes. Rather than looking at freely circulating peripheral blood biomarkers, our study focused on the proteomic cargo in circulating EVs. The EV cargo is reported to mirror, to some extent, the specific cellular phenotypes from their cells of origins (28). Therefore, it could provide information on mechanisms that are active in the disease of interest. The importance of EV proteins as disease biomarkers has been highlighted in the cancer field (29). To our knowledge, this is the first characterization of the proteome of circulating EVs in patients with T1D compared to controls. The plasma or serum proteome includes a heterogenous pool of EVs derived by multiple cell types, with production driven by multiple mechanisms (30, 31). We would expect, therefore, that the EV proteome and phosphoproteome would be affected by hyperglycemia, insulin treatment and glucose variability as well as T1D pathogenic mechanisms. Our proteomic and phosphoproteomic analyses of serum EVs isolated using the EVtrap technology demonstrated significant enrichment in annotated exosomal proteins and phosphoproteins. This suggests that our study preparations were likely enriched in exosomes.
Supporting the information richness and biomarker potential of circulating EVs, our bioinformatic approaches were able to discriminate the two study populations, namely people with T1D and controls, and to identify key EV proteins that represent aspects of validated and novel T1D biology. By analyzing cellular compartment annotations for the differentially expressed EV proteins and phosphoproteins, we demonstrated enrichment in vesicle and secretory granule associated proteins likely involved in insulin, platelet, and neutrophil function. This leads us to the hypothesis that T1D could indeed modulate EV biogenesis and secretion. As expected, biological process and pathway analysis identified enrichment for pathways (e.g., neutrophil degranulation and activation) that may be associated not only with the pathogenesis of the disease but also with the development of diabetic chronic complications, both at the macro and microvascular level. Pathways connected to platelet function and coagulation call for the possible use of the EV proteomics in the prediction of disease complications. Specifically, C1q and plasminogen (PLG) were two of the key complement proteins strongly increased in our T1D EV preparations. Notably, deposition of complement proteins like C1q in the EVs has been reported to be needed for complement activation and, when deregulated, to contribute to pathological states (32). These particles loaded with complement regulators are suggested to contribute to the stimulation and inhibition of T-cell responses and the interaction among immune cells (33). Furthermore, C1q has been recognized as an important pattern recognition receptor that diverts autoantigen containing extracellular vesicles from immune recognition (34). PLG, on the other hand, is reported to be more efficiently activated by PLG activators when associated with cell surfaces, as compared to the reaction in solution (35). By the same token, PLG on vesicle surfaces may be more efficiently converted into plasmin, consequently promoting fibrinolysis and immune cell modulation with systemic reach.
Additionally supporting the concept that T1D could modulate EV biogenesis and secretion, 8 of the top 100 ExoCarta proteins (considered the best exosomal markers) were found to be significantly upregulated in the T1D circulating EVs. Among those proteins, CD63, a classical surface marker of exosomes, has recently being implicated as a key molecule that determines the destination of nascent insulin granules towards degradation via a novel lysosomal-mediated mechanism operative in pancreatic β cells (36). This CD63-requiring (metabolic) stress-induced nascent granule degradation (SINGD) pathway contributes to loss of insulin, β cell failure and the pathogenesis of diabetes, and its inhibition delays the onset of diabetes (36). Notably, LAMP-2, another T1D upregulated top100 ExoCarta EV protein, is also reported to interact with proteins related to glucose and lipid metabolism (37) and implicated in CD63-mediated SINGD (36). RAB14, on the other hand, has been implicated as a key regulator of GLUT4 sorting to specialized transport vesicles that translocate the glucose transporter to the plasma membrane, therefore regulating adipocyte glucose uptake (38). Filamin-A (FLNA), a protein involved in actin filament cross-linking, cell growth, and motility, has been recently implicated in insulin and IGF1 signaling (39). BSG/CD147, a membrane-bound glycoprotein involved in energy metabolism, has also been reported to be induced in monocytes by high glucose and to possibly play a role in diabetic complications (40).
Among other top upregulated EV proteins involved in the inflammatory response and immune-activation we found IL6ST/Gp130 (the founding member of the cytokine receptor family, which is involved in the regulation of adipocyte development and function and has a key role at the intersection of inflammation, autoimmunity and cancer) (41, 42) and CD40, and the HLA-DQB1 among the downregulated proteins. Of note, all these proteins are involved in IL-6 signaling (41-47). Importantly and supporting our findings, it is known that macrophages from T1D patients with high-risk HLA-DQB1 alleles are sensitized to secrete IL-6 in response to nonantigenic stimulation (43).
In addition to the neutrophil, platelet, and immune cell activation related pathways, the differentially expressed proteins and phosphoproteins were enriched for other biological processes and KEGG pathways related to prion disease and multiple neurodegenerative diseases. In this context, it is of interest to highlight that recently, the Wasserfall group (48) showed an increased frequency of cellular prion proteins in pancreas specimens from people with T1D and that, earlier, Strom and collaborators (49) demonstrated in two rodent models that altered metabolism of cellular prion protein in β cells associated with glucose dysregulation.
Only one of the differentially expressed EV proteins was also significantly differentially expressed in its phosphorylated state in circulating EVs. In this respect, S100A9, the only feature demonstrating significant differential expression (upregulation) at both the proteome and phosphoproteome level, is a determinant of the epigenetic regulation and activation of the monocyte-macrophage system in hyperglycemia (50). S100A9 in association with S100A8, form the heterocomplex known as calprotectin that is suggested to activate the transendothelial accumulation of monocytes at the site of inflammation, which associates with chronic inflammatory diseases (51, 52). S100A9 by itself is also reported to stimulates neutrophil adhesion by activating the CD11b/CD18 β2 integrin (53). Remarkably, only the phosphorylated form of S100A9 from neutrophils was shown to be essential to induce proinflammatory cytokine secretion by extracellular S100A8/phosphoS100A9 via toll-like receptor 4 signaling (54). More recently, Liao and collaborators demonstrated that S100A9 was overexpressed in pancreatic cancer-associated diabetes (PCDM) and that, together with Galectin-3, caused insulin resistance and distinguished PCDM from type 2 diabetes in subjects with new-onset diabetes (55). We now show that these processes may be additionally mediated by the presence of upregulated levels of phosphorylated S100A9 in circulating EVs.
Taking advantage of the deep metabolic phenotyping of our participants, we attempted at more precisely characterize the EV proteome and its association with clinical features (beyond the analysis of differential expression based on group averages comparisons) by applying a WGCNA methodology. Coexpression networks are powerful tools used in bioinformatics and WGCNA is a method that identifies clusters of genes based on their correlated expression patterns. The method allows for the calculation of module membership and gene significance measures relevant to the identification of biomarkers of interest associated with specific clinical variables. WGCNA has been applied in various biological contexts and “can be used to generate testable hypotheses for validation in independent data sets” (56). Thus, it aligns with the objective of our pilot study. This approach can find potential novel candidates based on co-expression similarities, rather than focusing exclusively on databases of protein-protein interaction that may have limitations due to the heterogeneity of experiments and model organisms (57). Therefore, the adaptation WGCNA on proteomic data seems particularly suitable to study EV proteomics where the literature is relatively limited and only applied to urinary EVs (58, 59).
WGCNA applied to EV proteomics in our study identified 14 distinct modules significantly associated with nine clinical features. The association is primarily with blood glucose measures indicative of T1D, but also with other features like pancreas size, resting energy expenditure, and the CACTI index of insulin sensitivity. The enrichment analysis of gene ontologies pointed out to the biological meaning of the modules, identifying neutrophil activation and degranulation, activation of immune response and blood coagulation as some of the pathways associated with hyperglycemia. Humoral immune response, regulation of viral entry and life cycle regulation of viral processes were associated with pancreas size in the Salmon module together with mast cell activation, platelet activation, blood coagulation, hemostasis, and response to IL-7 in the Tan module.
The WGCNA methodology has been applied to peripheral blood gene expression in T1D and controls (60, 61) and identified significantly disrupted co-expression modules in T1D compared to healthy control PBMCs. The module associated with T1D in Lu et al. work (60) was enriched for genes belonging to the “regulation of immune response” pathway. Our study took a broader approach considering EV proteomics. Circulating EVs originate from different cells, therefore, the modules that we identified are likely a composite of coregulated protein expressions from a number of different cellular sources, including PBMCs. In line with what was seen in PBMC transcriptomics, we also identified enrichment in proteins classified under the broad biological process ontology denominated “regulation of immune response” (that includes 1019 genes) and associated with hyperglycemia as an indicator of diabetes (regulation of Fc receptor mediated stimulatory signaling pathway). In our approach, we include proteomic profiles obtained from patients with T1D and healthy controls to identify clinically-relevant protein clusters (modules) that correlate with clinical variables of interest, independently of the disease. Therefore, the proteins included in the modules of interest could be used as markers of that clinical variable rather than of the disease. The analysis focused then on the association with pancreas size, as a trait that may be associated with disease pathogenesis.
Among the proteins of interest identified by our traditional approach (differential expression) and by WGCNA, HLA-DRB1 and DQB1 are downregulated on EVs. MHC-Class II molecules are present in exosomes originating from APC, B-cells and T-cells (62-64). Consistent with our findings, MHC Class II-loaded EVs are also detected in plasma (65). The MHC Class II in EVs is the predominant form of MHC Class II detected in serum and plasma, even though a soluble form has been described. Both the soluble MHC-Class II and the exosome-associated MHC-Class II are immunomodulatory (66). Immunosuppressive EVs derived from APCs occurs naturally, for example, soon after eating or inoculation of specific antigens (67). The experiments by Kim and collaborators (65) suggested that the immune response to a foreign antigen is regulated by exosomes in plasma produced by monocytes/macrophages that have the ability to suppress the immune response in an antigen-specific manner. Therefore, we could hypothesize that lower levels of EV MHC in T1D compared to healthy individuals could be a specific sign of the loss of tolerance in T1D. Our study is not able to demonstrate that this mechanism is active in T1D pathogenesis, however, it highlights the power of WGCNA in analyzing complex diseases and in formulating hypothesis. Paired with the in-depth phenotyping of the participants of this pilot study, WGCNA allowed us to identify and confirm the potential role of selected proteins that would have been excluded by simple differential expression analysis.
The primary limitation of our study is the small sample size. The power of the applied methodology is that it produces meaningful results even with a small sample size as in our case. The gene connectivity analysis smooths the variability and alleviates the need for multiple sampling. The design needs to be considered hypothesis generating and the findings should be confirmed in larger and independent cohort(s), and ultimately in dedicated functional studies.
Data Availability
All data produced in the present study are available upon reasonable request to the authors
Funding
This study was funded by program funds granted to REP by the AdventHealth Translational Research Institute. The ACT1ON Study was funded by the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health under Award Number 1DP3DK113358; MPIs Mayer-Davis, Maahs, Pratley. DMM is supported by P30DK116074. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Author contributions
A.C. designed and supervised experiments, researched data, and wrote the manuscript; Y.O.N.L. conducted experiments, researched data, and wrote the manuscript; G.Y. researched data and contributed to manuscript writing; C.C. researched data and contributed to manuscript editing; A.B, contributed to clinical study execution and researched data and contributed to manuscript editing, A.M.P. conducted experiments and contributed to manuscript editing; H.C. conducted experiments and contributed to manuscript editing; K.C. conducted experiments and contributed to manuscript editing, A.I. conducted experiments, researched data, and contributed to manuscript editing; D. M. designed the larger clinical study, acquired funding, and reviewed the manuscript, E.J. M-D designed the larger clinical study, acquired funding, and reviewed the manuscript, R.E.P. designed the study, acquired funding, provided the study samples, supervised the work, and reviewed/edited the manuscript.
Declaration of competing interests
The authors declare the following competing financial interest(s): A.I. is a principal at Tymora Analytical Operations, which developed the EVtrap beads and commercialized PolyMAC enrichment kit. R.E.P. reports grants from Hanmi Pharmaceutical Co.; grants from Janssen; consulting fees from Merck; grants, speaker fees and consulting fees from Novo Nordisk; consulting fees from Pfizer; grants from Poxel SA; grants and consulting fees from Sanofi; consulting fees from Scohia Pharma Inc.; consulting fees from Sun Pharmaceutical Industries. A.C. reports consulting fees from GlaxoSmithKline. Honoraria and fees for REP’s and AC’s services were paid directly to AdventHealth, a nonprofit organization. DMM has consulted for Abbott, Medtronic, the Helmsley Charitable Trust, Sanofi, Novo Nordisk, and Eli Lilly and has served on an advisory board for Insulet. EJM-D has consulted for Helmsley Charitable Trust. No other potential conflicts of interest relevant to this article were reported.
Acknowledgements
The authors thank all ACME participants and AH/TRI staff.