Proteomic analysis of circulating immune cells identifies novel cellular phenotypes associated with COVID-19 severity

Certain serum proteins, including CRP and D-dimer, have prognostic value in patients with SARS-CoV-2 infection. Nonetheless, these factors are non-specific, and provide limited mechanistic insight into the peripheral blood mononuclear cell (PBMC) populations which drive the pathogenesis of severe COVID-19. To identify novel cellular phenotypes associated with disease progression, we here describe a comprehensive, unbiased analysis of the total and plasma membrane proteomes of PBMCs from a cohort of 40 unvaccinated individuals with SARS-CoV-2 infection, spanning the whole spectrum of disease severity. Combined with RNA-seq and flow cytometry data from the same donors, we define a comprehensive multi-omic profile for each severity level, revealing cumulative immune cell dysregulation in progressive disease. In particular, the cell surface proteins CEACAMs 1, 6 and 8, CD177, CD63 and CD89 are strongly associated with severe COVID-19, corresponding to the emergence of atypical CD3+CD4+CD177+ and CD16+CEACAM1/6/8+ mononuclear cells. Utilisation of these markers may facilitate real-time patient assessment by flow cytometry, and identify immune cell populations that could be targeted to ameliorate immunopathology.


Introduction
SARS-CoV-2 continues to present a public health crisis due to a slow global rollout of vaccination programmes and emergence of novel virus variants. Viral pathogenesis comprises an initial stage of virus replication followed by immune cell recruitment and cytokine production. Most infected individuals generate an effective immune response that achieves viral clearance without excessive tissue damage, presenting with no or mild symptoms. However, a minority experience severe disease, driven by a dysregulated and hyperactive immune response and characterised by high levels of proinflammatory mediators such as IL-6 and TNF 1,2 . The consequent enhanced vascular permeability, thrombosis and tissue damage can lead to severe pneumonia, acute respiratory distress syndrome, multiple organ failure and death 3,4 .
Several studies have aimed to define perturbations in circulating immune cell subsets and secreted factors during severe COVID-19, utilising techniques such as bulk and single cell transcriptomics, cytometry panels and plasma proteomics. These have shown that severe COVID-19 is associated with profound peripheral lymphopenia 5-8 characterised by recruitment of NK cells to the lung tissue from peripheral blood, and expansion of activated yet functionally impaired 9,10 inflammatory NK cells expressing cytotoxic factors and interferon-stimulated genes 10-12 . Unusually for a viral infection, profound peripheral neutrophilia is also observed 13,14 due to release of immature, inflammatory neutrophils via emergency myelopoiesis and a hyperinflammatory phenotype reminiscent of bacterial sepsis 15-17 . Circulating CD4+ and CD8+ T-cells are depleted 5,7,8,18 , concurrent with expansion of activated effector cell subpopulations 9,19 , and express high levels of exhaustion markers such as PD-1 and TIM3 9,19,20 . Significant changes are observed in the myeloid compartment, such as depletion of non-classical CD16+ monocytes 17,21,22 and expansion of CD14+HLA-DRlow cells resembling immunosuppressive monocytes observed in sepsis 17 . In addition, platelets exhibit a hyperactivated phenotype defined by expression of the activation marker CD62P (P-selectin) and readily form prothrombotic platelet-leukocyte aggregates 23-25 .
A comprehensive understanding of the immune dysregulation and immunopathology that underpins COVID-19 is essential to identify patients at risk of progressing to severe disease in order to provide early therapeutic intervention. Here, we performed a detailed, unbiased proteomic analysis of the peripheral blood mononuclear cell (PBMC) plasma membrane and cellular proteomes from seven healthy controls and 33 unvaccinated individuals with acute SARS-CoV-2 infection across the spectrum of COVID-19 disease. These data complement previous characterisation of the same cohort by whole blood transcriptomics and cytometric phenotyping 5 alongside PBMC single-cell sequencing 26 . We identified progressive upregulation of a novel group of proteins expressed by multiple immune cell

Identification of cellular phenotypes associated with disease severity by mass-spectrometry
7704 proteins were quantified in the whole cell lysate (WCL) dataset and 597 proteins were quantified in the plasma membrane (PM) dataset.
Statistical analyses indicated that there were no significant differences in protein abundance between donors with asymptomatic or mild symptomatic disease (classes A & B) and these data were therefore combined for further analysis. In whole cell lysates, differential protein expression was most pronounced in severe COVID-19 (classes D & E). Whereas 45 and 82 proteins were upregulated >2-fold relative to healthy controls in class AB and C donors, strikingly, 278 and 392 proteins were upregulated >2-fold in classes D and E, respectively. These included several proteins previously identified as upregulated in severe COVID-19: C-reactive protein (CRP), complement component 9 (C9) and lipopolysaccharide binding protein (LBP) (Figures 2A-B). Gene Ontology (GO) analysis identified significant enrichment of terms associated with antiviral defence in class AB donors, including upregulation of multiple interferon-stimulated genes such as IFITs, Mx proteins and OAS proteins. By comparison, proteins highly upregulated in classes D & E were enriched in functions related to antimicrobial and antibacterial defence (Figure 2B, Table S2). Data from all proteomic experiments in this study are shown in Table S3. Here, the worksheet "Lookup" is interactive, enabling generation of graphs of expression of any of the proteins quantified.
medRxiv preprint doi: https://doi.org/10.1101/2022.11.16.22282338; this version posted November 18, 2022. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
were derived from data averaged across all donors from a given disease severity class in comparison to average data from healthy controls analysed in the same mass spectrometry experiment. Enlargements of two subclusters are shown, highlighting groups of proteins that were upregulated in classes D and E vs AB (top panel) or in classes AB vs E (bottom panel). DAVID enrichment terms and corresponding Benjamini-Hochberg-corrected p-values are shown for each cluster.
(C) Hierarchical cluster analysis of 522 plasma-membrane-enriched proteins quantified in both PM analyses. Enlargement of one subcluster is shown, identifying a group of proteins that are highly upregulated at the PBMC cell-surface in severe COVID-19. Proteins that best discriminate between mild and severe disease on the basis of principal component analysis (Figure S1) are highlighted in red.
To identify cell surface phenotypes associated with COVID-19 severity, which could also be readily utilised in diagnostic or therapeutic settings, equivalent analyses were conducted on PM data (Figure 2C). Overall, 54 and 49 proteins were upregulated >2-fold in classes D and E versus healthy control, including a cluster of proteins that were specifically upregulated in severe, but not mild, disease (Figures S1A-B). Analysis of PCA loadings revealed that a substantial proportion of the variation between classes resulted from changes in CEACAM proteins, CD177, CD63 and CD89 (Figures S1C-D). Further analysis of protein profiles for each of these candidate markers identified a consistent and significant upregulation in marker abundance with increasing disease severity in both WCL and PM data (Figure 3). This trend was also observed for corresponding genes in parallel samples analysed by whole-blood RNA-seq by Bergamaschi et al 5 (Table S1B), further supporting the proteomic conclusions (Figure 3). As a result, this group of markers was selected for further validation and characterisation.
Protein and RNA expression data for candidate markers identified in our analysis as being upregulated in PBMC from individuals with severe COVID-19. RNA expression data at the earliest available timepoint was extracted from previous whole-blood RNAseq analysis of a larger cohort including our donors 5 and expressed in Log2(RPKM). Ordinary one-way ANOVA with Tukey's multiple comparisons post-hoc test on Log2-transformed data: *p<0.05, **p<0.005, ***p<0.0005, ****p<0.0001.
Of note, CEACAM8 and CD177 are classically regarded as neutrophil markers. As neutrophilia is a hallmark of severe COVID-19, we first determined whether observations resulted from neutrophil contamination of PBMC samples, despite exclusion of granulocytes during PBMC density gradient isolation. PBMC immunophenotyping flow cytometry data were reanalysed for all donors in the larger cohort from which proteomic samples were drawn (Table S1C) 5 . Mature neutrophil contamination was only present in a single sample, indicating an absence of confounding systemic contamination (Figures S1E-F). Notably, the one donor with identified mature neutrophil sample contamination (CV0144, class E) did not exhibit greater upregulation of CEACAM8 or CD177 versus other class E donors, suggesting that upregulation of these markers is neutrophil-independent.

Validation and phenotyping of identified markers by flow-cytometry
To verify candidate markers and to phenotype cell populations expressing these proteins, multicolour flow cytometry was performed on a cohort of 36 donors across the range of disease severity classes (Figures 4A, S2A-B, Table S1D). Corresponding to previous observations, this revealed a progressive and significant decrease in lymphocytes with worsening COVID-19, concurrent with a significant increase in platelet abundance and no overall trend in the myeloid compartment (Figure S2C).
Detailed phenotyping identified further changes associated with disease severity in specific subpopulations, including depletion of CD56+ NK cells, CD3+CD4+ and CD8+ T-cells and increases in both resting CD62P- and activated CD62P+ platelets (Figure 4B), again consistent with previously documented changes in circulating immune cells during acute COVID-19 6-8,17 . A trend towards depletion of CD3+γδTCR+ T-cells and increased abundance of CD16+ non-classical monocytes was also observed, although due to sample limitations it was not possible to collect sufficient events for statistical significance.
(C) Heatmap of biomarker expression across immune cell subsets, expressed as log2 (fold change in geometric mean fluorescence intensity relative to healthy controls). p-values: as Figure 4A.
significantly upregulated on patients with moderate versus mild disease, indicating progressive upregulation of this marker with increasing disease severity. The frequency and intensity of CD177 expression was upregulated in severe disease on both lymphocytes and platelets, CD89 on lymphocytes alone and the intensity of CD63 and CD89 expression was significantly increased on platelets (Figures 4A, S3A).

Discussion
Understanding the complex immunobiology of COVID-19 is essential for developing measures that predict both the severity of acute disease and the development and progression of long COVID. This knowledge will also be vital to understanding the efficacy of novel therapies. Here we present a searchable analysis of the PBMC cellular and plasma membrane proteomes during acute SARS-CoV-2 infection in a cohort of donors spanning the spectrum of COVID-19 disease. Our data indicate a profound shift in PBMC proteome profiles from mild to severe disease, echoing observations made in the whole blood transcriptome 5 and plasma proteome 30 and reflecting the significant remodelling of circulating immune cell composition during COVID-19. Notably, we observed highly significant enrichment of terms related to microbial defence among cellular proteins upregulated during severe disease. Patient metadata indicated that only a small number of patients in classes D and E had confirmed secondary infections, further corroborating the emergence of a sepsis-like state driving COVID-19 immunopathology 17 . In addition, selective upregulation of canonical interferon-stimulated genes (ISGs) such as the IFIT and Mx families was observed in patients with mild disease. Of note, time-normalised transcriptomic analysis of the wider cohort from which our samples derived 5 suggested that PBMC from patients with severe disease that are collected relatively early after symptom onset can also exhibit upregulated ISG expression which subsequently wanes. We observed a similar trend for donors sampled in the earlier phases of infection (Tables S1, S3).
Unbiased profiling of the PBMC plasma membrane proteome identified a unique signature of severe COVID-19, marked by the upregulated expression of a group of proteins with immunoregulatory functions: CEACAM1, 6 and 8, CD177, CD63 and CD89. CD177 is a glycosyl-phosphatidylinositol (GPI)-linked surface glycoprotein that is canonically regarded as a neutrophil marker 31 , although expression on monocytes has also been reported 32 . CD177 has characterised roles in mediating neutrophil endothelial transmigration via binding to PECAM1 33 in addition to activation and degranulation as part of a CD177/PR3-CD11b/CD18 complex 34,35 . CD63 is a ubiquitously expressed member of the tetraspanin membrane protein family involved in cell adhesion and intracellular trafficking 36 . Typically localised to intracellular compartments, cell surface CD63 is used as a marker of platelet 37 and T-cell 38 activation in addition to granulocyte degranulation 39,40 . CD89 is an Fc receptor expressed on neutrophils and monocytes 32 that binds to IgA immune complexes 41 and CRP 42 , initiating cell activation and cytokine release 43 , and has a role in protecting against bacterial sepsis 44 .
CEACAMs 1, 6 and 8 belong to a family of immunoglobulin-like surface glycoproteins that can form homophilic and heterophilic interactions in conjunction with an array of binding partners that participate in a diverse range of processes, including cell adhesion, signalling and immunoregulation 45 .
Interestingly, CD177 levels have previously been associated with COVID-19 severity and ICU admission in an analysis of a French cohort by whole-blood transcriptomics and serum profiling 57 . The authors proposed that increased circulating CD177 reflects the dysregulated neutrophil activation observed in severe COVID-19. We also observed an equivalent progressive and significant upregulation of CD177 in PBMC preparations. Isolation of PBMCs by density gradient centrifugation typically excludes granulocytes and an absence of mature neutrophil contamination was verified by flow cytometry.
Phenotyping indicated the progressive emergence of a CD3+CD4+CD177+ T-cell population as disease severity increased in the context of a broader depletion of both CD4+ and CD8+ T-cells. This population also co-upregulated CEACAMs 1 and 6, CD63, CD89 and likely CEACAM8, although the latter marker exhibited substantial inter-donor variation.
Simultaneous expression of CEACAMs 6 and 8, CD177 and CD89 on CD4+ T-cells is intriguing, as expression of these markers is regarded as restricted to granulocytes and monocytes; their co-expression on CD4+ T-cells represents a unique phenotype that has not been previously reported. CD177 expression may plausibly facilitate migration into critical tissues and cell activation in the context of infection. Upregulation of CD89 is particularly interesting due to its role in bacterial sepsis 44
a CEACAM8+ subpopulation 17 , although this population exhibited low to intermediate CD16 expression 15 . Prior characterisation of PBMC from this cohort by single-cell RNAseq documented the appearance of a rare C1QA/B/C+CD16+ monocyte population in patients with severe disease that is predicted to interact with and contribute to platelet activation 26 , but did not observe emergence of any mature or immature neutrophil population. Our data may therefore suggest the appearance of a CD16+CEACAM1/6/8+ monocyte subset in the context of severe COVID-19. 26 . Taken

Lead Contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Prof. Michael Weekes (mpw1001@cam.ac.uk).

Materials availability
This study did not generate new unique reagents.

Data and code availability
The mass spectrometry raw files and associated unmodified peptide and protein quantitation data have been deposited in the ProteomeXchange Consortium via the iProX partner repository with identifier IPX0005417000 76,77 .

Experimental Model and Subject Details
Human subjects
Table S1.

Clinical data collection
Clinical data were collected from medical charts and entered into spreadsheets. Available laboratory test results were extracted from Epic electronic health records (Cambridge University Hospitals) and from MetaVision ICU (Royal Papworth Hospital ITU). SARS-CoV-2-positive HCW participants were categorised into two groups according to whether they were asymptomatic (group A) or had COVID-19 symptoms at the time of PCR testing (group B). Symptoms considered to be possible manifestations of COVID-19 were new onset fever (>37.8 °C), cough, loss of sense of smell, hoarseness, nasal discharge or congestion, shortness of breath, wheeze, headache, muscle aches, nausea, vomiting and diarrhoea 28 .
Hospital patients were assigned to one of three groups, reflecting the maximum level of respiratory support received during their hospital stay. Group C patients did not receive any supplemental oxygen.
Group D patients received supplemental oxygen using low flow nasal prongs, a simple face mask, a Venturi mask or a non re-breathe face mask. Patients who received any of non-invasive ventilation (NIV), mechanical ventilation or extracorporeal membrane oxygenation (ECMO) were assigned to group E. Patients in group D who died in hospital during the study were also assigned to group E. In patients who were already established on home NIV for chronic respiratory failure, NIV delivered as per the home prescription (e.g. nocturnal) was not considered for the purpose of classification.
Moreover, oxygen requirements that were clearly not related to COVID-19 were also not considered for classification purposes.
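The grouping rules above can be summarised as a short decision function. The following Python sketch is illustrative only: the `Participant` record, its field names and the string labels for respiratory support are hypothetical, and `max_support` is assumed to already exclude home-prescription NIV and oxygen requirements unrelated to COVID-19.

```python
from dataclasses import dataclass

# Respiratory support categories named in the text; the string labels are assumptions.
LOW_FLOW = {"nasal_prongs", "simple_face_mask", "venturi_mask", "non_rebreathe_mask"}
ADVANCED = {"niv", "mechanical_ventilation", "ecmo"}


@dataclass
class Participant:
    hospitalised: bool
    symptomatic: bool = False       # only used for non-hospitalised HCW participants
    max_support: str = "none"       # maximum COVID-19-related respiratory support
    died_in_hospital: bool = False


def severity_group(p: Participant) -> str:
    """Return the severity group (A-E) following the rules described above."""
    if not p.hospitalised:
        return "B" if p.symptomatic else "A"
    if p.max_support in ADVANCED:
        return "E"
    if p.max_support in LOW_FLOW:
        # Group D patients who died in hospital were reassigned to group E.
        return "E" if p.died_in_hospital else "D"
    if p.max_support == "none":
        return "C"
    raise ValueError(f"unrecognised support level: {p.max_support!r}")
```

For example, `severity_group(Participant(hospitalised=True, max_support="venturi_mask"))` gives group D under these assumptions.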

Peripheral blood mononuclear cell preparation and flow immunophenotyping
Each participant provided 27 mL of peripheral venous blood collected into 9 mL sodium citrate tubes.
Peripheral blood mononuclear cells (PBMCs) were isolated using Leucosep tubes (Greiner Bio-One) with Histopaque 1077 (Sigma) by centrifugation at 800 × g for 15 minutes at room temperature.
PBMCs at the interface were collected, rinsed twice with autoMACS running buffer (Miltenyi Biotec) and cryopreserved in FBS with 10% DMSO. All samples were processed within 4 hours of venepuncture.
For proteomic and flow cytometry analysis, frozen PBMC samples were thawed in a water bath at 37 °C and immediately diluted in TexMACS media (Miltenyi Biotec), centrifuged, resuspended in TexMACS supplemented with 10 U/ml DNase (Benzonase, Merck-Millipore) and rested at 37 °C for 1 h. PBMCs were then centrifuged, resuspended in fresh media and counted. For proteomic analysis, PBMCs were washed twice in ice-cold PBS pH 7.4 (Sigma). 75% of cells were used for plasma membrane profiling, while the remaining 25% was used for whole cell lysate proteomic analysis. For flow cytometry, cells were directly processed as described below.

Plasma membrane profiling
Following washing, cellular surface sialic acid residues were oxidised then biotinylated with 1 mM sodium meta-periodate (Thermo), 100 mM aminooxy-biotin (Biotium) and 10 mM aniline (Sigma) in ice-cold PBS pH 6.7, by rocking the cells with 3 mL of the mix at 4 °C for 30 minutes. The reaction was quenched by adding glycerol (Sigma) to a final concentration of 1 mM. The cells were then washed twice with ice-cold PBS pH 7.4 containing CaCl2 and MgCl2, and then lysed with 1.6% Triton X-100 (Thermo), 150 mM NaCl (Sigma), 1 × protease inhibitor (complete, without EDTA (Roche)), 5 mM iodoacetamide (Sigma) and 10 mM Tris-HCl pH 7.6 (Sigma) for 30 minutes at 4 °C. Nuclei and debris were removed by centrifugation at 4 °C, once at 4,000 × g for 5 minutes then twice at 13,000 × g for 5 minutes. Samples were then snap-frozen in liquid nitrogen and stored at -80 °C prior to immunoprecipitation and protein digestion.
(Invitrogen). Beads were next incubated with PBS / 0.5% SDS / 100 mM dithiothreitol (DTT) for 20 minutes at room temperature. Further washes were performed with UC buffer (6 M urea in 0.1 M Tris-HCl pH 7.6), before alkylation with UC buffer containing 50 mM iodoacetamide for 20 minutes at room temperature in the dark. Beads were washed again with UC buffer and HPLC-grade H2O, transferred to a screw cap column (Pierce) and then proteins were digested on-bead with 35 µl of 8 ng/µl trypsin (Thermo) in 200 mM HEPES pH 8.5 (Sigma) for 3 h in a shaking 37 °C incubator. The digested peptides were eluted and stored at -80 °C before TMT labelling.

Whole cell lysate digestion
After washing, cells were lysed in 50 µl of 6 M guanidine (Thermo) / 50 mM HEPES pH 8.5, vortexed extensively and sonicated. Cell debris was removed by centrifuging twice at 13,000 × g for 10 minutes at 4 °C.
DTT was added to a final concentration of 5 mM and samples were incubated at room temperature for 20 mins.
Cysteine residues were alkylated with 15 mM iodoacetamide and incubated for 20 min at room temperature in the dark. Excess iodoacetamide was quenched with DTT for 15 mins. Samples were diluted with 200 mM HEPES pH 8.5 to a final concentration of 1.5 M guanidine, followed by digestion at room temperature for 3 h with LysC protease (Wako) at a 1:100 protease-to-protein ratio. Samples were further diluted with 200 mM HEPES pH 8.5 to a concentration of 0.5 M guanidine. Trypsin was then added at a 1:100 protease-to-protein ratio followed by overnight incubation at 37 °C. The reaction was quenched with 5% formic acid and centrifuged at 21,000 × g for 10 min to remove undigested protein. Peptides were subjected to C18 solid-phase extraction (SPE, Sep-Pak, Waters) and vacuum-centrifuged to near-dryness.
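The two guanidine dilution steps follow the usual relation C1·V1 = C2·V2. As a rough worked example, the sketch below computes the volume of 200 mM HEPES to add at each step, starting from the 50 µl lysate stated above. It assumes, for simplicity, that the small volumes of DTT, iodoacetamide and protease added in between are negligible; the function name is hypothetical.

```python
def diluent_volume(current_volume_ul: float, current_molar: float, target_molar: float) -> float:
    """Volume of diluent (µl) to add so that C1 * V1 = C2 * V2 holds."""
    if not 0 < target_molar <= current_molar:
        raise ValueError("target concentration must be positive and not exceed the current one")
    final_volume = current_volume_ul * current_molar / target_molar
    return final_volume - current_volume_ul


lysate_ul = 50.0                                             # 50 µl lysis buffer, from the text
add_for_lysc = diluent_volume(lysate_ul, 6.0, 1.5)           # 6 M -> 1.5 M guanidine
after_lysc_ul = lysate_ul + add_for_lysc
add_for_trypsin = diluent_volume(after_lysc_ul, 1.5, 0.5)    # 1.5 M -> 0.5 M guanidine
```

Under these assumptions, roughly 150 µl of buffer is added before LysC digestion (200 µl total) and a further 400 µl before trypsin digestion (600 µl total).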

Offline HpRp fractionation
TMT-labelled tryptic peptides derived from WCL samples were fractionated using an Ultimate 3000 RSLC UHPLC system (Thermo) equipped with a 2.1 mm internal diameter (ID) x 25 cm long, 1.

LC-MS3
Mass spectrometry data were acquired using an Orbitrap Lumos (Thermo Fisher Scientific, San Jose, CA). An Ultimate 3000 RSLC nano UHPLC equipped with a 300 µm ID x 5 mm Acclaim PepMap µ-Precolumn (Thermo) and a 75 µm ID x 50 cm 2.1 µm particle Acclaim PepMap RSLC analytical column was used. Loading solvent was 0.1% formic acid (FA), analytical solvent A: 0.1% FA and B: 80% MeCN + 0.1% FA. All separations were carried out at 55°C. Samples were loaded at 5 µl/min for 5 min in loading solvent before beginning the analytical gradient. The following gradient was used: 3-7% B over 3 min, 7-37% B over 173 min, followed by a 4 min wash at 95% B and equilibration at 3% B for 15 min.
Table) for 30 minutes at 4 °C, before washing in an excess of PBS, centrifugation and resuspension in 100 µl FluoroFix Buffer (BioLegend). Single colour compensation controls were prepared for the panel using AbC Compensation beads (Thermo) and ArC Compensation beads (Thermo) for antibody stains and amine-reactive viability stains respectively, or healthy control PBMCs. Samples were then analysed on a Cytek Aurora flow cytometer (Cytek Biosciences) and data analysed in FlowJo (BD).

Data Analysis
Mass spectra were processed using a Sequest-based software pipeline for quantitative proteomics, "MassPike", through a collaborative arrangement with Professor Steven Gygi's laboratory at Harvard Medical School. MS spectra were converted to mzXML using an extractor built upon Thermo Fisher's RAW File Reader library (version 4.0.26). In this extractor, the standard mzXML format has been augmented with additional custom fields that are specific to ion trap and Orbitrap mass spectrometry and essential for TMT quantitation. These additional fields include ion injection times for each scan,
tolerance. Fragment ion tolerance was set to 1.0 Th. TMT tags on lysine residues and peptide N termini and carbamidomethylation of cysteine residues (57.02146 Da) were set as static modifications, while oxidation of methionine residues (15.99492 Da) was set as a variable modification.
To control the fraction of erroneous protein identifications, a target-decoy strategy was employed 80 .
Peptide spectral matches (PSMs) were filtered to an initial peptide-level false discovery rate (FDR) of 1% with subsequent filtering to attain a final protein-level FDR of 1%. PSM filtering was performed using a linear discriminant analysis, as described previously 80 . This distinguishes correct from incorrect peptide IDs in a manner analogous to the widely used Percolator algorithm (https://noble.gs.washington.edu/proj/percolator/), though employing a distinct machine-learning algorithm. The following parameters were considered: XCorr, DCn, missed cleavages, peptide length, charge state, and precursor mass accuracy. Protein assembly was guided by principles of parsimony to produce the smallest set of proteins necessary to account for all observed peptides (algorithm described in 80 ).
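The target-decoy principle can be illustrated with a minimal Python sketch. This is not the MassPike implementation: the pipeline described above ranks PSMs with a linear discriminant score built from several parameters, whereas the sketch assumes each PSM already carries a single score (higher is better) and a flag marking decoy hits. The FDR at a given cutoff is estimated as the number of decoys divided by the number of targets at or above that cutoff.

```python
from typing import List, Tuple

PSM = Tuple[float, bool]  # (score, is_decoy); higher score = better match


def filter_to_fdr(psms: List[PSM], max_fdr: float = 0.01) -> List[PSM]:
    """Keep the largest top-scoring set of PSMs whose estimated FDR is <= max_fdr.

    FDR is estimated as (#decoys / #targets) among PSMs at or above a score cutoff.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    keep = 0  # number of ranked PSMs to retain
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            keep = i
    # Report only target hits; decoys exist solely to estimate the error rate.
    return [p for p in ranked[:keep] if not p[1]]
```

The same idea is applied a second time at the protein level to reach the final 1% protein FDR; that step is not shown here.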
Proteins were quantified by summing TMT reporter ion counts across all matching peptide-spectral matches using "MassPike", as described previously 81 . Briefly, a 0.003 Th window around the theoretical m/z of each reporter ion was scanned for ions and the maximum intensity nearest to the theoretical m/z was used. The primary determinant of quantitation quality is the number of TMT reporter ions detected in each MS3 spectrum, which is directly proportional to the signal-to-noise (S:N) ratio observed for each ion. Conservatively, every individual peptide used for quantitation was required to contribute sufficient TMT reporter ions so that each on its own could be expected to provide a representative picture of relative protein abundance 81 . A per-sample S:N ratio of >15 was required such that, for example, for a 16-plex experiment a combined S:N ratio of >240 across all TMT reporter ions would be needed for a peptide to pass filtering. An isolation specificity filter with a cutoff of 50% was additionally employed to minimise peptide co-isolation 81 . Peptides meeting the stated criteria for reliable quantitation were then summed by parent protein, in effect weighting the contributions of individual peptides to the total protein signal based on their individual TMT reporter ion yields. Protein quantitation values were exported for further analysis in Excel.
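The filtering and summation logic can be sketched as follows. The record type and function name are hypothetical, and the sketch assumes that the S:N threshold is applied to the combined signal across channels (15 × number of channels, i.e. >240 for a 16-plex) and that peptides with an isolation specificity of at least 50% pass; whether that cutoff is inclusive is not stated in the text.

```python
from typing import Dict, List, NamedTuple


class PeptideQuant(NamedTuple):
    protein: str
    reporter_sn: List[float]      # S:N per TMT channel
    isolation_specificity: float  # fraction of precursor signal from the target ion


def quantify_proteins(peptides: List[PeptideQuant], min_sn_per_channel: float = 15.0,
                      min_isolation_specificity: float = 0.5) -> Dict[str, List[float]]:
    """Sum reporter S:N by protein over peptides that pass both quality filters."""
    totals: Dict[str, List[float]] = {}
    for pep in peptides:
        n_channels = len(pep.reporter_sn)
        # e.g. 16 channels x 15 = 240 combined S:N, as in the text
        if sum(pep.reporter_sn) <= min_sn_per_channel * n_channels:
            continue
        if pep.isolation_specificity < min_isolation_specificity:
            continue
        if pep.protein not in totals:
            totals[pep.protein] = [0.0] * n_channels
        totals[pep.protein] = [a + b for a, b in zip(totals[pep.protein], pep.reporter_sn)]
    return totals
```

Because raw reporter signals are summed rather than averaged, peptides with higher reporter ion yields contribute proportionally more to the protein total, which is the weighting described above.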
For protein quantitation, reverse and contaminant proteins were removed, then each reporter ion channel was summed across all quantified proteins and normalised assuming equal protein loading across all channels. Missing values in the mass spectrometry data were imputed for a small number of donor-protein datapoints (11 datapoints in total for all WCL analyses, two datapoints in total for both PM analyses) by setting missing values to the minimum intensity observed for the protein within each multiplexed experiment. Data for all HLA isoforms were removed, due to variation in HLA allele expression between donors. Proteins originating from red blood cell (RBC) contaminants were removed if they met two criteria: (1) they were identified as one of the top 15% most abundant RBC proteins from Ravenhill et al 82 and (2) they had a coefficient of variation greater than 0.75 in any given disease severity class when comparing protein abundance fold-change versus healthy control across donors.
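The normalisation and imputation steps described above can be sketched in a few lines of Python. The function names and the dictionary layout (protein mapped to one intensity per TMT channel within a single multiplexed experiment) are assumptions made for illustration, as is the choice to scale every channel to the mean channel sum; the order in which the two steps were applied is not specified in the text.

```python
from typing import Dict, List, Optional


def normalise_channels(data: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Scale each channel so that all channel sums are equal (equal-loading assumption)."""
    n = len(next(iter(data.values())))
    sums = [sum(row[c] for row in data.values()) for c in range(n)]
    target = sum(sums) / n  # scale every channel to the mean channel sum
    return {prot: [v * target / sums[c] for c, v in enumerate(row)]
            for prot, row in data.items()}


def impute_minimum(data: Dict[str, List[Optional[float]]]) -> Dict[str, List[float]]:
    """Replace missing values (None) with the protein's minimum observed intensity."""
    out: Dict[str, List[float]] = {}
    for prot, row in data.items():
        observed = [v for v in row if v is not None]
        if not observed:
            raise ValueError(f"{prot} has no observed values to impute from")
        floor = min(observed)
        out[prot] = [floor if v is None else v for v in row]
    return out
```

Imputing with the within-experiment minimum is a conservative choice: it treats a missing datapoint as low abundance rather than borrowing information from other donors.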
Hierarchical centroid clustering of proteomic data was carried out in Cluster 3.0 (Stanford University) using an uncentred Pearson correlation similarity metric.
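For reference, the uncentred Pearson correlation differs from the standard coefficient in that the means are not subtracted, so it equals the cosine of the angle between two profiles. The sketch below is a plain Python illustration of that similarity metric, not code from Cluster 3.0; the function name is hypothetical.

```python
import math
from typing import Sequence


def uncentred_pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Uncentred correlation: like Pearson's r but without subtracting the means."""
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if norm == 0:
        raise ValueError("correlation is undefined for an all-zero profile")
    return dot / norm
```

Two profiles with the same shape but a constant offset, such as (1, 2, 3) and (2, 3, 4), have a standard Pearson correlation of 1 but an uncentred correlation slightly below 1, so the metric is sensitive to the overall magnitude of change as well as its pattern.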

Statistical Analysis
For displayed proteins or cell populations, ordinary one-way ANOVA tests with Tukey's adjustment for multiple comparisons were carried out in GraphPad Prism 9 on log2-transformed fold-change values for each donor versus healthy controls (Figs. 2A, 3A, 4C) or on untransformed cell population proportion values (Fig. 4A, 4B, S3A). Adjusted p-values <0.05 were considered significant. Table S1: one- and three-way ANOVAs with Tukey's multiple comparisons post-hoc test for WCL MS data or Kruskal-Wallis tests for PM MS data were carried out in Perseus.
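To make the quantities entering these tests concrete, the following Python sketch computes log2 fold-changes relative to the healthy control mean and the F statistic of a one-way ANOVA across severity classes. It is an illustration only, not the GraphPad Prism or Perseus procedure: the function names are hypothetical, and neither the p-value (which requires the F distribution) nor Tukey's post-hoc adjustment is reproduced.

```python
import math
from typing import Dict, List


def log2_fold_changes(donor_values: List[float], control_values: List[float]) -> List[float]:
    """log2 of each donor's abundance relative to the mean of healthy controls."""
    control_mean = sum(control_values) / len(control_values)
    return [math.log2(v / control_mean) for v in donor_values]


def one_way_anova_f(groups: Dict[str, List[float]]) -> float:
    """F statistic for a one-way ANOVA: between-group over within-group mean square."""
    values = [v for g in groups.values() for v in g]
    grand_mean = sum(values) / len(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

A large F value indicates that differences between class means are large relative to donor-to-donor variation within classes.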

Pathway Analysis
The Database for Annotation, Visualization and Integrated Discovery (DAVID) was used to identify enrichment of pathways in upregulated gene clusters as specified in the text. In each case, the cluster was searched against the background of all proteins quantified in our proteomics data using default settings.
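The principle behind such an enrichment test, and the role of the background, can be illustrated with a plain hypergeometric test followed by Benjamini-Hochberg correction. This sketch is an assumption-laden stand-in: DAVID reports its own, more conservative EASE score, which is not reproduced here, and the function names are hypothetical.

```python
from math import comb
from typing import List


def hypergeom_enrichment_p(k: int, cluster_size: int, annotated: int, background_size: int) -> float:
    """P(X >= k): chance of seeing at least k annotated proteins in a cluster drawn
    at random from a background containing `annotated` proteins with the term."""
    upper = min(cluster_size, annotated)
    total = comb(background_size, cluster_size)
    tail = sum(comb(annotated, i) * comb(background_size - annotated, cluster_size - i)
               for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Using all quantified proteins as the background, rather than the whole proteome, avoids reporting terms that are enriched merely because the corresponding proteins are readily detected in PBMCs.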

Acknowledgments
We are grateful to Prof. Steve Gygi for providing access to the "MassPike" software pipeline for quantitative proteomics. We thank NIHR BioResource volunteers for their participation, and gratefully acknowledge NIHR BioResource centres, NHS Trusts and staff for their contribution.

Supplemental tables
Table S1 (A) Details of donors used for proteomic analysis (B) Details of donors used to generate whole-blood RNA-seq data in Bergamaschi et al 5