Abstract
Severe coronavirus disease 2019 (COVID-19) has been associated with certain preexisting health conditions and can cause respiratory failure along with other multi-organ injuries. However, the mechanism of these relationships is unclear, and prognostic biomarkers for the disease and its systemic complications are lacking. This study aims to examine the plasma protein profile of COVID-19 patients and evaluate overlapping protein modules with biomarkers of common comorbidities.
Blood samples were collected from COVID-19 cases (n=306) and negative controls (n=78) among patients with acute respiratory distress. Proteins were measured by proximity extension assay utilizing next-generation sequencing technology. Their associations with COVID-19 disease characteristics were compared with those of preexisting conditions and established biomarkers for myocardial infarction (MI), stroke, hypertension, diabetes, smoking, and chronic kidney disease.
Several proteins were differentially expressed in COVID-19, including multiple pro-inflammatory cytokines such as IFN-γ, CXCL10, and CCL7/MCP-3. Elevated IL-6 was associated with increased severity, while baseline IL1RL1/ST2 levels were associated with a worse prognosis. Network analysis identified several protein modules associated with COVID-19 disease characteristics overlapping with processes of preexisting hypertension and impaired kidney function. BNP and NT-proBNP, markers for MI and stroke, increased with disease progression and were positively associated with severity. MMP12 was similarly elevated and has been previously linked to smoking and inflammation in emphysema, along with increased cardiovascular disease risk.
In conclusion, this study provides an overview of the systemic effects of COVID-19 and candidate biomarkers for clinical assessment of disease progression and the risk of systemic complications.
Introduction
The coronavirus disease 2019 (COVID-19) pandemic is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV2), and since its identification in late 2019, the virus has spread worldwide with over 30 million reported cases1. Clinical presentation can range from mild flu-like symptoms, including fever, cough, and shortness of breath, to acute respiratory distress syndrome (ARDS) requiring mechanical ventilation2-4. Disease severity and mortality rate have been associated with certain preexisting conditions such as a history of cardiovascular disease, hypertension, diabetes, and obesity, all of which correspond with a worse prognosis4-7. These conditions often accompany older age and likely explain the higher mortality rate observed among the elderly population4. Recent genetic studies indicate the potential protective effect of specific blood antigens and possibly polymorphisms within the ACE2 gene8,9, the primary cell entry receptor for SARS-CoV210.
Although clinical manifestations are mainly respiratory, early clinical reports and extrapolation from similar coronaviruses11 (e.g., SARS-CoV1 and MERS-CoV) have detailed the systemic effects of COVID-19, including acute cardiac injury, heart failure, arrhythmia, gastrointestinal distress, impaired liver function, and acute kidney injury4,7,11-13. Thromboembolic complications are common among patients with preexisting cardiac and cerebrovascular diseases, which is likely related to the systemic inflammation and pro-coagulatory conditions from COVID-19 infection14,15. Early surveillance studies have also reported neurological manifestations, including altered mental status and impaired consciousness along with fatigue, pain, and sensory disturbances (e.g., anosmia, dysgeusia) post-recovery16,17; however, the long-term complications of COVID-19 remain unclear.
While the clinical characteristics of COVID-19 are continually refined in real-time, more efficient tools, particularly prognostic markers, are needed to evaluate disease progression for targeted intervention strategies and to better understand the overlapping systemic pathology between SARS-CoV2 infection and comorbidities. Using high-sensitivity proximity extension technology, this study examines the blood proteome of COVID-19 patients for protein markers associated with early infection and disease prognosis and compares them with known biomarkers of common preexisting conditions and related complications.
Methods
Adult patients (n=384) presenting with acute respiratory distress were investigated at the Massachusetts General Hospital (Boston, USA), of whom 306 tested positive for COVID-19 while 78 remained as negative controls18. The descriptive statistics of the cohort are provided in Table 1. Longitudinal blood sampling was conducted for cases at 3, 7, and 28 days from baseline, if possible. Preexisting conditions including heart (e.g., coronary artery disease, congestive heart failure, valvular disease), lung (e.g., asthma, chronic obstructive pulmonary disease, regular O2 use), and kidney (e.g., chronic kidney disease, baseline creatinine >1.5) disease were recorded along with any history of diabetes, hypertension, and any immunocompromising conditions. Obesity was defined as a body-mass index (BMI) of ≥30 kg/m². The patient’s condition was assessed using a 6-point ordinal scale (1=death; 2=intubation and mechanical ventilation; 3=non-invasive ventilation or high-flow oxygen; 4=hospitalized with supplementary oxygen; 5=hospitalized without supplementary oxygen; 6=not hospitalized) based on World Health Organization guidelines19.
Proteins were measured in plasma using a proximity extension assay (PEA), a high-sensitivity multiplex immunoassay that utilizes paired oligonucleotide antibody probes for protein identification followed by quantification using qPCR or next-generation sequencing (NGS) technology20. In this study, samples were analyzed with the NGS-based Olink Explore product consisting of four 384-plex panels of 1536 assay probes, including 48 controls and three inter-panel quality control markers (IL-6, IL-8, and TNF)21. The relative concentration for each protein was quantified as log base-two normalized protein expression (NPX) levels. Internal assay controls were used to quality check each step of the assay (i.e., incubation, extension, and amplification), and sample measures with high variability were excluded. Additional details regarding the method have been described elsewhere20,21.
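Because NPX is reported on a log base-two scale, a difference in NPX between two samples or groups corresponds to a linear fold change of two raised to that difference. The short Python sketch below illustrates this conversion only; the function name and the example values are hypothetical and are not taken from the study data.

```python
def npx_difference_to_fold_change(delta_npx: float) -> float:
    """Convert a difference in log2-scaled NPX into a linear fold change."""
    return 2.0 ** delta_npx


# Hypothetical illustration: a 2-NPX increase is a four-fold increase,
# and a 1-NPX decrease is a halving of the relative concentration.
four_fold = npx_difference_to_fold_change(2.0)
halved = npx_difference_to_fold_change(-1.0)
```

On this scale, the fold changes quoted in the Results can be read as NPX differences in the opposite direction (for example, a four-fold increase is a difference of 2 NPX).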
In summary, 1420 unique proteins were analyzed, with the majority having a call-rate (i.e., measurable levels above the limit of detection) above 80% (Supplementary Table S1). Proteins with a call-rate below 25% (n=160) were excluded, while those between 25-80% (n=229) remained in the analysis but were interpreted with caution. In total, 1260 proteins passed quality control for the analysis.
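The call-rate filter described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: it assumes a single limit of detection (LOD) per protein, treats missing measurements as not detected, and assigns a call-rate of exactly 80% to the "caution" group, none of which is specified in the text. The function names and example values are hypothetical.

```python
from typing import List, Optional


def call_rate(values: List[Optional[float]], lod: float) -> float:
    """Fraction of samples with a measurement above the limit of detection.

    Assumption: missing values (None) count as not detected.
    """
    if not values:
        return 0.0
    detected = sum(1 for v in values if v is not None and v > lod)
    return detected / len(values)


def classify_protein(rate: float) -> str:
    """Apply the 25% and 80% call-rate cut-offs described in the text."""
    if rate < 0.25:
        return "exclude"   # below 25%: removed from the analysis
    if rate <= 0.80:
        return "caution"   # 25-80%: kept but interpreted with caution
    return "pass"          # above 80%


# Hypothetical protein measured in five samples with an LOD of 0.5 NPX.
example_rate = call_rate([1.2, 0.4, 2.0, None, 3.1], lod=0.5)
example_class = classify_protein(example_rate)
```

In this example three of the five samples exceed the LOD, so the protein would be retained but flagged for cautious interpretation.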
Differences in protein levels between COVID-19 positive patients and negative controls were analyzed using a multivariable linear regression model adjusting for age and preexisting conditions (Table 1). Longitudinal changes in protein concentrations were analyzed using a paired Student t-test. Preanalytical variation associated with sample quality was assessed and corrected using previously defined markers of sample handling (e.g., CD40L)22. Associations with severity and prognosis were assessed using the baseline severity score and the maximum severity score reached within the 28-day period, respectively. Scores for severity and prognosis were dichotomized based on the usage of mechanical ventilation or death (severe: 1-2; non-severe: 3-6; WHO score). Significance after Bonferroni correction for multiple testing was set at P<10−5. All statistical analyses were conducted using R v.4.0.2 (Vienna, Austria).
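Three of the steps above (dichotomizing the WHO score, deriving a Bonferroni threshold, and computing a paired t statistic) can be illustrated with a short Python sketch. The analyses in the study were run in R; this block is only an illustration, and the function names, the family-wise alpha of 0.05, the use of 1260 as the number of tests, and the example values are all assumptions.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def is_severe(who_score: int) -> bool:
    """Severe = death (1) or intubation and mechanical ventilation (2)."""
    if who_score not in range(1, 7):
        raise ValueError("WHO score must be an integer from 1 to 6")
    return who_score <= 2


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def paired_t_statistic(baseline: Sequence[float],
                       follow_up: Sequence[float]) -> float:
    """t statistic of a paired Student t-test (follow-up minus baseline).

    Raises ZeroDivisionError if all paired differences are identical.
    """
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need at least two matched pairs")
    diffs = [f - b for b, f in zip(baseline, follow_up)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Assumed inputs: alpha = 0.05 across the 1260 proteins passing quality control.
threshold = bonferroni_threshold(0.05, 1260)

# Hypothetical NPX values for one protein in three patients.
t_stat = paired_t_statistic([1.0, 2.0, 3.0], [2.0, 3.5, 4.0])
```

Under these assumed inputs the Bonferroni threshold is approximately 4×10−5, so the cut-off of P<10−5 stated above is somewhat more conservative.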
Modules or clusters of proteins were identified by weighted correlation network analysis (WGCNA) using a Pearson-based weighted adjacency matrix (signed, β=14) and average linkage hierarchical clustering23. Modules were evaluated for enriched biological processes24 and used to compare study associations with the differential protein profiles of other diseases including myocardial infarction (MI)25-27, cardiovascular-related death (CVD) / heart failure (HF)27, stroke27,28, hypertension29, atherosclerosis30, diabetes31, smoking32, kidney function33, and chronic kidney disease (CKD)33.
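For readers unfamiliar with WGCNA, the signed weighted adjacency can be sketched as below. The text specifies only a Pearson-based, signed network with β=14; the formula used here, a_ij = ((1 + r_ij)/2)^β, is the usual signed-network definition in the WGCNA literature and is assumed rather than confirmed to be the exact variant applied. The function names and toy profiles are hypothetical, and the subsequent hierarchical clustering step is not shown.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; raises ZeroDivisionError for a constant profile."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def signed_adjacency(profiles: List[Sequence[float]],
                     beta: int = 14) -> List[List[float]]:
    """Signed weighted adjacency: a_ij = ((1 + r_ij) / 2) ** beta."""
    n = len(profiles)
    adjacency = [[1.0] * n for _ in range(n)]  # diagonal stays at 1
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(profiles[i], profiles[j])
            weight = ((1.0 + r) / 2.0) ** beta
            adjacency[i][j] = adjacency[j][i] = weight
    return adjacency


# Toy example: protein B tracks protein A, protein C mirrors it.
toy = [
    [1.0, 2.0, 3.0, 4.0],  # A
    [2.0, 4.0, 6.0, 8.0],  # B, perfectly correlated with A
    [4.0, 3.0, 2.0, 1.0],  # C, perfectly anti-correlated with A
]
adj = signed_adjacency(toy)
```

In a signed network, strongly anti-correlated proteins receive a weight near zero, so over- and under-expressed proteins tend to fall into separate modules; the high exponent further suppresses weak correlations.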
Results
Many proteins were differentially expressed among COVID-19 positive patients compared to negative controls, as shown in Figure 1. Inflammatory cytokines such as CXCL10, CXCL11, and IFN-γ increased four-fold at initial sampling but then decreased during the follow-up period. In contrast, lower levels were detected for CDON, ROR1, and BOC, but these likewise stabilized and regressed within the first week. However, several proteins, including ITGA11, continued to decrease over time. A delayed effect was observed with SDC1, PTN, and SFRP1, which were not associated initially but increased gradually as the disease progressed. Similarly, levels of ACE2 increased more than two-fold within the first week but were not elevated at baseline (β=−0.06, P=0.65).
Findings were not significantly affected after correcting for sample handling, although there was a minor improvement in the variability (Supplementary Tables S2 and S3). The majority remained significant after correcting for preexisting conditions and obesity. However, a few proteins, including IL1R2, a protein upregulated in the adipose tissue of obese patients34, were only associated after stratification by BMI (normal: β<0.001, P=0.99; obese: β=−0.54, P=3.3×10−7; Supplementary Table S4).
Biomarkers associated with disease severity, as defined by mechanical ventilator use, are illustrated in Figure 2 and compared to the disease-associated markers identified in Figure 1. Overlapping markers, including EZR, NADK, and KRT19, may be useful biomarkers for diagnosing infection and monitoring disease severity. IL-6, which was only slightly elevated among cases (β=0.60, P=0.02), was significantly correlated with severity. However, the majority of proteins associated with severity were different from those associated with the disease. Measures such as DDAH1 and NPM1 were also associated with severity among COVID-19 negative controls, indicating that certain proteins lack specificity for COVID-19.
Baseline levels of DCN, S100P, and the cardiac biomarker IL1RL1/ST2 were associated with maximum-observed severity (i.e., death or requiring ventilation) within the 28 days, even after correcting for baseline severity (Supplementary Table S7). Plasma levels of S100P were lower among cases compared to controls (β=−0.48, P=0.0004), but levels of DCN and IL1RL1 were not affected at baseline. However, IL1RL1 was associated with baseline severity (β=0.79, P=4.7×10−8). Chemokine CXCL10 was also correlated with baseline severity and prognosis (β=0.63, P=0.0003; β=0.90, P=0.0006).
As most cases were over 50 years old, many had preexisting conditions such as diabetes and hypertension (Table 1). COVID-19 positive patients were more likely to be obese but had unexpectedly lower rates of lung disease and diabetes, although this may be due to selection bias among controls. Associations between baseline protein levels and preexisting conditions among cases are shown in Figure 3. Leptin (LEP), a metabolism-regulating hormone primarily secreted by adipose tissue34, was the primary protein associated with obesity but was only slightly increased among cases (β=0.681, P=0.0007). The cardiac biomarkers NT-proBNP and its active form NPPB/BNP were negatively associated with obesity but positively associated with preexisting heart disease. Both were under-expressed in COVID-19 at baseline (β<−1.19, P<2×10−7) but increased significantly during follow-up (β>1.1, P<3×10−5). BMI significantly modified the baseline association for both cardiac biomarkers (normal: β>−0.71, P>0.20; obese: β<−1.82, P<2×10−8). FGF19 was elevated in patients with hypertension and possibly those with preexisting kidney disease (unadjusted: β=0.73, P=2×10−5; adjusted: β=0.77, P=0.08).
To determine if specific biological processes overlap between COVID-19 disease and related comorbidities, modules of intercorrelated proteins were identified and used to cross-examine associations between disease characteristics, preexisting conditions, and relevant biomarkers established in previous studies. As illustrated in Figure 4, several identified modules correspond to over- (green) and under- (turquoise/black) expressed proteins among COVID-19 patients. As previously illustrated, these tend to be different from proteins associated with disease progression (red/turquoise) and severity (yellow). There was a significant overlap in markers associated with preexisting kidney disease and hypertension (turquoise), which were responsive to COVID-19 infection and increased with disease progression. Many have been previously established as markers of reduced kidney function, based on estimated glomerular filtration rate (eGFR), and increased risk of developing chronic kidney disease.
NPPB and NT-proBNP (left-end turquoise) were associated with myocardial infarction, heart failure, and ischemic stroke. A moderately correlated protein, MMP12, was similarly elevated in both myocardial infarction and stroke. MMP12 was also higher among smokers, being associated with inflammation in emphysema; therefore, it may be a shared proinflammatory mediator between COVID-19 and severity-related comorbidities. The protein module associated with disease severity (yellow) overlapped with biomarkers for increased myocardial infarction risk such as AGER, CTSL, PARP1, and SOD2. Many severity-related measures were also associated with preexisting lung disease, although in the opposite direction. Proteins of the last module (black), which includes ITGA11, continued to decrease throughout the 28-day observation period. Considering its moderate overlap with obesity and diabetes, the module may be related to long-term metabolic dysfunctions.
Discussion
This study provides a broad overview of the systemic effects of COVID-19 measured through blood, a natural sink for multiple organ systems and an easily accessible medium for clinical investigation. Findings indicate a significant disruption in the circulating proteome of infected patients, impacting multiple biological processes relating to pulmonary inflammation along with cardiac injury and renal dysfunction. These findings support the observation of multi-organ complications reported in previous clinical studies4,7,36.
As expected, many pro-inflammatory cytokines were elevated during the early stages of the disease and may influence overall severity, as shown in previous studies3,4. The hyperactive immune response against SARS-CoV2 infection, often referred to as a cytokine storm, has been hypothesized to be a cause of disease mortality37,38. IFN-γ-induced chemoattractant CXCL10 was one of the primary cytokines elevated in cases and was a suggestive marker for disease severity. Previous studies have also shown increased levels of CXCL10 along with CCL7 (MCP-3) in COVID-19, and CXCL10 was a suggested biomarker for ARDS with protein DDX5839,40. As IL-6 blockade has been effective for managing cytokine release syndrome, IL-6 has also been investigated as a therapeutic target for COVID-1937. Although not notably elevated among cases in this study, higher levels of IL-6 were associated with increased severity. These findings further support the potential benefits of such treatment. As IL-6 is also a marker of myocardial infarction and smoking, the efficacy of treatment targeting related disease complications may be modulated by active cigarette smoking or preexisting cardiovascular disease25,27.
Several cardiac biomarkers were influenced by COVID-19, including the hormone brain natriuretic peptide (BNP/NPPB) and its inactive form, N-terminal pro–BNP (NT-proBNP). Both are known predictors of acute cardiac injury and heart failure27,41 and have been proposed as measures for increased mortality among COVID-19 patients42. A retrospective examination of deceased COVID-19 patients has also shown elevated levels of circulating NT-proBNP during hospitalization4. Therefore, BNP-related measures may be useful for monitoring cardiac stress and the risk of thromboembolic complications, particularly among those with obesity or preexisting heart conditions. Another biomarker, IL1RL1, also known as ST2, is associated with cardiac remodeling, and soluble ST2 has been a marker of acute myocardial infarction43-45. Elevated levels of ST2 were associated with an increased mortality rate and may be an additional complementary marker of cardiac complications for COVID-1946.
Other proteins may also indicate cardiac-related injury, including CDON, which, along with its coexpressed partner BOC, was lower among cases. CDON deficiency in mice has been associated with cardiac remodeling and fibrosis through hyperactivation of the Wnt/β-catenin pathway and may indicate an increased risk of cardiac injury and heart failure in patients47. CDON levels were also negatively correlated with severity and, to a lesser extent, disease prognosis. On the other hand, MMP12, which is elevated in association with cigarette smoking, may be a mediator of pulmonary inflammation32,48 in COVID-19, preexisting lung conditions, and cardiovascular-related comorbidities.
As the primary entry receptor for SARS-CoV2, angiotensin-converting enzyme 2 (ACE2), an essential moderator of blood pressure, has often been examined in the relationship between COVID-19 and cardiovascular complications. Studies have suggested that coronavirus infection can affect the expression of ACE2 pathways in the heart and increase cardiac complication risk associated with localized inflammation49. ACE inhibitors frequently used to manage hypertension have been associated with increased risk of kidney injury among COVID-19 patients50. Furthermore, increased levels of soluble ACE2, as seen among cases in this study, may offer some protection against SARS-CoV2 infection by inhibiting receptor-binding activity, although this requires further clinical investigation51.
Studies have also shown ACE2 expression in kidneys, which may indicate a direct relationship between SARS-CoV2 infection and renal complications52. Systemic release of pro-inflammatory cytokines in ARDS may increase the risk of acute kidney injury, while their resulting accumulation from reduced renal function may, in turn, exacerbate ARDS53. In this study, several proteins were associated with both COVID-19 and kidney dysfunction and may be possible biomarkers for monitoring acute injury, particularly among those with hypertension.
The long-term complications of COVID-19 could not be directly assessed in this study due to the limited follow-up time. However, previous studies have reported lasting changes in metabolic and sensory functions, particularly among severe cases17,35. Although levels of proteins like ITGA11 seemed to show long-term disruptions from disease, proper longitudinal studies will be required to investigate their trajectories. Although we focused on circulating proteins in this study, other media such as sputum may provide more localized measures for COVID-1948. Similarly, urine and cerebrospinal fluid may be better for assessing renal and neurological complications. However, blood remains the single most comprehensive source for assessing systemic effects, and the sensitivity of proximity extension technology allows the detection of trace proteins from multiple organ systems. Unfortunately, previous studies have been limited to targeting specific aspects of the blood proteome, which significantly limits the resolution for comparing overlapping biological processes. However, the recent incorporation of NGS readout in PEA will likely provide a more comprehensive disease proteomic profile in future studies.
In summary, this study provides an initial assessment of the overlapping biological processes associated with COVID-19 and related comorbidities. Cardiac biomarkers NT-proBNP, BNP, and ST2 may be useful for monitoring and assessing the risk of cardiac and cerebrovascular complications, while specific measures of inflammation such as CXCL10, IL-6, and MMP12 may be useful for identifying patient groups responsive to immunosuppressive treatments. However, further investigations are required to validate the efficacy of these potential biomarkers in clinical settings.
Data Availability
Study data are available upon request as of October 2, 2020.
Conflicts of interest
JH has no conflicts of interest to disclose.
Acknowledgments
The author would like to thank the principal investigators of the study (Michael R. Filbin, Alexandra-Chloe Villani, Nir Hacohen, Marcia Goldberg) for providing the accessible clinical and proteomic data, which were processed in collaboration with Olink. Additional thanks to the collection team (Kyle Kays, Kendall Lavin-Parsons, Blair Parry, Brendan Lilley, Carl Lodenstein, Brenna McKaig, Nicole Charland, Hargun Khanna, Justin Margolin); processing team (Moshe Sade-Feldman, Anna Gonye, Irena Gushterova, Tom Lasalle, Nihaarika Sharma, Brian C. Russo, Maricarmen Rojas-Lopez, Kasidet Manakongtreecheep, Jessica Tantivit, Molly Fisher Thomas); and those involved with data preprocessing (Arnav Mehta, Alexis Schneider)18.
Abbreviations
- ACE2: Angiotensin-converting enzyme 2
- ARDS: Acute respiratory distress syndrome
- BMI: Body mass index
- CVD: Cardiovascular-related death
- CKD: Chronic kidney disease
- COVID-19: Coronavirus disease 2019
- eGFR: Estimated glomerular filtration rate
- HF: Heart failure
- IL-6: Interleukin-6
- IL1RL1/ST2: Interleukin 1 receptor-like 1
- LEP: Leptin
- MERS: Middle Eastern respiratory syndrome
- MI: Myocardial infarction
- NGS: Next-generation sequencing
- NPPB/BNP: B-type natriuretic peptide
- NPX: Normalized protein expression
- NT-proBNP: N-terminal pro–B-type natriuretic peptide
- PEA: Proximity extension assay
- SARS: Severe acute respiratory syndrome
- SARS-CoV2: Severe acute respiratory syndrome coronavirus 2
- WGCNA: Weighted gene correlation network analysis