Structured Abstract
Importance Post-acute sequelae of SARS-CoV-2 infection (PASC) is emerging as a major public health issue.
Objective We characterized the incidence of PASC, or related symptoms and diagnoses, for COVID-19 and influenza patients.
Design Retrospective cohort study.
Setting Our data sources were the IBM MarketScan Commercial Claims and Encounters (CCAE), Optum Electronic Health Record (EHR) and Columbia University Irving Medical Center (CUIMC) databases, which were transformed to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) and were part of the Observational Health Data Sciences and Informatics (OHDSI) network.
Participants The COVID-19 cohort consisted of patients with a diagnosis of COVID-19 or a positive lab test for SARS-CoV-2 after January 1, 2020 with a follow-up period of at least 30 days. The influenza cohort consisted of patients with a diagnosis of influenza between October 1, 2018 and May 1, 2019 with a follow-up period of at least 30 days.
Intervention Infection with COVID-19 or influenza.
Main Outcomes and Measures Post-acute sequelae of SARS-CoV-2 infection (PASC), or related diagnoses, for COVID-19 and influenza patients.
Results In aggregate, we characterized the post-acute experience for over 440,000 patients who were diagnosed with COVID-19 or tested positive for SARS-CoV-2. The long-term sequelae that had a higher incidence in the COVID-19 cohorts than in the influenza cohorts were altered smell or taste, myocarditis, acute kidney injury, dyspnea and alopecia. Additionally, the long-term incidences of respiratory illness, musculoskeletal disease, and psychiatric disorders for the COVID-19 population were higher than expected.
Conclusions and Relevance The long-term sequelae of COVID-19 and influenza may be different. Further characterization of PASC in large-scale observational healthcare databases is warranted.
Background
The global pandemic of SARS-CoV-2 infection and the impact of COVID-19 disease have resulted in major morbidity and mortality worldwide. While substantial research has sought to characterize the natural history of the disease and the acute management of COVID-19, comparatively fewer studies have focused on the post-acute sequelae of SARS-CoV-2 infection (PASC)1-6. Much of the publicly available data on PASC come from case reports and single-institution prospective cohort studies7-13. Patient-reported symptom data have shown that prolonged fatigue, headache, dyspnea and anosmia are PASC symptoms14. However, fewer data on the incidence of PASC symptoms are available from claims and electronic health record (EHR) sources. Therefore, public knowledge about PASC symptom presentation is still evolving. The National Institutes of Health has encouraged the study of PASC in large-scale observational databases in order to better understand the condition and its public health impact15. In alignment with that effort, we present data on PASC patients from EHR and claims databases with the aim of characterizing the natural history of patients with SARS-CoV-2 who developed symptoms or diagnoses related to PASC. Additionally, we characterized related long-term sequelae of influenza to provide a comparison.
Methods
Three observational health databases were used for the analysis: a private-payer administrative claims database (IBM MarketScan Commercial Claims and Encounters-CCAE), a database of inpatient and outpatient electronic health records (Optum© de-identified Electronic Health Record Dataset-Optum EHR), and an electronic health record system from an academic medical center (Columbia University Irving Medical Center-CUIMC). All databases were transformed to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), and are described further in Supplementary Appendix #1.
We defined a COVID-19 cohort as patients with a diagnosis of COVID-19 or a positive lab test for SARS-CoV-2 after January 1, 2020 with a follow-up period of at least 30 days. An influenza cohort was defined as patients with a diagnosis of influenza between October 1, 2018 and May 1, 2019 with a follow-up period of at least 30 days. We performed characterizations of COVID-19 and influenza patients who had at least one of the following symptoms or diagnoses that were related to Post-Acute Sequelae of SARS-CoV-2 Infection (PASC), according to the Centers for Disease Control and Prevention (CDC)15: altered smell or taste, myocarditis, acute kidney injury, dyspnea, alopecia, tachycardia, chest pain, lung disorder, myalgias, dementia or cognitive impairment, malaise, fatigue, stress disorder, depression, anxiety, joint pain, mood changes, cough, rash, and fever. We calculated the number of patients who had each and any of these diagnoses between 30 and 180 days after the index event.
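The follow-up requirement and the 30-to-180-day outcome window can be illustrated with a minimal Python sketch. This is not the study's implementation, which ran against OMOP CDM databases; the `Patient` class, its fields, and the function names are hypothetical, and treating both window boundaries as inclusive is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Set, Tuple

# The numeric values come from the Methods text; all names are hypothetical.
MIN_FOLLOW_UP_DAYS = 30
WINDOW_START_DAY = 30
WINDOW_END_DAY = 180


@dataclass
class Patient:
    index_date: date        # first qualifying diagnosis or positive lab test
    observation_end: date   # last day the patient is observed in the database
    # (diagnosis date, outcome label) pairs recorded for the patient
    diagnoses: List[Tuple[date, str]] = field(default_factory=list)


def has_min_follow_up(p: Patient) -> bool:
    """True if the patient is observed for at least 30 days after the index event."""
    return (p.observation_end - p.index_date).days >= MIN_FOLLOW_UP_DAYS


def post_acute_outcomes(p: Patient, outcomes_of_interest: Set[str]) -> Set[str]:
    """Outcomes of interest recorded 30 to 180 days after the index event.

    Assumption: both ends of the window are inclusive.
    """
    found = set()
    for dx_date, label in p.diagnoses:
        days_after_index = (dx_date - p.index_date).days
        in_window = WINDOW_START_DAY <= days_after_index <= WINDOW_END_DAY
        if label in outcomes_of_interest and in_window:
            found.add(label)
    return found
```

A patient counts toward "any" PASC-related diagnosis when `post_acute_outcomes` returns a non-empty set, and toward "each" diagnosis once per label in that set.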
Diagnoses were based on ICD-10-CM and SNOMED codes, while lab tests included LOINC codes. The full list of codes for the symptoms and diagnoses (Supplementary Appendix #2) and our calculation of relative risk (Supplementary Appendix #3) are provided in the supplementary information.
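The exact relative risk calculation is given in Supplementary Appendix #3 and is not reproduced here. As a hedged illustration only, the sketch below uses the textbook definition of relative risk (the proportion with the outcome in one cohort divided by the proportion in the comparator cohort), which may differ from the appendix. The function name and its parameters are hypothetical.

```python
def relative_risk(
    cohort_events: int,
    cohort_total: int,
    comparator_events: int,
    comparator_total: int,
) -> float:
    """Textbook relative risk: (events / total) in the cohort of interest
    divided by (events / total) in the comparator cohort.

    Assumption: this is the standard definition, not necessarily the
    calculation described in the paper's supplementary appendix.
    """
    if cohort_total <= 0 or comparator_total <= 0:
        raise ValueError("cohort sizes must be positive")
    if comparator_events <= 0:
        raise ValueError("relative risk is undefined with no comparator events")
    risk_cohort = cohort_events / cohort_total
    risk_comparator = comparator_events / comparator_total
    return risk_cohort / risk_comparator
```

With purely illustrative counts (not study data), `relative_risk(2, 8, 1, 8)` compares a risk of 0.25 against 0.125; a value above 1 indicates the outcome is more frequent in the cohort of interest than in the comparator.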
Results
In aggregate, we characterized the post-acute experience for over 440,000 patients who were diagnosed with COVID-19 or tested positive for SARS-CoV-2. We identified 119,510 patients with COVID-19 in the Optum EHR database. Of those, 42,991 (36.28%) had at least one long-term sequela. In the IBM MarketScan CCAE database, we identified a total of 306,142 patients with COVID-19, 74,320 (24.28%) of whom had at least one long-term sequela. In the CUIMC database, 6,198 (27.52%) of 22,524 patients with COVID-19 had a PASC-related observation within the following 6 months.
Table 1 shows the number of patients from the COVID-19 and influenza cohorts in each database, and subgroups of patients with all or any PASC diagnoses. Five PASC diagnoses had a higher relative risk in COVID-19 patients than in influenza patients: altered smell or taste, myocarditis, acute kidney injury, dyspnea and alopecia.
Additionally, the proportions of patients who had post-acute diagnoses or symptoms of lung disorder, chest pain, depression, anxiety or joint pain in the COVID-19 cohort were greater than 2% in each database.
Discussion
We have reported one of the largest studies of the incidence of PASC, and related symptoms or conditions, in a cohort of patients with COVID-19 using claims and EHR data. The incidences of some outcomes were greater in the COVID-19 cohort than in the influenza cohort. Therefore, the long-term sequelae of COVID-19 may differ from those of influenza. For example, we expect that the long-term prevalence of anosmia will likely be greater for COVID-19 patients.
Additionally, we observed symptoms and diagnoses in the COVID-19 cohort at non-negligible rates. Given the global scale of the COVID-19 pandemic, those conditions could have a major public health impact. Specifically, the global burden of respiratory illness, musculoskeletal disease, and psychiatric disorders may increase as a consequence of the COVID-19 pandemic. These findings suggest that PASC is likely a heterogeneous constellation of continuing symptoms that do not resolve, rare but unusual symptoms, and prevalent serious symptoms.
Alternative approaches to characterizing PASC have used self-reported symptom data. Although those approaches are comprehensive, self-reported data may be different from patient assessments by healthcare providers. We expect that the differences in these kinds of data may help explain why the incidences of outcomes in our study are lower than incidences reported in patient self-assessment studies9,10,13.
A limitation of our analysis is that we did not validate our phenotypes, and therefore measurement error is possible; however, we used the CDC description of PASC in developing our phenotype15. Also, patient attrition to out-of-network primary care sites may have contributed to bias in the EHR database.
We have demonstrated the feasibility of characterizing the natural history of post-acute COVID-19 infections in both EHR and claims databases. Our findings may inform public health interventions that can reduce the global burden of long-term sequelae from the COVID-19 pandemic.
Conclusions
We have presented one of the largest characterizations of post-acute sequelae of SARS-CoV-2 (PASC) to date, examining data from multiple disparate populations. Electronic healthcare record data can be used to characterize PASC; additional replication of our analysis on other databases could strengthen our findings.
Data Availability
Optum EHR and MarketScan CCAE are de-identified datasets available for licensure by Optum and IBM, respectively. The research done with data from Columbia University Irving Medical Center was approved by the IRB.