Outcomes evaluated in controlled clinical trials on the management of COVID-19: A methodological systematic review

Alexander G. Mathioudakis; Markus Fally; Rola Hashad; Ahmed Kouta; Ali Sina Hadi; Sean Blandin Knight; Nawar Diar Bakerly; Dave Singh; Paula R. Williamson; Timothy Felton; Jørgen Vestbo

doi:10.1101/2020.10.26.20218370

Abstract

It is crucial that randomized controlled trials (RCTs) on the management of coronavirus disease 2019 (COVID-19) evaluate the outcomes that are critical to patients and clinicians, to facilitate relevance, interpretability, and comparability.

This methodological systematic review describes the outcomes evaluated in 415 RCTs on the management of COVID-19, that were registered with ClinicalTrials.gov, by 5/5/2020.

Significant heterogeneity was observed in the selection of outcomes and the instruments used to measure them. Mortality, adverse events and treatment success or failure are only evaluated in 64.4%, 48.4% and 43% of the included studies, respectively, while other outcomes are selected less often. Studies focusing on more severe presentations (hospitalized patients or requiring intensive care) most frequently evaluate mortality and adverse events, while hospital admission and viral detection/load are most frequently assessed in the community setting. Outcome measurement instruments are poorly reported and heterogeneous. In general, simple instruments that can control for important sources of bias are favoured. Follow-up does not exceed one month in 64.3% of these earlier trials, and long-term COVID-19 burden is rarely assessed.

The methodological issues identified could delay the introduction of potentially life-saving treatments in clinical practice. Our findings demonstrate the need for consensus in the design of RCTs.

Take home message @ERSpublications: This systematic review describes the heterogeneity in outcomes evaluated in 415 RCTs on COVID-19 management and the instruments used to measure them. Our findings reveal a need for consensus in the design of future RCTs.

Background

Within a few months from its emergence, the Coronavirus Disease 2019 (COVID-19), characterized by an excessive infectivity, morbidity and mortality, evolved into a global, lethal pandemic¹. Faced with the risk of a public health disaster, the biomedical community sparked an unprecedented research mobilization aiming to understand the virus and develop effective preventive and therapeutic strategies². Characteristically, within ten months, over 60 thousand publications focusing on COVID-19 were indexed in the PubMed database and almost two thousand interventional studies were registered with the ClinicalTrials.gov database.

However, inevitably, the limited knowledge about the disease and the need for an expeditious response to the unfolding pandemic did not allow, in some cases, for adequate methodological planning and co-ordination. Firstly, extensive research duplication (or -better-multiplication) has been observed, with numerous randomized controlled trials (RCTs) evaluating the same interventions for COVID-19 in parallel³. This fragmentation of the research data could delay the acquisition of confident, actionable results regarding each treatment. Moreover, lack of standardization in trial design could limit comparability. An important source of variability in trial design could arise from the outcomes (endpoints) that are selected for evaluation. Heterogeneity in trial outcomes and omission of outcomes that are critically important to patients and clinicians complicate interpreting, comparing and synthesizing trial results, potentially delaying the introduction of novel, life-saving treatments into clinical practice^4,5.

Core outcome sets are developed to address heterogeneity in the selection of outcomes. These are agreed standardized sets of outcomes that should be measured and reported as a minimum in all clinical trials in specific areas of health or health care⁶. The Core Outcome Measures in Effectiveness Trials (COMET) has developed a rigorous methodology for their development, to ensure the most pertinent clinical outcomes are included in core outcome sets^6-8. COMET recommends the conduct of extensive methodological systematic reviews and qualitative research aiming to develop a longlist of all relevant outcomes for a disease area. This should be followed by a multi-stakeholder Delphi survey, aimed to prioritize the most pertinent outcomes for inclusion in the core outcome set^6-8.

Upon the emergence of COVID-19 pandemic, there was an urgent need for the development of a core outcome set. Within a few months, four core outcome sets were developed, using an accelerated process^9-12. These were based on methodological systematic reviews of the first registered RCTs, which were limited in number, but also in design, due to the limited knowledge of the nature of COVID-19, at the time. However, in the meantime, our knowledge of the natural history of COVID-19 is expanding rapidly and numerous clinical trials have been registered in clinical trial registries such as the ClinicalTrials.gov database. In this methodological survey, we describe the outcomes that are tested in RCTs evaluating therapeutic interventions for COVID-19 and the instruments used to measure these outcomes.

Methods

For the conduct of this methodological systematic review, we followed standard methodology recommended by the Core Outcome Measures in Effectiveness Trials (COMET) initiative⁶, that was successfully applied in previous, similar methodological surveys^13-15.

Study selection and data extraction

Planned, ongoing or completed interventional clinical trials evaluating pharmacological or non-pharmacological interventions for the management of COVID-19 were considered eligible. Phase 1 trials were considered beyond the scope of this manuscript and, thus, excluded. All eligible trials from the U.S. National Library of Medicine clinical trials register (ClinicalTrials.gov, searched on May 5^th, 2020) were retrieved using standard filters recommended by the library. More specifically, for identifying studies evaluating COVID-19, we used the following terms: COVID-19, SARS-CoV-2, severe acute respiratory syndrome coronavirus 2, 2019-nCoV, 2019 novel coronavirus, and Wuhan coronavirus. Only studies identified as interventional by the submitting researcher were retrieved.

Eligible studies were grouped into phase 2 or later stage trials, as we anticipated between-group differences in the selection of outcomes. In addition, trials are presented grouped according to the recruitment setting (community, hospital, or intensive care unit), which can be used as a surrogate measure of the severity of the participants recruited. The main methodological characteristics of all eligible studies were extracted automatically from the ClinicalTrials.gov extract (.csv), using a script developed in the platform R statistics (version 3.4.3; R Foundation for Statistical Computing, Vienna, Austria). One researcher (amongst MF, RH, ASH, AK) confirmed eligibility, cross-checked pre-extracted data for accuracy, searched for additional reports of the study protocol and extracted additional data, that were not automatically captured. A second researcher (AGM) cross-checked all extracted data for accuracy. Disagreement was resolved through discussion. Extracted data included the projected recruitment sizes, study settings, as well as details on the eligibility criteria and evaluated outcome measures.

Outcome grouping and classification

Detailed descriptions of all outcome measures were extracted verbatim from the study protocols or registry entries. After in-depth assessment of the outcomes evaluated in a random sample of 20 studies, we developed a list of generic outcome categories defined by the treatment effect they aim to capture, rather than the specific measurement instrument. For example, we chose the term “treatment success” to encapsulate outcomes such as the proportion of patients who were completely asymptomatic by a specific timepoint or those who experienced a clinical improvement of at least two points on the 9-category ordinal scale developed by the World Health Organisation (WHO) for COVID-19¹⁶, by a specific follow-up timepoint. Next, two authors (amongst MF, RH, ASH, AK) categorized each of the extracted outcomes within the generic outcome categories. New generic outcome categories were developed as needed, in cases where the evaluated outcomes did not fit any of the existing categories, based on consensus among the co-authors. The same investigators also captured the instruments used for the quantification of each outcome. Disagreement was resolved through discussion with another reviewer (AGM).

Finally, the generic outcomes were further classified according to the COMET taxonomy¹⁷.

Results

Description of the included studies

Our search retrieved 745 interventional studies. After excluding diagnostic, prognostic, preventive studies, phase 1 trials and those not directly focusing on the management of COVID-19, we selected 415 studies for inclusion in this systematic survey, including 178 phase 2, and 237 later phase RCTs. Details on the selection process (PRISMA flowchart, supplementary figure 1), as well as the trial registration numbers, and planned study population of all included studies are available in the online appendix.

Figure 1. Duration of follow-up in the included studies.

Most of the included trials are conducted by academic investigators (75.7%) and only one in four is sponsored by the pharmaceutical industry. The planned recruitment ranges between 7 and 12,000 participants (median: 160, interquartile range [IQR]: 67-400). Most trials include two intervention arms (74.8%), but one in four evaluates more than two, and up to 19 interventions. Moreover, 79.8% of the trials are conducted in a hospital setting, including 6.5% conducted in the intensive care unit (ICU), while 15.2% are conducted in the community. Descriptions of disease severity are heterogeneous with the recruitment setting being the most consistent measure of disease. Details on the characteristics of the included studies are available in table 1.

View this table:

Table 1.

Characteristics of the included studies. *Studies conducted in multiple continents are counted in each participating continent.

Overall, 3,948 unique outcomes are evaluated in the included studies, including 1,691 from phase 2 trials and 2,257 from later phase trials. We identified 25 generic outcome categories, which are described in table 2. Similar number of outcomes are evaluated in phase 2 (median: 8.5, IQR: 5-13) and later phase (median: 7, IQR: 4-11) trials (supplementary figures 2, 3). While mortality and adverse events are the most frequently assessed outcomes, they are only assessed in 64.6% and 48.4% of all trials, respectively. All remaining outcomes are evaluated in less than half of the trials, highlighting an important heterogeneity in outcomes selection (table 3). Treatment success or failure is only evaluated in 41.6% of phase 2 trials and 44.1% of the later phase trials. Interestingly, the frequency that different outcomes are evaluated as outcomes or as primary outcomes, are very similar for phase 2 and later phase trials.

View this table:

Table 2. Definitions of the generic outcome categories.

View this table:

Table 3a.

Frequency that outcome measures are reported in RCTs on the management of COVID-19. Outcomes evaluated in <10 RCTs were grouped as “Other outcomes”. Time to treatment success or failure is a measurement instrument of the outcome treatment success or failure. However, it is reported separately here, as it provides more granular information. A. Grouped in phase 2 and later phase trials. B. Grouped by recruitment setting (community, hospital, ICU). NIV: Non-invasive ventilation. * Continued need of hospital/critical care admission, at a specific timepoint.

View this table:

Table 3b.

Figure 2.

Planned follow-up timepoints for the most frequently evaluated outcomes. All timepoints described in each of the included trials were included in this figure. Presented as a percentage of the outcomes of the same category.

The most frequently reported outcomes among studies conducted in a community setting (thus recruiting less severely ill patients), were viral detection or load (55.6%), and the need for hospital admission (50.8%). In contrast, the most frequently evaluated outcomes in studies recruiting patients with more severe COVID-19, were mortality and adverse events, which were evaluated in 71.6% and 50.3% of studies recruiting hospitalized patients, and in 88.9% and 66.7% of those recruiting critically ill patients, respectively (table 3).

Outcome measurement instruments

Mortality/ survival

Mortality is assessed by 284 outcomes. All-cause mortality is evaluated in all but six trials measuring mortality. When mortality was not further described, we presumed it referred to all-cause mortality. Time to death is assessed in 16 trials, while cause-specific mortality is evaluated in six trials, mainly focusing on SARS-CoV2 mortality, but also including mortality due to pulmonary or due to cardiovascular complications.

Clinical outcomes

(Time to) Treatment success or treatment failure

Treatment success or the time to treatment success was evaluated by 220 outcomes. Ordinal scales describing different levels of COVID-19 severity are used for assessing treatment success in 113 (51.4%) of these outcomes. Most scales are very similar to the most frequently used WHO scale, which is a 9-point ordinal scale (from 0 to 8), with each point describing a worse clinical status: no clinical or virological evidence of infection; ambulatory without limitation in daily activities; ambulatory but with limitation in daily activities; hospitalized but not requiring oxygen therapy; hospitalized and requiring oxygen therapy via a face mask or nasal prongs; requiring non-invasive ventilation (NIV) or high-flow oxygen; requiring intubation and mechanical ventilation; requiring ventilation and additional organ support, including vasopressors, renal replacement therapy or extra-corporeal membrane oxygenation (ECMO); deceased¹⁶. Treatment success is defined as an improvement in ordinal scales such as the WHO clinical progression scale by 2 points or 1 point in 57.5% and 24.8% of all outcomes using the scale to evaluate treatment success, while in the remaining outcomes, no specific threshold is provided. Complete resolution of the symptoms and signs of COVID-19 (clinical recovery) is used as a measure of treatment success in 51/220 (23.2%) outcomes and clinical improvement in 38/220 (17.3%) outcomes. The definition of complete resolution varies. Often, no further information is provided. In the remaining cases, it is defined as a composite outcome including several of the following components: complete resolution of breathlessness, tachypnoea, hypoxia, desaturation, cough, anosmia, myalgia, fever, or of oxygen requirements; a negative COVID-19 PCR; hospital discharge; or radiological resolution. A definition of clinical improvement as an outcome is also frequently lacking. In the remaining cases, it is defined as an improvement in several of the previously listed components. Improvement is either based on prespecified thresholds, or on a subjective clinicians’ judgement. Finally, 14 outcomes (6.4%), use specific thresholds (0, ≤2 or ≤4) of the National Early Warning Score (NEWS or NEWS-2) to define treatment success.

Treatment failure, or time to treatment failure is evaluated by 76 outcomes. In most cases (40/76, 52.6%), treatment failure is defined as a composite outcome consisting of several components with clear thresholds, such as: death, need for ICU admission, need for invasive ventilation, need for other organ support (e.g. vasopressors or renal replacement therapy), need for NIV, need for supplemental oxygen, a deterioration in oxygenation, need for hospital admission or re-admission or emergency visit, ventricular tachyarrhythmia. Ordinal clinical severity scales such as the WHO scale are used to define treatment failure in 16/76 (21.1%) outcomes, while the need for rescue therapy is used in 9/76 (11.8%) outcomes. The definitions are less well defined in the remaining 11 (14.5%) outcomes, which do not provide any criteria or state treatment failure will be based on the clinician’s judgement of deterioration in the clinical condition of the patient.

Severity scores

Standardized scores are used to evaluate disease severity and progression in 277 outcomes. In this category we included outcomes presenting mean/median scores or change from baseline in a score. Outcomes describing predefined scores thresholds for treatment success or failure were classified in the previous category. Ordinal disease severity scales (such as the WHO scale) are the most frequently used score (144/277 outcomes, 51.2%), followed by the Sequential Organ Failure Assessment (SOFA) Score¹⁸, a validated score for describing the severity of organ dysfunction (54/277 outcomes, 19.5%), and the National Early Warning Score (NEWS or NEWS2, 36/277 outcomes, 13.0%), a tool developed by the Royal College of Physicians, that quantifies clinical severity and deterioration by evaluating six physiological measures, namely respiratory rate, oxygen saturation, temperature, systolic blood pressure, heart rate and the level of consciousness¹⁹. Acute Physiology And Chronic Health Evaluation II (APACHE II, 5/277), clinical sign score (5/277), Pneumonia Severity Index (PSI, 3/277), BRESCIA-COVID, Murray score, Sepsis Induced Coagulopathy, Small Identification Test, SMART-COP score, and the Vienna Vaccine Safety Initiative (ViVI) disease severity score are used less often.

Symptoms

188 outcomes focus on symptoms, which are either assessed using visual analogue scales, or validated instruments. Composite scores evaluating several symptoms, including breathlessness, cough, sputum production, pyrexia, anosmia, myalgia, headache, or gastrointestinal symptoms, are evaluated in 40 outcomes (21.3%). Four composite outcomes specifically assess respiratory symptoms (2.2%). Each of the remaining outcomes focus on a specific symptom. These include fever (72/188, 38.3%), breathlessness (18, 9.6%), cough (12, 6.4%), and less often anxiety, depressive symptoms, anosmia, cognitive dysfunction, nausea, insomnia or fatigue. In this category we also included the assessment of heart rate (8, 4.3%) or blood pressure (5, 2.7%).

Physiological outcomes

Oxygenation

Oxygenation parameters are evaluated by 215 outcomes. The need for supplementary oxygen or ventilation were summarized in separate outcome categories. Oxygenation is evaluated using the partial pressure of oxygen (PaO²), fraction of inspired oxygen (FiO²), oxygen saturation (SatO²), or respiratory rate. In this category we also included measurements of the partial pressure of carbon dioxide (PaCO²) and pH, which are only rarely evaluated as outcomes. Oxygenation is often measured as the PaO² or SatO² corrected for FiO² (95/215, 44.2%).

Pulmonary function and physiology

Twenty-eight outcomes assessed pulmonary function or physiology. There is significant heterogeneity in this domain, with different outcomes evaluating peak flow rate, forced vital capacity (FVC), the ratio of forced expiratory volume in 1 second (FEV¹) to FVC, vital capacity, diffusing capacity, lung compliance and respiratory muscle function.

Viral detection and load

Viral detection and load is evaluated in 235 outcomes. The vast majority assess virologic clearance by a specific timepoint, or the time until virologic clearance. A small number of outcomes track changes in viral load over time, or differences in the viral detection and load when using different samples (nasal, nasopharyngeal, oropharyngeal swabs or sputum).

Viral antibodies

The development of antibodies against SARS-CoV2 is assessed in 31 outcomes. Evaluation of specific antibody types (IgA, IgG or IgM) is only described in five trials.

Radiological outcomes

Sixty-one outcomes evaluate disease progression radiologically. Definitions of this outcome are inadequate. In most cases, it is broadly stated that the progression, regression, or resolution of the radiological findings are monitored. Details are only provided in six outcomes, which monitor the extent of the lesion as a proportion of the full lung volume, or perform lung densitometry. Development of fibrosis is evaluated in seven outcomes. Computed tomography (CT) is used in 21 (34.4%) outcomes, a chest X-ray (CXR) in 8 (13.1%), either a CT or a CXR in 3, either CT or CXR or lung ultrasound in one and nuclear imaging in one outcome. The imaging modality used is not declared in the remaining 28 (45.9%) outcomes.

Inflammatory biomarkers

This group includes 321 outcomes, each describing either a single or multiple inflammatory biomarkers. The most frequently evaluated biomarkers are the total white cell count, neutrophils, lymphocytes, eosinophils, monocytes, c-reactive protein, interleukins 1, 6 and 8, followed by other interleukins, procalcitonin, tumour necrosis factors, complement components, lymphocytes subtypes, immunoglobulins, and other inflammatory biomarkers.

Other biomarkers

309 outcomes evaluate either a single or multiple non-inflammatory biomarkers. Mostly, these are surrogates for safety or adverse events. The most frequently captured biomarkers are d-dimers, cardiac enzymes, kidney function, liver function, clotting, red blood cells and haemoglobin, followed by a variety of other molecules.

Pharmacokinetics/ Pharmacodynamics

We categorized 33 outcomes in this category, mostly evaluating plasma drug concentrations (12/33, 36.4%), but also half time, maximum/minimum observed concentration, time to reach the maximum/minimum observed concentration, area under the plasma concentration-time curve.

Adverse events

448 outcomes focus on adverse events. 108 (24.1%) evaluate any adverse event; either their frequency, or participants experiencing at least one adverse event. 80 (17.9%) outcomes specifically assess serious adverse events, as defined by the Common Terminology Criteria for Adverse Events (CTCAE). Nineteen (4.2%) outcomes focused on drug reactions, 14 (3.1%) on grade 3 or 4 adverse events, as defined by the CTCAE, and 22 (4.9%) assessed the rate of study drugs discontinuation due to adverse events or due to any reason. The remaining outcomes focused on specific adverse events, mostly cardiac (38, 10.3%), secondary infections (37, 10.0%), thrombotic or bleeding events (29, 8.1%), or local administration reactions (13, 3.6%)

Life impact

Life impact is evaluated by 13 outcomes. The EuroQol 5 Dimensions (EQ-5D) is used in four outcomes, followed by the RAND 36-Item Health Survey (SF-36), which is used in three outcomes. Other instruments include the WHO Disability Assessment Schedule (WHODAS 2.0), the Control, Autonomy and Pleasure (CASP-19) and the Nottingham Health Profile.

Resources use

Need for a (higher) level of care

352 outcomes are included in this category. Need for hospital admission is evaluated by 68 outcomes (19.3%), need for hospital re-admission by 9 (2.6%), need for intensive care admission by 82 (23.4%), need for invasive ventilation by 167 (47.4%), and need for extracorporeal membrane oxygenation (ECMO) by 26 (7.4%; need for ECMO is merged with the outcome need for ventilation in the tables). In studies conducted in the hospital setting, need for hospital admission at a specific follow-up timepoint, refers to the proportion of patients who remain inpatients at that timepoint. Similarly, for studies conducted in the ICU stay and the need for ICU admission. In this category, we also included composite outcomes consisting of one of the above outcomes and mortality (e.g. need for ICU admission or death), as these composite outcomes focus on the need for a higher level of care, while death is added to account for patients who decease before accessing the higher level of care, or those who are not eligible for higher level of care due to their baseline clinical status. Such approaches could be crucial to account for bias, especially in situations such as the COVID-19 pandemic, when hospitals and ICUs are over-burdened and not infrequently unable to accommodate a significant proportion of the patients, leading to the introduction of stricter criteria for triaging patients. Moreover, some outcomes in this category also evaluate time-to-higher level of care (e.g. time-to-hospital admission).

Duration of stay in a specific level of care

This category includes 469 outcomes. Of those, 206 (43.9%) focus on the length of hospital stay, 96 (20.5%) on the length of ICU stay, and 167 (35.6%) on the duration of invasive ventilation. Delays in discharging patients who are medically optimized due to social or other reasons could introduce bias in the outcome length of hospital stay. To account for this issue, 11 outcomes are defined as the time to discharge or to a NEWS ≤2, maintained for 24 hours and another outcome as the time until participants are deemed medically optimized for discharge by a clinician.

Need for supplemental oxygen or NIV

This category includes 105 outcomes evaluating the need for supplemental oxygen or NIV in any setting. Most evaluate the need for supplemental oxygen administration at specific follow-up timepoints; 34 (32.4%) outcomes assess the need for NIV (including continuous positive airway pressure [CPAP] or bilevel positive airway pressure [BiPAP]), and 21 (20.0%) the need for high-flow oxygen. One outcome evaluates the need for domiciliary oxygen after hospital discharge. On several occasions, the need for supplemental oxygen was defined on the basis of ordinal clinical severity scales (such as the WHO scale), as the proportion of participants grouped in milder categories, not requiring supplemental oxygen.

Duration of supplemental oxygen or NIV

95 outcomes were included in this category. Twelve (12.6%) evaluate the duration of NIV, and seven (7.4%) the duration of high-flow oxygen.

Need for other organ support

Forty-four outcomes focus on the need for organ support other than invasive ventilation, including 26 (59.1%) assessing the need for vasopressors, and 18 (40.9%) for renal replacement therapy.

Other outcomes

Here, we grouped 145 outcomes that could not be categorized in the previous categories and were evaluated in <10 RCTs each. Need for concurrent treatments are assessed in 22 outcomes, including 7 that specifically focus on the administration of antibiotics. Exercise capacity is assessed by 13 outcomes (mostly using the 6-minutes walking test), COVID-19 transmission by 9, resource requirements and costs by 8 outcomes. Other outcomes include the use of prone positioning, ability to perform activities of daily living, incidence and progression of cytokine storm syndrome, resilience, lost workdays and discharge destinations.

Study follow-up

Planned follow-up for all included studies varies from less than a week, to over a year (figure 1, supplementary figure 4). However, in most cases, it does not exceed one month (263/415 63.4%). Follow-up exceeds four months only in 50 (12.0%) studies and one year only in one. Follow-up plans do not differ between phase 2 and later phase trials, where it are limited to one month or less in 105/178 (59.0%) and in 158/237 (66.7%) trials, respectively. Longer-term follow-up, exceeding 4 months, is planned for 163 outcomes (figure 2, supplementary figure 5), evaluating mortality (16 outcomes), adverse events (15), life impact (12), severity scores (12), length of hospital stay (11), viral detection and load (11), inflammatory biomarkers (7), pulmonary function/physiology (6), need for ventilation (5) and duration of ventilation (5).

Discussion

In this methodological survey, we analysed the outcomes and outcome measurement instruments used in 415 RCTs evaluating therapeutic interventions for COVID-19. We identified a remarkable heterogeneity in the selection of outcomes, that is not unexpected given that these trials were designed within a few months from the emergence of the new coronavirus strain. More specifically, only 64.6% and 48.4% of the studies evaluate mortality and adverse events, respectively, while each of the remaining outcomes is assessed by markedly less than half of the studies.

Variability was also observed in the choice of instruments used to measure different outcomes. Ordinal clinical severity scales, such as the WHO clinical progression scale were consistently used across the included studies to assess treatment success or failure, disease severity and oxygen requirements. Given the acute nature of COVID-19, and significant changes in the clinical status of patients in the course of the disease, such scales can effectively capture disease progression, especially in more severe presentations. Most of these scales follow the structure of the WHO scale, removing scale points for simplicity. Despite sharing a similar structure, these scales group patients differently, limiting interpretability and comparability. The WHO recently introduced a revised 11-point Scale, with increased granularity, and it would be advisable for all studies to align relevant outcomes with this revised scale, to improve interpretability and comparability¹². To evaluate treatment success or failure, most studies used a 2-point change in the ordinal scale as a threshold, that corresponds to a significant change in the clinical status of the patient and this seems appropriate.

Our study revealed a lack of focus on the long-term sequelae of SARS-CoV2 infection. The planned study follow-up exceeds four months only in 12% of all studies. Moreover, only 13 trials assess life impact beyond the acute phase, while exercise capacity is assessed by 13 trials, and the ability to perform simple daily activities during convalescence in only four trials. Finally, only seven trials stated an intent to explore the development of pulmonary fibrosis. However, persistent symptoms, such as fatigue or breathlessness, and quality of life deficits are detected in many hospitalized patients, two to three months after discharge^20-22. Moreover, fibrotic changes are detected in about one in three survivors of a hospitalization for COVID-19 infection^23,24. However, it should be noted that we evaluated RCTs registered by May 2020 and longer-term follow-up may have been planned for newer studies, in view of the emerging data.

While this study did not focus on the analytical approaches used for evaluating outcomes, we observed that several studies described specific approaches to account for the bias introduced by mortality as a competing factor for other outcomes, including the duration of hospital stay, ICU stay and the duration of respiratory support. Several methods were described to account for this bias. Some studies stated the duration of hospital or ICU stay will be censored for deceased participants, while others assessed the days that participants are alive and out of hospital or ICU, instead. Homogenization and detailed description of the analytical approaches in the study protocols, along with the outcomes and outcome measurement instruments are crucial for increasing transparency and comparability. Future methodological studies should address analytical approaches.

Four core outcome sets have already been published, with overlapping but not identical selection of components. The WHO Working Group on the Clinical Characterisation and Management of COVID-19 infection recommends the minimal use of three outcomes: mortality, viral burden and non-mortal clinical outcomes evaluated using the WHO clinical progression scale¹². WHO also highlighted the need for a longer follow-up, of at least 60 days, to capture disease mortality, which is not adopted by most identified trials. Two other groups prioritized specific outcomes and measurement instruments, all of which were captured in our analysis, but were not necessarily the most frequently used^10,11. The last core outcome set prioritized broader domains to be addressed, rather than specific outcomes⁹. These domains encompass most outcomes identified in this methodological review. The same group also highlighted the need to evaluate the impact of COVID-19 on patient status and life impact in the longer term. Looking across these core outcome sets, a meta-core outcome set (meta-COS) was identified, only including the two domains that were prioritized by all initiatives (mortality and respiratory support), as the most critical, to be evaluated in all future RCTs in hospitalised patients²⁵. Both domains recommended by the meta-COS were evaluated in 205 (49.4%) of the included studies.

In view of the multiple available core outcome sets, the authors of this review believe that outcomes selection for future trials should (i) adhere to the recommendations by the WHO and the meta-COS, and (ii) attempt to address all of the domains proposed by Tong et al, a core outcome set that was informed by consensus of >9,000 participants⁹.

Methodological systematic reviews were conducted as part of the development of three core outcome sets. However, these reviews were almost exclusively based on studies conducted in China. Moreover, two of these reviews included approximately 100 RCT protocols^10,11, while the WHO document was informed by 1,135 protocols, including both observational and interventional studies¹². However, the outcomes of RCTs often differ from those selected in observational studies. Our methodological review was based on a globally representative sample of 415 RCTs, it employed more rigorous methodology to assess all outcomes, and it is the first review to evaluate the instruments used to evaluate the different outcomes beyond mortality.

Our study only included clinical trials that were registered until May 2020 and this may be a limitation as trial designs and endpoints may have evolved since then, in view of the emerging knowledge on the nature and outcomes of COVID-19 infection, and the published core outcome sets. Moreover, we only evaluated studies registered with the U.S. National Library of Medicine clinical trials register (ClinicalTrials.gov). However, our extensive, globally representative sample of 415 ongoing RCTs was a major strength of our methodological survey and we strongly believe it was sufficient to capture all relevant outcomes and measurement instruments. Characteristically, after extracting data from approximately 25% of the included trials, it became clear that we reached saturation with regards to the outcome categories, while by the time we extracted approximately 70% of the trials, we also reached saturation with regards to the outcome measurement instruments. Therefore, we are confident that we have not missed important outcomes, although we anticipate that newer outcomes may be introduced in newer trials, in response to our expanding knowledge on COVID-19 natural history and outcomes. Future studies will need to assess the impact of the emerging evidence on the natural history and outcomes of COVID-19 and of the four published core outcome sets and the meta-COS on the selection of outcomes in more recently registered trials. Another limitation of our study is the lack of a prospectively registered protocol. However, we have used rigorous methodology recommended by the COMET Initiative, that we have previously employed in similar methodological systematic reviews¹³.

Overall, this methodological survey reveals significant heterogeneity in the outcome categories and measurement instruments selected by trialists in the management of COVID-19 and highlights the need for greater consistency, to enable decision-makers to compare and contrast studies.

Data Availability

Not applicable - this is a methodological systematic review

Footnotes

Conflicts of interest: All authors have completed the ICMJE uniform disclosure form detailing any conflicts of interest outside the submitted work that they may have. None of the authors have conflicts directly related to this work.
Funding: This study was supported by the NIHR Manchester Biomedical Research Centre (BRC).

References

1.↵
Rothan HA, Byrareddy SN. The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak. J Autoimmun 2020;109:102433.
OpenUrl CrossRef PubMed
2.↵
Fragkou PC, Belhadi D, Peiffer-Smadja N, et al. Review of trials currently testing treatment and prevention of COVID-19. Clin Microbiol Infect 2020;26:988–98.
OpenUrl PubMed
3.↵
Mathioudakis AG, Fally M, Hashad R, Knight S, Felton T, Vestbo J. COVID-19 Clinical Trials: Unravelling a Methodological Gordian Knot. Am J Respir Crit Care Med 2020.
4.↵
Kirkham JJ, Gargon E, Clarke M, Williamson PR. Can a core outcome set improve the quality of systematic reviews?--a survey of the Co-ordinating Editors of Cochrane Review Groups. Trials 2013;14:21.
OpenUrl CrossRef PubMed Web of Science
5.↵
Roche N, Tonia T, Bush A, et al. Guidance production before evidence generation for critical issues: the example of COVID-19. Eur Respir Rev 2020;29.
6.↵
Williamson PR, Altman DG, Bagley H, et al. The COMET Handbook: version 1.0. Trials 2017;18:280.
OpenUrl CrossRef PubMed
7.
Kirkham JJ, Gorst S, Altman DG, et al. Core Outcome Set-STAndardised Protocol Items: the COS-STAP Statement. Trials 2019;20:116.
OpenUrl PubMed
8.↵
Kirkham JJ, Gorst S, Altman DG, et al. Core Outcome Set-STAndards for Reporting: The COS-STAR Statement. PLoS Med 2016;13:e1002148.
OpenUrl PubMed
9.↵
Tong A, Elliott JH, Azevedo LC, et al. Core Outcomes Set for Trials in People With Coronavirus Disease 2019. Crit Care Med 2020.
10.↵
Qiu R, Zhao C, Liang T, et al. Core Outcome Set for Clinical Trials of COVID-19 Based on Traditional Chinese and Western Medicine. Front Pharmacol 2020;11:781.
OpenUrl
11.↵
Jin X, Pang B, Zhang J, et al. Core Outcome Set for Clinical Trials on Coronavirus Disease 2019 (COS-COVID). Engineering (Beijing) 2020.
12.↵
Characterisation WHOWGotC, Management of C-i.A minimal common outcome measure set for COVID-19 clinical research. Lancet Infect Dis 2020;20:e192–e7.
OpenUrl CrossRef PubMed
13.↵
Mathioudakis AG, Moberg M, Janner J, Alonso-Coello P, Vestbo J. Outcomes reported on the management of COPD exacerbations: a systematic survey of randomised controlled trials. ERJ Open Res 2019;5:00072–2019.
OpenUrl
14.
Mathioudakis AG, Abroug F, Agusti A, et al. Core Outcome Set for the management of Acute Exacerbations of Chronic Obstructive Pulmonary Disease. The COS-AECOPD ERS Task Force study protocol. ERJ Open Res 2020;In press.
15.↵
Alkhaffaf B, Blazeby JM, Williamson PR, Bruce IA, Glenny AM. Reporting of outcomes in gastric cancer surgery trials: a systematic review. BMJ Open 2018;8:e021796.
OpenUrl Abstract/FREE Full Text
16.↵
World Health Organization. Novel Coronavirus COVID-19 therapeutic trial synopsis. Who R&D Blueprint 2020.
17.↵
Dodd S, Clarke M, Becker L, Mavergames C, Fish R, Williamson PR. A taxonomy has been developed for outcomes in medical research to help improve knowledge discovery. J Clin Epidemiol 2018;96:84–92.
OpenUrl PubMed
18.↵
Vincent JL, Moreno R, Takala J, et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. On behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine. Intensive Care Med 1996;22:707–10.
OpenUrl CrossRef PubMed Web of Science
19.↵
McGinley A, Pearse RM. A national early warning score for acutely ill patients. BMJ 2012;345:e5310.
OpenUrl FREE Full Text
20.↵
Garrigues E, Janvier P, Kherabi Y, et al. Post-discharge persistent symptoms and health-related quality of life after hospitalization for COVID-19. J Infect 2020.
21.
Halpin SJ, McIvor C, Whyatt G, et al. Postdischarge symptoms and rehabilitation needs in survivors of COVID-19 infection: A cross-sectional evaluation. J Med Virol 2020.
22.↵
Carfi A, Bernabei R, Landi F, Gemelli Against C-P-ACSG. Persistent Symptoms in Patients After Acute COVID-19. JAMA 2020;324:603–5.
OpenUrl CrossRef PubMed
23.↵
Mo X, Jian W, Su Z, et al. Abnormal pulmonary function in COVID-19 patients at time of hospital discharge. Eur Respir. 2020;55.
24.↵
George PM, Barratt SL, Condliffe R, et al. Respiratory follow-up of patients with COVID-19 pneumonia. Thorax 2020.
25.↵
Marshall J, Zhang J, Qiu R, Blackwood B, Sinha I, Tong A, Gargon L, Williamson P. The ‘metacos’ for research in COVID-19 hospitalised patients. Available from: http://www.comet-initiative.org/assets/downloads/COVID-19%20meta%20COS_Table%201_7th%20July%202020.pdf.

View the discussion thread.

Posted October 27, 2020.

Download PDF

Supplementary Material

Data/Code

Citation Tools

Subject Area

Respiratory Medicine

Subject Areas

All Articles

Addiction Medicine (316)
Allergy and Immunology (621)
Anesthesia (162)
Cardiovascular Medicine (2298)
Dentistry and Oral Medicine (280)
Dermatology (202)
Emergency Medicine (371)
Endocrinology (including Diabetes Mellitus and Metabolic Disease) (818)
Epidemiology (11623)
Forensic Medicine (10)
Gastroenterology (683)
Genetic and Genomic Medicine (3629)
Geriatric Medicine (341)
Health Economics (622)
Health Informatics (2332)
Health Policy (919)
Health Systems and Quality Improvement (871)
Hematology (336)
HIV/AIDS (758)
Infectious Diseases (except HIV/AIDS) (13202)
Intensive Care and Critical Care Medicine (760)
Medical Education (361)
Medical Ethics (101)
Nephrology (394)
Neurology (3390)
Nursing (193)
Nutrition (512)
Obstetrics and Gynecology (653)
Occupational and Environmental Health (654)
Oncology (1777)
Ophthalmology (526)
Orthopedics (211)
Otolaryngology (284)
Pain Medicine (226)
Palliative Medicine (66)
Pathology (441)
Pediatrics (1013)
Pharmacology and Therapeutics (423)
Primary Care Research (410)
Psychiatry and Clinical Psychology (3102)
Public and Global Health (6021)
Radiology and Imaging (1238)
Rehabilitation Medicine and Physical Therapy (720)
Respiratory Medicine (814)
Rheumatology (370)
Sexual and Reproductive Health (359)
Sports Medicine (319)
Surgery (390)
Toxicology (50)
Transplantation (171)
Urology (143)

[1] 1.↵
Rothan HA, Byrareddy SN. The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak. J Autoimmun 2020;109:102433.
OpenUrl CrossRef PubMed

[2] 2.↵
Fragkou PC, Belhadi D, Peiffer-Smadja N, et al. Review of trials currently testing treatment and prevention of COVID-19. Clin Microbiol Infect 2020;26:988–98.
OpenUrl PubMed

[3] 3.↵
Mathioudakis AG, Fally M, Hashad R, Knight S, Felton T, Vestbo J. COVID-19 Clinical Trials: Unravelling a Methodological Gordian Knot. Am J Respir Crit Care Med 2020.

[4] 4.↵
Kirkham JJ, Gargon E, Clarke M, Williamson PR. Can a core outcome set improve the quality of systematic reviews?--a survey of the Co-ordinating Editors of Cochrane Review Groups. Trials 2013;14:21.
OpenUrl CrossRef PubMed Web of Science

[5] 5.↵
Roche N, Tonia T, Bush A, et al. Guidance production before evidence generation for critical issues: the example of COVID-19. Eur Respir Rev 2020;29.

[6] 6.↵
Williamson PR, Altman DG, Bagley H, et al. The COMET Handbook: version 1.0. Trials 2017;18:280.
OpenUrl CrossRef PubMed

[7] 7.
Kirkham JJ, Gorst S, Altman DG, et al. Core Outcome Set-STAndardised Protocol Items: the COS-STAP Statement. Trials 2019;20:116.
OpenUrl PubMed

[8] 8.↵
Kirkham JJ, Gorst S, Altman DG, et al. Core Outcome Set-STAndards for Reporting: The COS-STAR Statement. PLoS Med 2016;13:e1002148.
OpenUrl PubMed

[9] 9.↵
Tong A, Elliott JH, Azevedo LC, et al. Core Outcomes Set for Trials in People With Coronavirus Disease 2019. Crit Care Med 2020.

[10] 10.↵
Qiu R, Zhao C, Liang T, et al. Core Outcome Set for Clinical Trials of COVID-19 Based on Traditional Chinese and Western Medicine. Front Pharmacol 2020;11:781.
OpenUrl

[11] 11.↵
Jin X, Pang B, Zhang J, et al. Core Outcome Set for Clinical Trials on Coronavirus Disease 2019 (COS-COVID). Engineering (Beijing) 2020.

[12] 12.↵
Characterisation WHOWGotC, Management of C-i.A minimal common outcome measure set for COVID-19 clinical research. Lancet Infect Dis 2020;20:e192–e7.
OpenUrl CrossRef PubMed

[13] 13.↵
Mathioudakis AG, Moberg M, Janner J, Alonso-Coello P, Vestbo J. Outcomes reported on the management of COPD exacerbations: a systematic survey of randomised controlled trials. ERJ Open Res 2019;5:00072–2019.
OpenUrl

[14] 14.
Mathioudakis AG, Abroug F, Agusti A, et al. Core Outcome Set for the management of Acute Exacerbations of Chronic Obstructive Pulmonary Disease. The COS-AECOPD ERS Task Force study protocol. ERJ Open Res 2020;In press.

[15] 15.↵
Alkhaffaf B, Blazeby JM, Williamson PR, Bruce IA, Glenny AM. Reporting of outcomes in gastric cancer surgery trials: a systematic review. BMJ Open 2018;8:e021796.
OpenUrl Abstract/FREE Full Text

[16] 16.↵
World Health Organization. Novel Coronavirus COVID-19 therapeutic trial synopsis. Who R&D Blueprint 2020.

[17] 17.↵
Dodd S, Clarke M, Becker L, Mavergames C, Fish R, Williamson PR. A taxonomy has been developed for outcomes in medical research to help improve knowledge discovery. J Clin Epidemiol 2018;96:84–92.
OpenUrl PubMed

[18] 18.↵
Vincent JL, Moreno R, Takala J, et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. On behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine. Intensive Care Med 1996;22:707–10.
OpenUrl CrossRef PubMed Web of Science

[19] 19.↵
McGinley A, Pearse RM. A national early warning score for acutely ill patients. BMJ 2012;345:e5310.
OpenUrl FREE Full Text

[20] 20.↵
Garrigues E, Janvier P, Kherabi Y, et al. Post-discharge persistent symptoms and health-related quality of life after hospitalization for COVID-19. J Infect 2020.

[21] 21.
Halpin SJ, McIvor C, Whyatt G, et al. Postdischarge symptoms and rehabilitation needs in survivors of COVID-19 infection: A cross-sectional evaluation. J Med Virol 2020.

[22] 22.↵
Carfi A, Bernabei R, Landi F, Gemelli Against C-P-ACSG. Persistent Symptoms in Patients After Acute COVID-19. JAMA 2020;324:603–5.
OpenUrl CrossRef PubMed

[23] 23.↵
Mo X, Jian W, Su Z, et al. Abnormal pulmonary function in COVID-19 patients at time of hospital discharge. Eur Respir. 2020;55.

[24] 24.↵
George PM, Barratt SL, Condliffe R, et al. Respiratory follow-up of patients with COVID-19 pneumonia. Thorax 2020.

[25] 25.↵
Marshall J, Zhang J, Qiu R, Blackwood B, Sinha I, Tong A, Gargon L, Williamson P. The ‘metacos’ for research in COVID-19 hospitalised patients. Available from: http://www.comet-initiative.org/assets/downloads/COVID-19%20meta%20COS_Table%201_7th%20July%202020.pdf.

Outcomes evaluated in controlled clinical trials on the management of COVID-19: A methodological systematic review

Abstract

Background

Methods

Study selection and data extraction

Outcome grouping and classification

Results

Description of the included studies

Outcome measurement instruments

Mortality/ survival

Clinical outcomes

(Time to) Treatment success or treatment failure

Severity scores

Symptoms

Physiological outcomes

Oxygenation

Pulmonary function and physiology

Viral detection and load

Viral antibodies

Radiological outcomes

Inflammatory biomarkers

Other biomarkers

Pharmacokinetics/ Pharmacodynamics

Adverse events

Adverse events

Life impact

Resources use

Need for a (higher) level of care

Duration of stay in a specific level of care

Need for supplemental oxygen or NIV

Duration of supplemental oxygen or NIV

Need for other organ support

Other outcomes

Study follow-up

Discussion

Data Availability

Footnotes

References

Citation Manager Formats

Subject Area