The methodological quality of COVID-19 systematic reviews is low, except for Cochrane reviews: a meta-epidemiological study

Objectives: The objective of this study was to investigate the methodological quality of COVID-19 systematic reviews (SRs) indexed in medRxiv and PubMed, compared with Cochrane COVID Reviews. Study Design and Setting: This is a cross-sectional meta-epidemiological study. We searched medRxiv, PubMed, and Cochrane Database of Systematic Reviews for SRs of COVID-19. We evaluated the methodological quality using A MeaSurement Tool to Assess systematic Reviews (AMSTAR) checklists. The maximum AMSTAR score is 11, and minimum is 0. Higher score means better quality. Results: We included 9 Cochrane reviews as well as randomly selected 100 non-Cochrane reviews in medRxiv and PubMed. Compared with Cochrane reviews (mean 9.33, standard deviation 1.32), the mean AMSTAR scores of the articles in medRxiv were lower (mean difference -2.85, 95%confidence intervals (CI): -0.96 to -4.74) and those in PubMed was also lower (mean difference -3.28, 95% CI: -1.40 to -5.15), with no difference between the latter two. Conclusions: It should be noted that AMSTAR is not a perfect tool of assessing quality SRs other than intervention. Readers should pay attention to the potentially low methodological quality of COVID-19 SRs in both PubMed and medRxiv but less so in Cochrane COVID reviews.

Keywords: preprints; peer-review; systematic review; methodological quality; meta-epidemiological study;  Running title The methodological quality of COVID-19 systematic reviews is low Word count: Abstract: 194 words, text: 2110 words . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Key findings
-The methodological quality of COVID-19 systematic reviews (SRs) in medRxiv and PubMed were lower than Cochrane COVID reviews.
-The methodological quality of reviews in medRxiv and PubMed did not differ.

What this study adds to what was known
-Expert opinions and a preliminary review suggested the low quality of COVID-19 SRs but this hypothesis has not been examined empirically.
-We evaluated the methodological quality of COVID-19 SRs using comprehensive search and confirmed that the quality was low except for Cochrane reviews.

What is the implication and what should change now?
Readers should pay attention to the potentially low methodological quality of COVID-19 SRs in both PubMed and medRxiv but less so in Cochrane COVID reviews.
The methodological quality of COVID-19 SRs except for Cochrane COVID reviews needed to be improved.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
Cochrane started Cochrane COVID Reviews to answer the time-sensitive needs of health decision makers as fast as possible. These reviews are intended to simultaneously assure the scientific quality [5].
While the number of studies is growing, questions are being raised about their methodological quality. A meta-epidemiological study that investigated the quality of randomized controlled studies (RCT) of COVID-19 pointed out the poor methodology [6]. A preliminary meta-epidemiological study including 18 COVID-19 systematic reviews (SRs) published until March 2020 also pointed out the poor methodology [7]. Despite the important role of SRs in clinical decision-making [8], the quality of COVID-19 SRs performed for speedy reporting has not been adequately evaluated. In addition, no studies assessed the quality of SRs published in preprint servers without peer review in comparison with other data sources. Hence, we investigated the methodological quality of COVID-19 reviews indexed in medRxiv, PubMed, and the Cochrane Library.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted September 2, 2020. .

protocol and study design
This is a cross-sectional meta-epidemiological study. We abode by the reporting guideline of meta-epidemiological study where applicable (S1 Table) [9]. We published the protocol prior to conducting this study [10].

Types of studies included
We included SRs, indexed in PubMed, medRxiv, and Cochrane Database of Systematic Reviews (CDSR). We included articles of topics related to the COVID-19 practice, irrespective of publication types. We included Cochrane Reviews that dealt with the COVID-19 pandemic. We excluded study protocols.
The definition of SRs was "a scientific investigation that focuses on a specific question and uses explicit, prespecified scientific methods to identify, select, assess, and summarize the findings of similar but separate studies." [11] 2.3 Search methods . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted September 2, 2020. . https://doi.org/10.1101/2020.08.28.20184077 doi: medRxiv preprint We retrieved abstracts from medRxiv COVID-19 SARS-CoV-2 preprints using search terms: "review", "evidence synthesis", "meta-analysis", or "metaanalysis" on 15 th June 2020. [12] We retrieved abstracts from PubMed using Shokraneh's filter for COVID-19 [13] and PubMed Systematic Reviews Filter [14] on 15 th June 2020. We retrieved abstracts from Cochrane Database of Systematic Reviews (CDSR) using search term "COVID-19" on 17 th Aug 2020.

Study selection
Two of three review authors (YK, SO, and TA) selected abstracts from search results independently.
Disagreements were solved through discussion. Two of three review authors (YK, SO, and TA) selected full text articles from selected abstracts independently. Disagreements were solved through discussion. Of the articles indexed in medRxiv and PubMed and meeting the eligibility criteria, we randomly selected 100 from each database for inclusion in the present study. We included all Cochrane reviews. We defined the methodological quality as "to what extent a study was designed, conducted, analyzed, interpreted, and reported to avoid systematic errors" [15].
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2020. . https://doi.org/10.1101/2020.08.28.20184077 doi: medRxiv preprint For calibration training, three review authors (YK, SO, and TA) independently evaluated the methodological quality using A MeaSurement Tool to Assess systematic Reviews (AMSTAR) checklists for five included articles [16]. Disagreements were resolved through discussion. Then one of three review authors (YK, SO, and TA) evaluated other articles. Another author (YK or SO) confirmed the results. We resolved disagreements through discussion.
We initially intended to use AMSTR 2, which is a critical appraisal tool for systematic reviews that include randomized or non-randomized studies of healthcare interventions [17]. Because there were few intervention reviews, we used AMSTAR due to the latter's applicability [14]. AMSTAR was developed for assessing the methodological quality of systematic reviews. For each of the 11 items in AMSTAR checklist, we calculated the AMSTAR score by counting the number of "Yes". Higher scores indicate higher quality of the systematic review. The possible maximum score was 11. We added some explanations to AMSTAR to reduce disagreements following a previous study after calibration training [18]. The details are shown in S2 Table.

Other variables
The number of included studies was counted at the time of their synthesis of each article. We evaluated types of research questions, the presence of protocol registrations, and the presence of limitation in each abstract. We determined that articles with "rapid" in the title is a "rapid review".
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2020. .

0
We also evaluated the presence of SPIN in the title or abstract conclusion in intervention systematic reviews whose first outcome was non-significant. [19] SPIN was judged present when there were the manipulation of language to potentially mislead readers from the likely truth of the results. One of three review authors (YK, SO, and TA) evaluated these variables. Another author (YK or SO) confirmed the results. We resolved the disagreements through discussion.

Outcomes
Our primary outcome was the total score of AMSTAR.

Data analysis
We described summary statistics. We used risk difference (RD) and 95% confidence intervals (CI) to compare binary variables. For quantification of the disagreements of AMSTAR checklists, we calculated kappa statistics between the initial and the final evaluations for the included studies except for the five articles used for calibration. We used ANOVA with Bonferroni correction for the comparison of total AMSTAR score. We conducted sensitivity analysis excluding scoping reviews or qualitative synthesis, because the summary score will be lower in those not intended for meta-analysis or risk-of-bias assessment. We used Stata ver. 16 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2020. .

Ethics
Because this study used only open data, ethics approval was not applicable.

Results of the search
The details of selection process are shown in Figure 1. We searched a total of 641 abstracts from medRxiv and PubMed. We randomly included 49 articles from medRxiv, and 51 articles from PubMed, respectively. We searched a total of 12 abstracts from CDSR, and included 9 articles.
Detailed citations are shown in S3 Table. Included articles from PubMed was not indexed in medRxiv, or CDSR.

Characteristics of included articles
The characteristics of included articles are shown in Table 1. We included 7 rapid reviews from medRxiv, 4 from PubMed, and 4 from Cochrane, respectively. The medians [interquartile range (IQR)] of included studies for qualitative synthesis were 29.5 [17-45.5] in medRxiv, 18.5  in PubMed, and 24  in Cochrane, respectively. More than a half of articles did not register their protocols.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Quality of included articles
to 88), respectively. Compared with Cochrane reviews, the mean scores of the reviews in medRxiv were lower (mean difference -2.85, 95%confidence intervals (CI): -0.96 to -4.74) and those in PubMed was also lower (mean difference -3.28, 95% CI: -1.40 to -5.15) ( There were six intervention articles whose first presented outcome was not statistically significant.
No articles expressed SPIN in both titles and conclusions of abstracts.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2020. .

Discussion
This is the first meta-epidemiological study evaluating methodological quality of SRs dealing with the COVID-19 pandemic including preprints, peer-reviewed articles, and Cochrane reviews.
Compared with Cochrane reviews, the mean scores of AMSTAR were lower in articles from medRxiv and PubMed, respectively. The mean scores of medRxiv and PubMed did not differ significantly.
Our findings suggest ordinary peer reviews could not improve the quality of SRs. The mean AMSTAR score difference between medRxiv (6.06) and PubMed (6.48) were not statistically significant. A previous study included SRs published in PubMed from China and US showed their mean AMSTAR score was 6.14 [20]. There's not much difference between our results and regular peer-reviewed SRs. This fact suggests it's not clear whether the quality of COVID-19 SRs will improve in the future with sufficient time of peer-review. At this point, readers should note that the methodological quality of SRs about COVID-19 in both PubMed and medRxiv may be of poor quality, but this is not the case in the Cochrane COVID reviews.
Strict structured reporting like Cochrane reviews would be more useful than time-limited peer reviews or without peer reviews [21]. Referring to limitations in each abstract were more often in Cochrane than in medRxiv or PubMed. The differences of the quality of the articles we found between Cochrane reviews and others included the presentation of protocols, listing included and . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2020. Some SRs published their protocols in preprint servers [28,29]. For speedy and assuring the scientific quality, protocol registration in preprint servers would be useful.
Our study has several limitations. First, AMSTAR is a reliable and valid measurement tool that has been used widely so far [30]. However, it was originally developed for SRs of randomized trials [31], and there is a criticism for the integrated evaluation of different domains [32]. Future research is needed to determine how to assess the quality of SRs which do not target at interventions. Second, we assessed the quality of the articles while not being masked about the published journal names.
While the empirical evidence shows that such a bias is unlikely [33], this information bias could . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2020. .

5
have led to overestimating the quality of Cochrane COVID reviews. Third, we were not able to incorporate studies that were published before and after peer review in the current study. Further study to compare studies published in peer-reviewed journals after publication in preprint servers is warranted.

Conclusions
Readers should pay attention to the potentially low methodological quality of SRs about COVID-19 in both PubMed and medRxiv but less so in Cochrane COVID reviews. The methodological quality of COVID-19 SRs except for Cochrane COVID reviews needed to be improved.

Author Contributions:
Yuki Kataoka had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: All authors.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Yuki Kataoka.
Critical revision of the manuscript for important intellectual content: All authors.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2020. .

6
Funding: self-funding . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2020. . https://doi.org/10.1101/2020.08.28.20184077 doi: medRxiv preprint 1 7   We calculated kappa statistics for the initial evaluation and final evaluation. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2020. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted September 2, 2020. . https://doi.org/10.1101/2020.08.28.20184077 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted September 2, 2020. . https://doi.org/10.1101/2020.08.28.20184077 doi: medRxiv preprint