Abstract
Data visualization is an essential tool for exploring and communicating findings in medical research, and especially in epidemiological surveillance. The COVID19-Tracker web application systematically produces daily updated data visualization and analysis of the SARS-CoV-2 epidemic in Spain. It automatically collects daily data on COVID-19 diagnosed cases, intensive care unit admissions, and mortality, from February 24th, 2020 onwards. Three applications have already been developed: 1) to analyze data trends and estimate short-term projections; 2) to estimate the case fatality rate; and 3) to assess the effect of the lockdown measures on the trends of incident data. The application may help to better understand the SARS-CoV-2 epidemic data in Spain.
1. INTRODUCTION
The first confirmed cases of SARS-CoV-2 in Spain were identified in late February 2020 (1). By April 8th, Spain had become the second most affected country worldwide (148,220 diagnosed cases) and had recorded the third-highest number of deaths (14,792 deaths) due to the SARS-CoV-2 pandemic (2). Since March 16th, lockdown measures aimed at flattening the epidemic curve have been in place in Spain, restricting social contact, reducing public transport, and closing businesses, except for those essential to the country’s supply chains (3). However, this was not enough to change the rising trend of the epidemic. For this reason, a more restrictive lockdown was suggested (4), and eventually undertaken by the Spanish Government on March 30th (5).
Data visualization is an essential tool for exploring and communicating findings in medical research, and especially in epidemiological surveillance. It can help researchers and policymakers to identify and understand trends that could be overlooked if the data were reviewed in tabular form. We have developed a Shiny app that allows users to evaluate daily time-series data from a statistical standpoint. The COVID19-Tracker app systematically produces daily updated data visualization and analysis of SARS-CoV-2 epidemic data in Spain. It is easy to use and fills a role in the tool space for visualization, analysis, and exploration of epidemiological data during this particular scenario.
2. SOFTWARE AVAILABILITY AND REQUIREMENTS
The COVID19-Tracker app has been developed in RStudio (6), version 1.2.5033, using the Shiny package, version 1.4.0. Shiny offers the ability to develop a graphical user interface (GUI) that can be run locally or deployed online. The latter is particularly beneficial for showing and communicating updated findings to a broad audience. All the analyses have been carried out using R, version 3.6.3.
The application has a friendly, menu-based structure that shows the data visualizations for each of the analyses currently implemented: projections, fatality rates, and intervention analysis (Figure 1).
Projections and Projections by age display the trends for diagnosed cases, ICU admissions, and mortality since the epidemic began, and estimate a 3-day projection (Figures 2a and 2b, respectively).
Fatality and Fatality by age display the trends for the case fatality rates (Figure 2c).
Intervention displays and calculates the effect of the lockdown periods on the trend of incident daily diagnosed cases, ICU admissions, and mortality (Figure 2d).
We also introduced two additional menus to describe the Methodology, reporting the statistical details on the analyses already implemented, and Other apps, which collects applications also developed in Shiny by other users to follow the COVID19 epidemic in Spain and globally.
The app has an automated process that updates the data and all analyses every time a user connects to the app. It is available online at the following link: https://ubidi.shinyapps.io/covid19/ and will shortly be freely available on GitHub as an R package. The produced graphs are mouse-sensitive, showing the observed and expected number of events throughout the plot. Likewise, when any plot is selected, the application offers the option of downloading it as a portable network graphic (*.png). All menus are available in English, Spanish, and Catalan.
3. DATA SOURCES
We collected daily data on COVID-19 diagnosed cases, intensive care unit (ICU) admissions, and mortality, from February 24th onwards. Data are collected automatically every day from the Datadista GitHub repository (7). This repository updates data according to the calendar and rate of publication of the Spanish Ministry of Health/Instituto de Salud Carlos III (8).
Data corresponding to the available number of ICU beds in Spain (year 2017) are also obtained from the Datadista Github repository (7).
4. METHODS
4.1. Projections
To estimate the observed data trends for the accumulated number of events, we used a Poisson regression model (9), allowing for over-dispersion (10), fitting a quadratic effect:

$$\log(E(c_t)) = \beta_0 + \beta_1 t + \beta_2 t^2$$
Where t = 1, 2, …, T, represents the time unit (from the first observed day until the last, T consecutive days in total), and ct is the accumulated number of events. The estimated regression parameters and their standard errors are used to obtain the short-term projections, up to three days, and their 95% confidence interval (95% CI).
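The fitting and projection steps described above can be sketched in a few lines. The application itself is written in R; the Python code below is only an illustrative sketch, not the authors' implementation. It assumes iteratively reweighted least squares for the Poisson fit, a Pearson-based dispersion estimate to allow for over-dispersion, and a normal quantile (1.96) for the 95% CI. All function names are hypothetical and the counts are synthetic.

```python
import math

def solve(A, b):
    """Solve the small linear system A x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def design_row(t):
    """Covariates of the quadratic trend: intercept, t and t squared."""
    return [1.0, float(t), float(t * t)]

def weighted_gram(X, w):
    """Compute X' W X for a diagonal weight vector w."""
    k = len(X[0])
    return [[sum(wi * x[i] * x[j] for x, wi in zip(X, w)) for j in range(k)]
            for i in range(k)]

def fit_quadratic_poisson(counts, n_iter=25):
    """Fit log E(c_t) = b0 + b1*t + b2*t^2 for t = 1..T by IRLS."""
    T = len(counts)
    X = [design_row(t) for t in range(1, T + 1)]
    mu = [max(float(c), 0.5) for c in counts]  # starting values
    beta = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        # Working response of the log link; the IRLS weights equal mu.
        z = [math.log(m) + (c - m) / m for c, m in zip(counts, mu)]
        XtWz = [sum(m * x[i] * zi for x, m, zi in zip(X, mu, z)) for i in range(3)]
        beta = solve(weighted_gram(X, mu), XtWz)
        mu = [math.exp(sum(b * xi for b, xi in zip(beta, x))) for x in X]
    # Pearson dispersion estimate (quasi-Poisson style over-dispersion).
    phi = sum((c - m) ** 2 / m for c, m in zip(counts, mu)) / (T - 3)
    return beta, weighted_gram(X, mu), phi

def project(beta, XtWX, phi, T, horizon=3, z_crit=1.96):
    """Project `horizon` days ahead with an approximate 95% CI on the log scale."""
    rows = []
    for t in range(T + 1, T + horizon + 1):
        x = design_row(t)
        eta = sum(b * xi for b, xi in zip(beta, x))
        var = phi * sum(xi * vi for xi, vi in zip(x, solve(XtWX, x)))
        se = math.sqrt(max(var, 0.0))
        rows.append((t, math.exp(eta),
                     math.exp(eta - z_crit * se), math.exp(eta + z_crit * se)))
    return rows

# Synthetic accumulated counts for 20 days (illustrative only, not real data).
counts = [round(math.exp(1.0 + 0.3 * t - 0.004 * t * t) * (1 + 0.05 * math.sin(t)))
          for t in range(1, 21)]
beta, XtWX, phi = fit_quadratic_poisson(counts)
projections = project(beta, XtWX, phi, T=len(counts))
for t, pred, low, high in projections:
    print(f"day {t}: {pred:.0f} (95% CI {low:.0f} to {high:.0f})")
```

The interval is built on the linear predictor and then exponentiated, so it is asymmetric around the projected count and can never fall below zero.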
Results are available nationwide by default, and at the regional level through the corresponding dropdown menu. Trends and projections are also calculated by age group (0-39, 40-49, 50-59, 60-69, 70-79, and 80 or more years).
We should note that previous versions of this application also considered an alternative model that included only the linear trend. The models were compared using a likelihood ratio test. Based on the evolution of the epidemic, we observed that the quadratic model described above provided the best fit, so it is the model used in the current version. The goodness of fit of the models is evaluated regularly, in case a reformulation could provide a better fit to the data during the course of the epidemic.
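The comparison between the linear and quadratic models can be illustrated as follows. This is a minimal sketch of a likelihood ratio test for two nested Poisson models that differ by one parameter, not necessarily the exact procedure used in the application. It assumes no over-dispersion; with a quasi-Poisson fit, an F-test on the scaled deviance is commonly preferred. The function names are hypothetical.

```python
import math

def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))."""
    dev = 0.0
    for y, mu in zip(observed, fitted):
        term = y * math.log(y / mu) if y > 0 else 0.0
        dev += 2.0 * (term - (y - mu))
    return dev

def lr_test_1df(deviance_linear, deviance_quadratic):
    """Likelihood ratio test for one extra parameter (the quadratic term).

    The statistic is the drop in deviance. For a chi-square distribution with
    1 degree of freedom, P(X > s) equals erfc(sqrt(s / 2)).
    """
    stat = deviance_linear - deviance_quadratic
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value
```

A small p-value indicates that adding the quadratic term improves the fit beyond what chance alone would explain.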
4.2. Case fatality rate
The case fatality rate is defined as the ratio between the number of deaths and the diagnosed cases (11). Thus, an offset is fitted into the Poisson regression model, as the logarithm of the diagnosed cases:

$$\log(E(m_t)) = \beta_0 + \beta_1 t + \beta_2 t^2 + \log(c_t)$$
Where mt is the daily number of deaths, and ct is the daily number of diagnosed cases. Case fatality rates are also calculated for the same age groups.
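The role of the offset can be seen in the simplest possible case. The sketch below assumes an intercept-only model, log E(m_t) = b0 + log(c_t), which is a simplification of the model used in the application. In that case the maximum likelihood estimate of exp(b0) is the total number of deaths divided by the total number of diagnosed cases, that is, a pooled case fatality rate. The function names and figures are hypothetical.

```python
import math

def pooled_cfr(deaths, cases):
    """MLE of exp(b0) in the intercept-only Poisson model with offset log(c_t)."""
    return sum(deaths) / sum(cases)

def score_b0(b0, deaths, cases):
    """Derivative of the Poisson log-likelihood with respect to b0.

    With E(m_t) = exp(b0) * c_t, the score is sum(m_t - exp(b0) * c_t),
    which is zero at the maximum likelihood estimate.
    """
    return sum(m - math.exp(b0) * c for m, c in zip(deaths, cases))

# Illustrative daily figures (not real data).
deaths = [2, 5, 9]
cases = [100, 150, 200]
cfr = pooled_cfr(deaths, cases)
print(f"pooled case fatality rate: {100 * cfr:.2f}%")
print(f"score at the estimate: {score_b0(math.log(cfr), deaths, cases):.2e}")
```

Because the offset enters with a fixed coefficient of one, the remaining terms of the model describe the rate of deaths per diagnosed case rather than the raw death count.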
We should acknowledge that it is not possible to make an accurate estimate of the case fatality rates, because diagnosed cases are underreported in official statistics (12). Nonetheless, estimating and monitoring the case fatality rates is of special interest in the current epidemic scenario.
4.3. Intervention analysis
To assess the effect of the lockdown on the trend of incident cases, ICU admissions, and mortality, we used an interrupted time-series design (13). The data are analyzed using quasi-Poisson regression with an interaction model to estimate the change in trend:

$$\log(E(c_t)) = \beta_0 + \beta_1 t + \beta_2\,\mathrm{lockdown} + \beta_3\, t \cdot \mathrm{lockdown}$$

Where lockdown is a variable that identifies the intervals before and during the lockdown periods imposed by the Spanish Government (3,5) (0=before March 15th, 2020; 1=between March 16th and March 29th, 2020; and 2= after March 30th, 2020).
We should acknowledge that this is a descriptive analysis without predictive purposes. For ease of interpretation, and for comparison of the effectiveness of lockdown measures between countries, a linear trend is assumed before and after the lockdown (14). Although residual autocorrelation is not accounted for, the estimates remain unbiased but are possibly inefficient. This analysis also shows the nationwide results in a table reporting the mean daily percentage (%) increase and its 95% CI.
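Two pieces of this analysis lend themselves to a short illustration: the coding of the lockdown variable and the conversion of a log-linear slope into a mean daily percentage increase. The sketch below is illustrative only. It assumes that March 15th belongs to the pre-lockdown period and March 30th to the second lockdown period, and that each period has its own slope (the time coefficient plus the corresponding interaction term). The function names and coefficient values are hypothetical.

```python
import math
from datetime import date

FIRST_LOCKDOWN = date(2020, 3, 16)   # start of the first lockdown period
SECOND_LOCKDOWN = date(2020, 3, 30)  # start of the more restrictive lockdown

def lockdown_period(day):
    """Code a calendar day as 0 (before), 1 (first) or 2 (second lockdown)."""
    if day < FIRST_LOCKDOWN:
        return 0
    if day < SECOND_LOCKDOWN:
        return 1
    return 2

def daily_pct_increase(slope):
    """Mean daily percentage increase implied by a slope on the log scale."""
    return (math.exp(slope) - 1.0) * 100.0

# Hypothetical per-period slopes on the log scale (not estimates from real data).
slope_by_period = {0: 0.25, 1: 0.15, 2: 0.03}
for period, slope in slope_by_period.items():
    print(f"period {period}: {daily_pct_increase(slope):.1f}% per day")
```

Since the model is linear on the log scale, a slope of b corresponds to multiplying the expected count by exp(b) each day, which is why the percentage is computed as (exp(b) - 1) x 100.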
5. FURTHER DEVELOPMENT
So far, the COVID19-Tracker app has been very well received online, with a large number of connections generating an outsized memory usage on our server (Figure 3).
We keep improving the application by uploading new data visualizations, which may help to better understand the SARS-CoV-2 epidemic data in Spain. Moreover, the COVID19-Tracker app could also be extended to data visualizations for other countries and geographical regions.
Discussion
The COVID19-Tracker application presents a set of tools for updated analysis and graphic visualization that can be very useful for a better understanding of the evolution of the COVID-19 epidemic in Spain and its epidemiological surveillance.
As limitations, we should note that the application does not take into account changes in the definition of a diagnosed COVID-19 case, nor the population exposed. The number of events is therefore modeled directly instead of the incidence rate, assuming that the entire population is at risk, except for the case fatality rate. In addition, the analyses are not free from the biases linked to the source of information provided by the Ministry of Health (8), which is collected on a daily basis through the Datadista GitHub repository (7).
We continue to plan improvements to the app to include new analytics and visualizations. Also, the application could be extended for use in other countries or geographic areas. In summary, this easy-to-use application fills a gap in this particular scenario for the visualization of epidemiological data for the COVID-19 epidemic in Spain.
Data Availability
Data on COVID-19 diagnosed cases, intensive care unit (ICU) admissions and mortality is available from the Datadista github repository. This repository updates data according to the calendar and rate of publication of the Spanish Ministry of Health/Instituto de Salud Carlos III.
https://github.com/datadista/datasets/tree/master/COVID%2019
Funding
None.
Conflict of interest
None.
Acknowledgements
None.