PT - JOURNAL ARTICLE
AU - Costa-Santos, Cristina
AU - Luísa Neves, Ana
AU - Correia, Ricardo
AU - Santos, Paulo
AU - Monteiro-Soares, Matilde
AU - Freitas, Alberto
AU - Ribeiro-Vaz, Inês
AU - Henriques, Teresa
AU - Rodrigues, Pedro Pereira
AU - Costa-Pereira, Altamiro
AU - Pereira, Ana Margarida
AU - Fonseca, João
TI - COVID-19 surveillance - a descriptive study on data quality issues
AID - 10.1101/2020.11.03.20225565
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.11.03.20225565
4099 - http://medrxiv.org/content/early/2020/11/05/2020.11.03.20225565.short
4100 - http://medrxiv.org/content/early/2020/11/05/2020.11.03.20225565.full
AB - Background: High-quality data is crucial for guiding decision making and practicing evidence-based healthcare, especially if previous knowledge is lacking. Nevertheless, data quality frailties have been exposed worldwide during the current COVID-19 pandemic. Focusing on a major Portuguese surveillance dataset, our study aims to assess data quality issues and suggest possible solutions. Methods: On April 27th 2020, the Portuguese Directorate-General of Health (DGS) made a dataset (DGSApril) available to researchers upon request. On August 4th, an updated dataset (DGSAugust) was also obtained. Data quality was assessed by analysing data completeness and the consistency between the two datasets. Results: DGSAugust did not follow the same data format and variables as DGSApril, and a significant amount of missing data and a significant number of inconsistencies were found (e.g. 4,075 cases from DGSApril were apparently not included in DGSAugust). Several variables also showed a low degree of completeness and/or changed their values from one dataset to the other (e.g. for the variable ‘underlying conditions’, more than half of the cases showed different information between the two datasets).
There were also significant inconsistencies between the numbers of COVID-19 cases and deaths shown in DGSAugust and those in the reports that DGS publishes daily. Conclusions: The low quality of COVID-19 surveillance datasets limits their usability to inform good decisions and perform useful research. Major improvements in surveillance datasets are therefore urgently needed (e.g. simplification of data entry processes, constant monitoring of data, and increased training and awareness of health care providers), as low data quality may lead to deficient pandemic control.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: No funding.
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: not applicable. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes.
Data Availability: The data used in this work were made available by the Portuguese Directorate-General of Health, under the scope of article 39 of the decree law 2-B/2020 of April 2nd, and are available upon request.