TY - JOUR
T1 - Assessing Global Covid-19 Cases Data through Compositional Data Analysis (CoDa)
JF - medRxiv
DO - 10.1101/2020.12.17.20248424
SP - 2020.12.17.20248424
AU - Luis P.V. Braga
AU - Dina Feigenbaum
Y1 - 2020/01/01
UR - http://medrxiv.org/content/early/2020/12/19/2020.12.17.20248424.abstract
N2 - Background: Covid-19 cases data pose an enormous challenge to any analysis. Evaluating such a global pandemic requires matching reports that follow different procedures, and even overcoming censorship in some countries that restricts publication. Methods: This work proposes a methodology that could assist future studies. Compositional Data Analysis (CoDa) is proposed as the proper approach because Covid-19 cases data are compositional in nature. Under this methodology, three attributes were selected for each country: cumulative number of deaths (D); cumulative number of recovered patients (R); present number of patients (A). Results: After the operation called closure, with c=1, a ternary diagram, log-ratio plots and compositional statistics are presented. Cluster analysis is then applied, splitting the countries into discrete groups. Conclusions: This methodology can also be applied to other data sets, such as those for countries, cities, provinces or districts, to help authorities and governmental agencies improve their actions against a pandemic. Competing Interest Statement: The authors have declared no competing interest. Clinical Trial: In this study the data sources are WHO, CDC, ECDC, NHC, DXY, 1point3acres, Worldometers.info, BNO, the COVID Tracking Project (testing and hospitalizations), State and National Government Health Departments, and local media reports. A layer in the package ArcGis10 was created and maintained by the Center for Systems Science and Engineering (CSSE) at the Johns Hopkins University (CSSE 2020). This feature layer is supported by ESRI Living Atlas team, JHU APL and JHU Data Services.
This layer is open to the public and free to share. The cases dataset was downloaded from that repository on the 6th of September, 2020, and includes the following attributes: Country Name, Deaths (D), Recovered (R) and Active (A) patients. Funding Statement: No particular funding was used. Author Declarations: The data are available from legal open data sources, as already mentioned.
Deaths (D) and Recovered (R) are cumulative figures up to the download date, and Active (A) is the value available on that day. The raw cases data are displayed in Appendix I and the closure (acomp scale) in Appendix II; both are available at http://dx.doi.org/10.17632/wt7nd5jv6s.1
ER -