PT - JOURNAL ARTICLE
AU - George D. Vavougios
AU - Christoforos Konstantatos
AU - Pavlos-Christoforos Sinigalias
AU - Sotirios G. Zarogiannis
AU - Konstantinos Kolomvatsos
AU - George Stamoulis
AU - Konstantinos I. Gourgoulianis
TI - Data driven phenotyping and COVID-19 case definitions: a pattern recognition approach
AID - 10.1101/2021.04.30.21256219
DP - 2021 Jan 01
TA - medRxiv
PG - 2021.04.30.21256219
4099 - http://medrxiv.org/content/early/2021/05/03/2021.04.30.21256219.short
4100 - http://medrxiv.org/content/early/2021/05/03/2021.04.30.21256219.full
AB - Introduction: COVID-19 has pathological pulmonary as well as several extrapulmonary manifestations, and thus many different symptoms may arise in patients. The aim of our study was to determine COVID-19 syndromic phenotypes in a data driven manner using survey results extracted from Carnegie Mellon University’s Delphi Group. Methods: Monthly survey results (>1 million responders per month; 320,326 responders with a positive COVID-19 test and disease duration <30 days were included in this study) were used sequentially to identify and validate COVID-19 syndromic phenotypes. Logistic Regression Weighted Multiple Correspondence Analysis (LRW-MCA) was used as a preprocessing procedure to weight and transform the symptoms recorded by the survey into eigenspace coordinates (i.e. object scores per case / dimension), with the goal of capturing a total variance of >75%. These scores, along with symptom duration, were subsequently used by the Two Step Clustering algorithm to produce symptom clusters. Post-hoc logistic regression models adjusting for age, gender and comorbidities, and confirmatory linear principal components analyses, were used to further explore the data. The model created from the 66,165 responders included in August was subsequently validated on data from March–December 2020. Results: Five validated COVID-19 syndromes were identified in August: 1.
Afebrile (0%), Non-Coughing (0%), Oligosymptomatic (ANCOS); 2. Febrile (100%) Multisymptomatic (FMS); 3. Afebrile (0%) Coughing (100%) Oligosymptomatic (ACOS); 4. Oligosymptomatic with additional self-described symptoms (100%; OSDS); and 5. Olfaction / Gustatory Impairment Predominant (100%; OGIP). Discussion: We present 5 distinct symptom phenotypes within the COVID-19 spectrum that remain stable within 9–12 days of first symptom onset. The typical febrile respiratory phenotype is presented as a minority among identified syndromes, a finding that may impact both epidemiological surveillance norms and transmission dynamics. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: No funding to be reported. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Deidentified data were provided by Carnegie Mellon University to the University of Thessaly via a project-based collaboration between the two institutions, consolidated and outlined by a Research Data Use Agreement. The Research Data Use Agreement was made and entered into as of August 21, 2020 by and between Carnegie Mellon University, a Pennsylvania non-profit corporation, and the Department of Respiratory Medicine, Faculty of Medicine, University of Thessaly, a Non-profit organization / University having its principal place of business at Biopolis, P.C. 41500 Larissa, Greece.
The Carnegie Mellon University (CMU) Institutional Review Board approved the original survey protocol and instrument (IRB Approval Registration Number: STUDY2020_00000162). All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The analyses and all related files are available upon request. https://cmu-delphi.github.io/delphi-epidata/symptom-survey/