TY - JOUR
T1 - Unsupervised Discovery of Risk Profiles on Negative and Positive COVID-19 Hospitalized Patients
JF - medRxiv
DO - 10.1101/2020.12.30.20248908
SP - 2020.12.30.20248908
AU - Fahimeh Nezhadmoghadam
AU - Jose Tamez-Peña
Y1 - 2021/01/01
UR - http://medrxiv.org/content/early/2021/01/04/2020.12.30.20248908.abstract
N2 - COVID-19 is a viral disease that affects people in different ways: most develop mild symptoms, others require hospitalization, and a few die. Identifying risk factors is therefore vital to assist physicians in treatment decisions. The objective of this paper is to determine whether unsupervised analysis of the risk factors of positive and negative COVID-19 subjects can discover a small set of reliable and clinically relevant risk-profiles. We selected 13367 positive and 19958 negative hospitalized patients from the Mexican Open Registry. Registry patients were described by 13 risk factors, three different outcomes, and COVID-19 test results. Hence, the dataset could be described by 6144 different risk-profiles per age group. To discover the most common risk-profiles, we propose the use of unsupervised learning. The data was split into discovery (70%) and validation (30%) sets. The discovery set was analyzed using the partitioning around medoids (PAM) method, and robust consensus clustering was used to estimate the stable set of risk-profiles. We validated the reliability of the PAM models by predicting the risk-profile of the validation set subjects. The clinical relevance of the risk-profiles was evaluated on the validation set by characterizing the prevalence of the three patient outcomes: pneumonia diagnosis, ICU admission, or death. The analysis discovered six positive and five negative COVID-19 risk-profiles with strong statistical differences among them.
Hence, PAM clustering with consensus mapping is a viable method for unsupervised risk-profile discovery among subjects with critical respiratory health issues. Competing Interest Statement: The authors have declared no competing interest. Clinical Trial: Our study was not a clinical trial. Funding Statement: This research was supported with funding from the Mexican National Council for Science and Technology (CONACYT). Author Declarations: Our work did not require ethical oversight. https://github.com/FahimehN/COVID-19-Risk-Profiles-Discovering
ER -