PT - JOURNAL ARTICLE
AU - Lexin Zhou
AU - Nekane Romero
AU - Juan Martínez-Miranda
AU - J Alberto Conejero
AU - Juan M García-Gómez
AU - Carlos Sáez
TI - Heterogeneity in COVID-19 severity patterns among age-gender groups: an analysis of 778 692 Mexican patients through a meta-clustering technique
AID - 10.1101/2021.02.21.21252132
DP - 2021 Jan 01
TA - medRxiv
PG - 2021.02.21.21252132
4099 - http://medrxiv.org/content/early/2021/03/03/2021.02.21.21252132.short
4100 - http://medrxiv.org/content/early/2021/03/03/2021.02.21.21252132.full
AB - We describe age-gender unbiased COVID-19 subphenotypes with respect to severity patterns, including prognostic, ICU and morbimortality outcomes, derived from patterns in clinical phenotypes, habits and demographic features. We used the Mexican Government COVID-19 open data, comprising patient-level data on 778692 SARS-CoV-2 patients as of September 2020. We applied a two-stage clustering approach combining dimensionality reduction and hierarchical clustering: 56 clusters from independent age-gender analyses supported 11 clinically distinguishable meta-clusters (MCs). MCs 1-3 showed high recovery rates (90.27-95.22%), including healthy patients of all ages, children with comorbidities with priority in medical resources, and young obese, smoker patients. MCs 4-5 showed moderate recovery rates (81.3-82.81%): patients with hypertension or diabetes of all ages, and obese patients with pneumonia, hypertension and diabetes. MCs 6-11 showed low recovery rates (53.96-66.94%): immunosuppressed patients with high comorbidity rate, CKD patients with poor survival and recovery, elderly smokers with COPD, severe diabetic elderly with hypertension, and oldest obese smokers with COPD and mild cardiovascular disease. Group outcomes were consistent with the recent literature on dedicated age-gender groups. Combinations of unhealthy habits and comorbidities were associated with mortality in older patients. Centenarians tended to have better outcomes.
Immunosuppression alone was not found to be a relevant factor for severity, but it was when present together with CKD. Severity varied substantially across Mexican states and types of clinical institution, which further studies should take into account. The resulting eleven MCs provide a basis for a deep understanding of the epidemiological and phenotypical severity presentation of COVID-19 patients based on comorbidities, habits, demographic characteristics, patient provenance and type of clinical institution. They also reveal correlations among these characteristics that can help anticipate the possible clinical outcomes of a patient with a specific profile. These results can establish groups for automated stratification or triage towards personalized treatment, enabling a personalized evaluation of the patient’s expected outcomes.
Code available at https://github.com/bdslab-upv/covid19-metaclustering
Dynamic results visualization at http://covid19sdetool.upv.es/?tab=mexicoGov
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This work was supported by Universitat Politècnica de València contract no.
UPV-SUB.2-1302 and FONDO SUPERA COVID-19 by CRUE-Santander Bank grant: Severity Subgroup Discovery and Classification on COVID-19 Real World Data through Machine Learning and Data Quality assessment (SUBCOVERWD-19).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: Using Open Data from the Government of Mexico, terms available at: https://datos.gob.mx/libreusomx
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes
The studied sample is available in our GitHub repository.
https://github.com/bdslab-upv/covid19-metaclustering
Abbreviations
COPD: Chronic Obstructive Pulmonary Disease
CKD: Chronic Kidney Disease
INMUSUPR: Immunosuppression
ICU: Intensive Care Unit
EHR: Electronic Health Record
ML: Machine Learning
DQ: Data Quality
RR: Recovery Rate
MC: Meta-Cluster
DIF: National System for Integral Family Development
IMSS: Mexican Institute of Social Security
ISSSTE: Institute for Social Security and Services for State Workers
PEMEX: Mexican Petroleum Institution
SEDENA: Secretariat of the National Defense
SEMAR: Secretariat of the Navy
SSA: Secretariat of Health
TIC: Type of Clinical Institution
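The two-stage idea described in the abstract (independent clustering within each age-gender group, then grouping the resulting clusters into meta-clusters) can be illustrated with a minimal sketch. This is not the authors' implementation, which is in the linked repository. It assumes binary comorbidity features, skips the dimensionality-reduction step entirely, and uses a plain average-linkage agglomerative clustering written with the standard library only. All names (`agglomerate`, `centroid`, `meta_cluster`), the group labels and the toy patients are hypothetical and are not taken from the study data.

```python
from itertools import combinations
from math import dist


def agglomerate(points, k):
    """Average-linkage agglomerative clustering; returns k lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Merge the closest pair of clusters (x < y, so deleting y is safe).
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def centroid(rows):
    """Per-feature mean, i.e. the prevalence of each binary feature in a cluster."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def meta_cluster(groups, k_per_group, k_meta):
    """Stage 1: cluster each group separately. Stage 2: cluster the cluster profiles."""
    profiles, labels = [], []
    for name, patients in groups.items():
        for c, members in enumerate(agglomerate(patients, k_per_group)):
            profiles.append(centroid([patients[i] for i in members]))
            labels.append((name, c))
    return [[labels[i] for i in mc] for mc in agglomerate(profiles, k_meta)]


# Hypothetical toy data: features are (hypertension, diabetes, obesity).
groups = {
    "female_40_49": [(0, 0, 0), (0, 0, 0), (1, 1, 0), (1, 1, 0)],
    "male_40_49": [(0, 0, 0), (0, 0, 0), (1, 1, 0), (1, 1, 1)],
}
meta_clusters = meta_cluster(groups, k_per_group=2, k_meta=2)
for mc in meta_clusters:
    print(mc)
```

In this toy run the healthy clusters of both groups end up in one meta-cluster and the hypertension-diabetes clusters in the other, which mirrors how a meta-cluster can span several age-gender groups. A real analysis would use a dedicated library and a dimensionality-reduction step suited to categorical data before clustering.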