medRxiv

Heterogeneity in COVID-19 severity patterns among age-gender groups: an analysis of 778 692 Mexican patients through a meta-clustering technique

Lexin Zhou, Nekane Romero, Juan Martínez-Miranda, J Alberto Conejero, Juan M García-Gómez, Carlos Sáez
doi: https://doi.org/10.1101/2021.02.21.21252132
Lexin Zhou, Nekane Romero, Juan M García-Gómez, Carlos Sáez: Biomedical Data Science Lab, Instituto Universitario de Tecnologías de la Información y Comunicaciones (ITACA), Universitat Politècnica de València (UPV), Camino de Vera s/n, Valencia 46022, Spain
J Alberto Conejero: Instituto Universitario de Matemática Pura y Aplicada (IUMPA), Universitat Politècnica de València, Valencia, Spain
Juan Martínez-Miranda: CONACyT - Centro de Investigación Científica y de Educación Superior de Ensenada - CICESE-UT3, Mexico
For correspondence: Carlos Sáez, carsaesi@upv.es

Abstract

Objective To describe COVID-19 subphenotypes in terms of severity patterns, including prognosis, ICU admission and morbimortality outcomes, stratified by gender and age group and derived from inter-patient variability in clinical phenotypes and demographic features.

Materials and methods We used the COVID-19 open data from the Mexican Government including patient-level epidemiological and clinical data from 778 692 SARS-CoV-2 patients from January 13, 2020 to September 30, 2020.

Inter-patient variability was analyzed by combining dimensionality reduction and hierarchical clustering methods. We produced cluster analyses for all combinations of gender and age groups (<18, 18-49, 50-64, and >64). For each group, the optimum number of clusters was selected combining a quantitative approach using the Silhouette coefficient, and a qualitative approach through a subgroup expert inspection via visual analytics. Using the features of the resultant age-gender clusters, we performed a meta-clustering analysis to provide an overall description of the population.
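The quantitative part of this selection step can be illustrated with a minimal sketch. It is not the authors' implementation (their code is in the repository linked below). It assumes average-linkage agglomerative clustering with Euclidean distance, skips the dimensionality reduction step, and runs on hypothetical toy points. The function names `agglomerate`, `silhouette` and `best_k` are introduced here for illustration only.

```python
from itertools import combinations
from math import dist

def agglomerate(points, k):
    """Average-linkage agglomerative clustering; returns one label per point."""
    clusters = [[i] for i in range(len(points))]

    def linkage(pair):
        # Mean pairwise distance between the members of two clusters.
        a, b = pair
        total = sum(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
        return total / (len(clusters[a]) * len(clusters[b]))

    while len(clusters) > k:
        # combinations() yields a < b, so deleting b never shifts a.
        a, b = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[a] += clusters[b]
        del clusters[b]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; needs at least two clusters."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in groups[lab] if j != i]
        if not own:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(sum(dist(points[i], points[j]) for j in members) / len(members)
                for other, members in groups.items() if other != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

def best_k(points, candidates):
    """Return the candidate k with the highest mean silhouette, plus all scores."""
    scored = {k: silhouette(points, agglomerate(points, k)) for k in candidates}
    return max(scored, key=scored.get), scored

# Hypothetical toy data: three well-separated groups in two dimensions.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
k, scores = best_k(points, range(2, 6))
print(k, scores)
```

In the study the Silhouette coefficient was only one input; the final number of clusters per group also depended on expert inspection, which a score alone cannot reproduce.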

Results We observed a total of 56 age-gender clusters, grouped in 11 clinically distinguishable meta-clusters with different outcomes. Meta-clusters 1 to 3 showed the highest recovery rates (90.27-95.22%). These meta-clusters include: (1) healthy patients of all ages, (2) children with comorbidities who had priority in medical resources, (3) young patients with obesity and a smoking habit. Meta-clusters 4 and 5 showed moderate recovery rates (81.3-82.81%): (4) patients with hypertension or diabetes of all ages, (5) typical obese patients with three highly correlated conditions, namely pneumonia, hypertension and diabetes. Meta-clusters 6 to 11 had very low recovery rates (53.96-66.94%): (6) immunosuppressed patients with the highest comorbidity rate in many diseases, (7) CKD patients with the worst survival length and recovery, (8) elderly smokers with mild COPD, (9) severely diabetic elderly patients with hypertension, (10, 11) the oldest obese smokers with severe COPD and mild cardiovascular disease, with the latter (11) showing a relatively higher age and smoking rate, more severe COPD and shorter survival length, reinforcing a high correlation between smoking habit and COPD among the elderly. Additionally, the source Mexican state and the type of clinical institution proved to be important factors for heterogeneity in severity.
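The grouping of age-gender clusters into meta-clusters can be sketched as a second clustering pass over cluster-level summaries. The example below is an illustration under stated assumptions, not the authors' method: it summarizes each cluster by the mean of its binary features, uses average-linkage agglomeration with Euclidean distance, and relies on hypothetical cluster names, feature columns and patient rows.

```python
from itertools import combinations
from math import dist
from statistics import mean

def cluster_profile(patients):
    """Summarize one age-gender cluster as the mean of each binary feature."""
    return tuple(mean(column) for column in zip(*patients))

def meta_cluster(profiles, n_meta):
    """Group cluster profiles into n_meta groups by average-linkage agglomeration."""
    groups = [[name] for name in profiles]

    def avg_link(pair):
        # Mean distance between the profiles of two candidate groups.
        a, b = pair
        return mean(dist(profiles[x], profiles[y])
                    for x in groups[a] for y in groups[b])

    while len(groups) > n_meta:
        # combinations() yields a < b, so deleting b never shifts a.
        a, b = min(combinations(range(len(groups)), 2), key=avg_link)
        groups[a] += groups[b]
        del groups[b]
    return groups

# Hypothetical toy clusters; columns are (diabetes, hypertension, obesity, smoker).
clusters = {
    "F_18-49_c1": [(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 1, 0)],
    "M_18-49_c1": [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 0)],
    "F_>64_c2": [(1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 0, 0)],
    "M_>64_c2": [(1, 1, 0, 0), (1, 0, 0, 0), (1, 1, 0, 1)],
}
profiles = {name: cluster_profile(rows) for name, rows in clusters.items()}
meta = meta_cluster(profiles, n_meta=2)
print(meta)
```

Clustering the profiles rather than the patients keeps the second pass small (56 items in the study) and lets clusters from different age-gender groups end up in the same meta-cluster when their feature profiles are similar.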

Discussion The proposed unsupervised learning approach successfully uncovered discriminative COVID-19 severity patterns for both genders and all age groups from clinical phenotypes and demographic features. A careful reading of group outcomes showed results consistent with recent literature. Regarding the Mexican population, our results suggest that habits and comorbidities may play a key role in predicting mortality in older patients. Centenarians repeatedly fell into the groups with better outcomes. Additionally, immunosuppression was not found to be a relevant factor for severity on its own, but it was when present together with chronic kidney disease. Further useful correlations could be found by evaluating the duration of unhealthy habits, demographic features, comorbidities, the time since diagnosis, recovery progress, readmission record, and the effect of source variability.

Conclusion The resultant eleven meta-clusters provide a basis for understanding the classification of patients with COVID-19 according to comorbidities, habits, demographic characteristics, geographic data and type of clinical institution. They also reveal the correlations between these characteristics, which helps to anticipate the possible clinical outcomes for each specifically characterized patient. These subphenotypes can establish target groups for automated stratification or triage systems to provide personalized therapies or treatments.

Code available at: https://github.com/bdslab-upv/covid19-metaclustering

Dynamic results visualization at: http://covid19sdetool.upv.es/?tab=mexicoGov

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This work was supported by Universitat Politècnica de València contract no. UPV-SUB.2-1302 and FONDO SUPERA COVID-19 by CRUE-Santander Bank grant: Severity Subgroup Discovery and Classification on COVID-19 Real World Data through Machine Learning and Data Quality assessment (SUBCOVERWD-19).

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Using Open Data from the Government of Mexico, terms available at: https://datos.gob.mx/libreusomx

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

The studied sample is available in our GitHub repository.

https://github.com/bdslab-upv/covid19-metaclustering

  • Abbreviations

    COPD: Chronic Obstructive Pulmonary Disease
    CKD: Chronic Kidney Disease
    INMUSUPR: Immunosuppression
    ICU: Intensive Care Unit
    EHR: Electronic Health Record
    RR: Recovery Rate
    MC: Meta-Cluster
    DIF: National System for Integral Family Development
    IMSS: Mexican Institute of Social Security
    ISSSTE: Institute for Social Security and Services for State Workers
    PEMEX: Mexican Petroleum Institution
    SEDENA: Secretariat of the National Defense
    SEMAR: Secretariat of the Navy
    SSA: Secretariat of Health
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
    Posted February 23, 2021.
    Subject Area

    • Epidemiology