medRxiv

Unsupervised Learning for Large Scale Data: The ATHLOS Project

Petros Barmpas, Sotiris Tasoulis, Aristidis G. Vrahatis, Panagiotis Anagnostou, Spiros Georgakopoulos, Matthew Prina, José Luis Ayuso-Mateos, Jerome Bickenbach, Ivet Bayes, Martin Bobak, Francisco Félix Caballero, Somnath Chatterji, Laia Egea-Cortés, Esther García-Esquinas, Matilde Leonardi, Seppo Koskinen, Ilona Koupil, Andrzej Pająk, Martin Prince, Warren Sanderson, Sergei Scherbov, Abdonas Tamosiunas, Aleksander Galas, Josep Maria Haro, Albert Sanchez-Niubo, Vassilis P. Plagianakos, Demosthenes Panagiotakos
doi: https://doi.org/10.1101/2021.04.01.21254751
Petros Barmpas
(a) Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
  • For correspondence: petrosbarmpas@uth.gr
Sotiris Tasoulis
(a) Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
Aristidis G. Vrahatis
(a) Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
Panagiotis Anagnostou
(a) Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
Spiros Georgakopoulos
(a) Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
Matthew Prina
(b) Social Epidemiology Research Group, Health Service and Population Research Department, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, UK
(c) Global Health Institute, King’s College London, London, UK
José Luis Ayuso-Mateos
(d) Centro de Investigación Biomédica en Red de Salud Mental, CIBERSAM, Madrid, Spain
(e) Department of Psychiatry, Universidad Autónoma de Madrid, Madrid, Spain
(f) Hospital Universitario de La Princesa, Instituto de Investigación Sanitaria Princesa (IIS Princesa), Madrid, Spain
Jerome Bickenbach
(g) Swiss Paraplegic Research, Guido A. Zäch Institute (GZI), Nottwil, Switzerland
(h) Department of Health Sciences & Health Policy, University of Lucerne, Lucerne, Switzerland
Ivet Bayes
(m) Research, Innovation and Teaching Unit, Parc Sanitari Sant Joan de Déu, Sant Boi de Llobregat, Spain
(d) Centro de Investigación Biomédica en Red de Salud Mental, CIBERSAM, Madrid, Spain
Martin Bobak
(i) Department of Epidemiology and Public Health, University College London, London, UK
Francisco Félix Caballero
(j) Department of Preventive Medicine and Public Health, Universidad Autónoma de Madrid/Idipaz, Madrid, Spain
(k) Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública, CIBERESP, Madrid, Spain
Somnath Chatterji
(l) Information, Evidence and Research, World Health Organization, Geneva, Switzerland
Laia Egea-Cortés
(m) Research, Innovation and Teaching Unit, Parc Sanitari Sant Joan de Déu, Sant Boi de Llobregat, Spain
Esther García-Esquinas
(j) Department of Preventive Medicine and Public Health, Universidad Autónoma de Madrid/Idipaz, Madrid, Spain
(k) Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública, CIBERESP, Madrid, Spain
Matilde Leonardi
(n) Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
Seppo Koskinen
(o) National Institute for Health and Welfare (THL), Helsinki, Finland
Ilona Koupil
(p) Centre for Health Equity Studies, Department of Public Health Sciences, Stockholm University, Stockholm, Sweden
(q) Department of Public Health Sciences, Karolinska Institutet, Stockholm, Sweden
Andrzej Pająk
(r) Department of Epidemiology and Population Studies, Jagiellonian University, Krakow, Poland
Martin Prince
(c) Global Health Institute, King’s College London, London, UK
(s) Centre for Global Mental Health, Health Service and Population Research Department, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, UK
Warren Sanderson
(t) International Institute for Applied Systems Analysis, World Population Program, Wittgenstein Centre for Demography and Global Human Capital, Laxenburg, Austria
(u) Department of Economics, Stony Brook University, Stony Brook, NY, United States of America
Sergei Scherbov
(t) International Institute for Applied Systems Analysis, World Population Program, Wittgenstein Centre for Demography and Global Human Capital, Laxenburg, Austria
(v) Austrian Academy of Science, Vienna Institute of Demography, Vienna, Austria
(w) Russian Presidential Academy of National Economy and Public Administration (RANEPA), Moscow, Russian Federation
Abdonas Tamosiunas
(x) Lithuanian University of Health Sciences, Kaunas, Lithuania
Aleksander Galas
(y) Department of Epidemiology and Preventive Medicine, Jagiellonian University, Krakow, Poland
Josep Maria Haro
(m) Research, Innovation and Teaching Unit, Parc Sanitari Sant Joan de Déu, Sant Boi de Llobregat, Spain
(d) Centro de Investigación Biomédica en Red de Salud Mental, CIBERSAM, Madrid, Spain
Albert Sanchez-Niubo
(m) Research, Innovation and Teaching Unit, Parc Sanitari Sant Joan de Déu, Sant Boi de Llobregat, Spain
(d) Centro de Investigación Biomédica en Red de Salud Mental, CIBERSAM, Madrid, Spain
Vassilis P. Plagianakos
(a) Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
Demosthenes Panagiotakos
(z) Department of Nutrition and Dietetics, School of Health Science and Education, Harokopio University, Athens, Greece

1 Abstract

Recent technological advancements in various domains, such as biomedicine and health, offer a plethora of big data for analysis. Part of this data pool comes from experimental studies that record many diverse features for each instance. This creates datasets of very high dimensionality with mixed data types, containing both numerical and categorical variables. Unsupervised learning, in turn, has been shown to assist with high-dimensional data, allowing the discovery of unknown patterns through clustering, visualization, dimensionality reduction, and in some cases, their combination. This work highlights unsupervised learning methodologies for large-scale, high-dimensional data, offering the potential of a unified framework that combines the knowledge retrieved from clustering and visualization. The main purpose is to uncover hidden patterns in a high-dimensional mixed dataset, which we achieve through an application to a complex, real-world dataset. The experimental analysis indicates the existence of notable information, demonstrating the usefulness of the methodological framework for similar high-dimensional, mixed, real-world applications.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This work is supported by the ATHLOS (Aging Trajectories of Health: Longitudinal Opportunities and Synergies) project, funded by the European Union's Horizon 2020 Research and Innovation Program under grant agreement number 635316.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Does not apply in our work

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

Data sharing is not applicable to this article

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted April 06, 2021.

Subject Area

  • Health Informatics