medRxiv

An informatics approach to profiling patient experiences using electronic health records: constructing and clustering the burden space of individuals under 65 years of age with multiple long-term conditions

Mozhdeh Shiranirad, Zlatko Zlatev, Roberta Chiovoloni, Emilia Holland, Jakub Dylag, Nisreen A. Alwan, Ann Berrington, Michael Boniface, Simon D. S. Fraser, Rebecca B. Hoyle
doi: https://doi.org/10.64898/2025.11.27.25341182
Mozhdeh Shiranirad
1School of Mathematical Sciences, University of Southampton, Southampton, United Kingdom
  • For correspondence: M.Shiranirad{at}soton.ac.uk
Zlatko Zlatev
2School of Electronics and Computer Science, University of Southampton, Southampton, United Kingdom
Roberta Chiovoloni
3Population Data Science, Swansea University Medical School, Faculty of Medicine, Health and Life Science, Swansea University, Swansea, United Kingdom
Emilia Holland
4School of Primary Care, Population Sciences and Medical Education, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
Jakub Dylag
2School of Electronics and Computer Science, University of Southampton, Southampton, United Kingdom
Nisreen A. Alwan
4School of Primary Care, Population Sciences and Medical Education, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
6University Hospital Southampton NHS Foundation Trust, Southampton, United Kingdom
7National Institute for Health Research Applied Research Collaboration Wessex, Southampton, United Kingdom
Ann Berrington
5School of Economic, Social and Political Sciences, University of Southampton, Southampton, United Kingdom
Michael Boniface
2School of Electronics and Computer Science, University of Southampton, Southampton, United Kingdom
Simon D. S. Fraser
4School of Primary Care, Population Sciences and Medical Education, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
6University Hospital Southampton NHS Foundation Trust, Southampton, United Kingdom
Rebecca B. Hoyle
1School of Mathematical Sciences, University of Southampton, Southampton, United Kingdom

Abstract

Living with multiple long-term conditions (MLTC) profoundly affects patients’ lives: not only their health but also their financial, emotional, and social well-being, which can impose a significant burden. Here we take a novel approach, exploring the lived experience of individuals with MLTC by identifying patterns of burden (spanning physical, emotional, social, and financial domains) using machine learning techniques applied to electronic health records (EHR).

We constructed a cohort of 310,990 individuals born between January 1, 1958, and December 31, 1967, all with two or more long-term conditions. Proxy indicators of burden were extracted from EHR data. Using k-means clustering, we identified subgroups of individuals with distinct burden profiles and analyzed the distribution of burden indicators within each cluster.
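
As a rough illustration of the clustering step described above, the sketch below runs a plain k-means (Lloyd's algorithm) over binary burden-indicator vectors. It is a minimal sketch under stated assumptions, not the authors' pipeline: the indicator names, the toy records, the choice of k = 2, and the `kmeans` helper are all hypothetical. In practice a library implementation such as scikit-learn's `KMeans` would be the more usual choice.

```python
import random

# Hypothetical burden indicators (1 = recorded in the EHR, 0 = not recorded).
INDICATORS = ["pain", "anxiety", "depression", "mobility_problems"]

# Toy records, invented for illustration only.
points = [
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(data, k, n_iter=100, seed=0):
    """Lloyd's algorithm: returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct records chosen at random.
    centroids = [list(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each record joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


labels, centroids = kmeans(points, k=2)

# For 0/1 indicators, each centroid coordinate is the within-cluster
# prevalence of that indicator, which gives a simple burden profile.
for c, centre in enumerate(centroids):
    profile = {name: round(v, 2) for name, v in zip(INDICATORS, centre)}
    print(f"cluster {c}: n={labels.count(c)} profile={profile}")
```

Because the indicators here are binary, reading the centroids as prevalences is one simple way to inspect the distribution of burden indicators within each cluster.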

Several large clusters were characterized by high prevalence of one or more of pain, anxiety, and depression. Most clusters were predominantly female, with females over-represented compared to the overall burden cohort. Socioeconomic disparities were evident: clusters marked by pain had a higher proportion of individuals from the most deprived areas, while clusters characterized by stress or anxiety alone had a higher proportion of those from the least deprived areas. Certain combinations of burden indicators tended to be over-represented in the same clusters, such as pain with mobility problems, and depression with very high A&E arrivals and separation/divorce.
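
One simple way to make "over-represented" concrete is to compare a group's share within a cluster against its share in the whole cohort. The abstract does not state which measure the authors used, so the ratio below is an assumption for illustration, and the counts are invented.

```python
def representation_ratio(cluster_count, cluster_size, cohort_count, cohort_size):
    """Share of a group in a cluster divided by its share in the cohort.

    Values above 1 indicate over-representation; values below 1 indicate
    under-representation. This measure is an assumption, not the paper's.
    """
    cluster_share = cluster_count / cluster_size
    cohort_share = cohort_count / cohort_size
    return cluster_share / cohort_share


# Hypothetical counts: 70 of 100 cluster members are female,
# against 550 of 1000 in the overall cohort.
ratio = representation_ratio(70, 100, 550, 1000)
print(f"representation ratio: {ratio:.2f}")
```

The same ratio can be applied to deprivation quintiles or to individual burden indicators to compare clusters against the cohort baseline.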

This study demonstrates the utility of machine learning for uncovering nuanced, patient-centered patterns in the experience of living with MLTC. The clustering approach reveals how different burdens intersect and vary across demographic and socioeconomic lines, offering insights that could inform more tailored and equitable care strategies.

Author summary

Although a growing number of people are living with multiple long-term conditions (MLTCs), the nature of the burden faced by individuals and the common patterns of such person-centred burdens remain largely unknown. Previous MLTC studies have often clustered people by their long-term conditions to uncover how these conditions group together in electronic health records (EHRs). However, this approach does not capture the true complexity of MLTCs or their impact on patient experience. In this study, we identified a series of proxy burden indicators, highlighted the challenges of extracting them from EHRs, and developed data-driven methods to uncover important patterns of patient-centred burden within this large, complex space, opening new insights and a fresh research direction for understanding MLTCs. Health systems, policymakers, and clinicians stand to benefit from this study’s findings by gaining clearer insight into the expected challenges faced by different groups living with MLTCs, potentially informing more targeted support, smarter resource allocation, and better care outcomes. Researchers, in turn, benefit from a systematic methodology for clustering patient burden.

Competing Interest Statement

Rebecca B. Hoyle reports a relationship with Smith Institute Ltd that includes: scientific board membership. All other authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Funding Statement

Yes

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The study was conducted in accordance with the UK Policy Framework for Health and Social Care Research. Ethics approval for this study was obtained from the University of Southampton Faculty of Medicine Ethics committee (ERGO II Reference 66810). The SAIL Databank independent Information Governance Review Panel approved this study (SAIL Project: 1377).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

Data may be obtained from a third party and are not publicly available. The data used in this study are available in the SAIL Databank at Swansea University, Swansea, UK. Applications to access data via SAIL can be made following their established process https://saildatabank.com/data/apply-to-work-with-the-data/.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted December 02, 2025.
medRxiv 2025.11.27.25341182; doi: https://doi.org/10.64898/2025.11.27.25341182

Subject Area

  • Health Informatics