Skip to main content
medRxiv
  • Home
  • About
  • Submit
  • ALERTS / RSS
Advanced Search

Geo-social gradients in predicted COVID-19 prevalence and severity in Great Britain: results from 2,266,235 users of the COVID-19 Symptoms Tracker app

View ORCID ProfileRuth C E Bowyer, View ORCID ProfileThomas Varsavsky, View ORCID ProfileCarole H. Sudre, View ORCID ProfileBenjamin A K Murray, View ORCID ProfileMaxim B Freidin, View ORCID ProfileDarioush Yarand, View ORCID ProfileSajaysurya Ganesh, View ORCID ProfileJoan Capdevila, View ORCID ProfileEllen J Thompson, View ORCID ProfileElco Bakker, View ORCID ProfileM. Jorge Cardoso, View ORCID ProfileRichard Davies, View ORCID ProfileJonathan Wolf, View ORCID ProfileTim D Spector, View ORCID ProfileSebastien Ourselin, View ORCID ProfileClaire J Steves, View ORCID ProfileCristina Menni
doi: https://doi.org/10.1101/2020.04.23.20076521
Ruth C E Bowyer
1Department of Twin Research and Genetic Epidemiology, King’s College London, Westminster Bridge Road, SE17EH London, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Ruth C E Bowyer
Thomas Varsavsky
2School of Biomedical Engineering & Imaging Sciences, King’s College London, SE1 7EU London UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Thomas Varsavsky
Carole H. Sudre
2School of Biomedical Engineering & Imaging Sciences, King’s College London, SE1 7EU London UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Carole H. Sudre
Benjamin A K Murray
2School of Biomedical Engineering & Imaging Sciences, King’s College London, SE1 7EU London UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Benjamin A K Murray
Maxim B Freidin
1Department of Twin Research and Genetic Epidemiology, King’s College London, Westminster Bridge Road, SE17EH London, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Maxim B Freidin
Darioush Yarand
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Darioush Yarand
Sajaysurya Ganesh
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Sajaysurya Ganesh
Joan Capdevila
3Zoe Global Limited, 164 Westminster Bridge Road, London SE1 7RW, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Joan Capdevila
Ellen J Thompson
1Department of Twin Research and Genetic Epidemiology, King’s College London, Westminster Bridge Road, SE17EH London, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Ellen J Thompson
Elco Bakker
3Zoe Global Limited, 164 Westminster Bridge Road, London SE1 7RW, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Elco Bakker
M. Jorge Cardoso
2School of Biomedical Engineering & Imaging Sciences, King’s College London, SE1 7EU London UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for M. Jorge Cardoso
Richard Davies
3Zoe Global Limited, 164 Westminster Bridge Road, London SE1 7RW, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Richard Davies
Jonathan Wolf
3Zoe Global Limited, 164 Westminster Bridge Road, London SE1 7RW, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Jonathan Wolf
Tim D Spector
1Department of Twin Research and Genetic Epidemiology, King’s College London, Westminster Bridge Road, SE17EH London, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Tim D Spector
Sebastien Ourselin
2School of Biomedical Engineering & Imaging Sciences, King’s College London, SE1 7EU London UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Sebastien Ourselin
Claire J Steves
1Department of Twin Research and Genetic Epidemiology, King’s College London, Westminster Bridge Road, SE17EH London, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Claire J Steves
Cristina Menni
1Department of Twin Research and Genetic Epidemiology, King’s College London, Westminster Bridge Road, SE17EH London, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Cristina Menni
  • Abstract
  • Full Text
  • Info/History
  • Metrics
  • Data/Code
  • Preview PDF
Loading

ABSTRACT

Understanding the geographical distribution of COVID-19 through the general population is key to the provision of adequate healthcare services. Using self-reported data from 2,266,235 unique GB users of the COVID Symptom Tracker app, we find that COVID-19 prevalence and severity became rapidly distributed across the UK within a month of the WHO declaration of the pandemic, with significant evidence of “urban hot-spots”. We found a geo-social gradient associated with disease severity and prevalence suggesting resources should focus on urban areas and areas of higher deprivation. Our results demonstrate use of self-reported data to inform public health policy and resource allocation.

The COVID-19 epidemic has led to large-scale closures and lockdown measures across the world with the British government sanctioning lockdown from March 23rd 2020.

Early in the pandemic, case distribution did not appear to be evenly spread across countries, with dense urban centres being the most affected1. In tandem, individuals in deprived areas have lower life expectancy2, are more likely to have multiple underlying comorbidities, have a higher level of influenza-associated hospitalisation3, and therefore could be more susceptible to COVID-192. Based on the high COVID-19 prevalence in urban areas, and the known socioeconomic health gradient, we hypothesised that (i) individuals in deprived areas are at greater risk of both being exposed to and (ii) of experiencing adverse outcomes following contraction of COVID-19. Understanding the geographical distribution of the virus in a socioeconomic context is key to assist adequate healthcare resourcing, particularly intensive care beds4.

Here we investigated the GB geographical distribution of COVID-19 using self-reported data from over 2 million users of the COVID-19 Symptom Tracker5 app. We then identify specific geo-social, demographic and clinical factors associated with predicted COVID-19 prevalence and severity.

We studied 2,266,235 unique GB app users reporting daily on COVID-19 symptoms, hospitalisation, RT-PCR test outcomes, demographic information and pre-existing medical conditions over 24 days immediately after major social distancing measures were introduced in the GB (“lockdown”). We computed two proxies of contracting COVID-19: a predicted prevalence score6, (Positive Predicted Value (PPV)= 0.69[0.66; 0.71] and a predicted severity score based on symptoms associated with hospitalisation PPV=0.77(0.64;0.79).

Following aggregation of variables to local authority district level (LAD/geographic unit derived by the Office of National Statistics), we tested the geographical distribution of symptom severity and predicted prevalence using Global and Local Moran’s I tests which assess for non-random spatial distribution and clustering of a feature7.

We further employed linear regression adjusting for age, gender, obesity, co-morbidities and spatial autocorrelations (SAC)8 to assess the association between predicted COVID-19 prevalence and severity and geo-social-health factors at eight time points across 24 days using a seven-day window. Comorbidity and demographic data were included as percentage of respondents by middle super output area (MSOA, a more granular geographic unit).

The descriptive characteristics of the study population across the 8 time points are presented in Table 1. The number of predicted COVID-19 positive individuals ranged between 79,378 and 15,991, while the average severity (on a case 1-100) ranged from 4.16 and 1.47. Using local Moran’s I, we found that predicted COVID-19 prevalence and severity significantly clusters in urban areas across GB when considered as a proportion of the population per LAD (Figure 1 and Figure 1S) adjusting for multiple testing. We also found that predicted prevalence and severity decreased over time, likely as a result of “lockdown” (Figure 1 and Figure 2 a and b) (Pairwise Wilcoxon rank sum tests, Prevalence: all time points accept T2:T3 and T1:T4, P<0.001, Severity all time points P < 0.001). However, some hot-spots remained, suggesting in some areas social distancing measures/compliance may not have been equally effective.

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 1. Demographic characteristics of the study population at 8 time points. At each time point, we only consider users who have made an assessment in the last 7 days.
Figure 1.
  • Download figure
  • Open in new tab
Figure 1. Geographical Distribution of predicted COVID-19 prevalence across eight time points

The map was created using a shapefile of Local Authority District level data from the ONS using the geopandas package and Python. Overlaid over the map are statistically significant ‘hot-spots’ and ‘cold-spots’ at LAD level. To create these regions, a ‘neighbours list was calculated from a shapefile of the LADs using a queen contiguity condition. Spatial weights were calculated assuming equal weight of neighbouring areas, and spatially lagged values which represent the average neighbouring COVID-19 prevalence/severity for each LAD. We adjusted for multiple testing using the Benjamini & Hochberg method and used the ‘spdep’ package for the spatial components of our analysis10. The severity score was calculated using a weighted sum of symptoms a patient had, normalised between 0 and 1. The weights were found by using the frequency of symptoms amongst a training set of 993 respondents who reported visiting hospital for a COVID-19 related issue and having a positive RT- PCR test. Predicted severity= chest pain × 0.115 + severe or significant persistent cough × 0.104 + hoarse voice × 0.097 + skipped meals × 0.096 + loss of smell × 0.092 + severe fatigue × 0.089 + shortness of breath × 0.0855 + delirium × 0.083 + fever × 0.0814 + diarrhoea × 0.081 + abdominal pain × 0.076

Figure 2.
  • Download figure
  • Open in new tab
Figure 2. Predicted COVID- 19 A) prevalence and B) severity in the UK across eight time points.

Results of multivariate linear regression models for each of the time points, where the two COVID measures C) prevalence and D) severity at MSOA level were regressed against geosocial factors: the Index of Multiple Deprivation (IMD, binned into quintiles generated on the average IMD within each MSOA, where 1 is most deprived and 5 is least), a rural-urban gradient (RUC, considered as a continuous variable where 1 is the most urban and 8 is the most rural), General practitioners per population in MSOA (GPs/MSOA, where a higher number indicates more GPs per individual by MSOA), average household number (calculated as number of inhabited dwellings/MSOA population, where a higher number indicates a higher average number of individuals per household) and comorbidity and demographic data derived from app responses representing percentage of respondents by MSOA who reported having kidney, heart or lung disease, and who are diabetic, a smoker or obese (calculated as BMI<30). Finally, we derived age and sex variables to adjust for response bias and considered as demographic factors: these were calculated as the difference of the expected mean/ratio of age/sex in the MSOA (derived from ONS population data) and the observed mean/ratio of age/sex amongst respondents. Therefore, a positive number suggests we had less males/younger respondents than would be expected were our population to be the same as the average population by MSOA, and thus a positive association with COVID prevalence/severity suggests greater association with this reporting bias. A lagged variable was included to account for spatial autocorrelation. Only MSOAs where at least 20 individuals were considered (n = 8097). Standardised coefficients are reported. Analysis conducted in RStudio v1.1.423 and R v3.6.3.

In the more granular analysis, we find that urbanicity, area-level deprivation and average household size were positively associated to higher predicted COVID-19 prevalence and severity (P<0.01) across all time points; with stronger associations for severity than for prevalence (Figure 2 c and d). This suggests that people in deprived and/or urban areas remained at higher risk of more severe symptoms. Moreover, we see a positive trend between NOx pollution and COVID-19 prevalence and to a lesser extent severity.

Finally, as expected, we find the association of obesity, smoking and lung disease to be stronger with predicted COVID-19 severity than prevalence.

Here, we observe that predicted COVID-19 prevalence and severity is significantly higher in urban areas compared to rural, and in more deprived areas compared to less deprived. This could reflect the likelihood of individuals in more deprived areas to work in vocations, or live with people who work in vocations, where they are unable to work from home and are thus more likely to be exposed to circulating COVID-19. Accumulation of socio-environmental exposures across the life course are known to contribute to a greater health deficit and disease burden2; our results suggest that COVID- 19 is no exception.

Moreover, our study illustrates how app data could be used to successfully monitor COVID-19 over time and identify hotspots as the viral pandemic progresses and social distancing measures are implemented or eased. Using this method, we detected a geo-social gradient associated with disease severity and prevalence in the context of COVID-19 suggesting resources should focus on urban areas, areas with highest deprivation, higher average household number and higher air pollution.

Our study has some limitations and assumptions. First, we used self-reported data. Second, volunteers using the app are a self-selected group that may not be fully representative of the general population. Third, our data on COVID-19 incidence is not from confirmed tests via RT-PCR, but rather from two variables predicted based on user responses. Additionally, we assume that people who suffer from COVID-19 are equally likely to use the app as those who do not. We also assume that people will report the symptoms in the same way. Finally, we aggregated data at MSOA level as we did not have enough respondents for more granular geography.

Future work could seek to integrate this data with data on area-level morbidity, extended pollution data and ethnicity. Indeed higher mortality has been observed amongst minority ethnic groups9 and disentangling the environmental and biological factors contributing to greater disease burden in both deprived areas and among ethnic minorities is an essential focus of future work to ensure resources and intervention are better assigned.

Data Availability

Anonymised research data will be shared with third parties via the centre for Health Data Research UK (HDRUK.ac.uk). Data updates can be found on https://covid.joinzoe.com

Conceived and designed the experiments

CJS, TDS, SO, CM;

Analysed the data

RCEB, TV.

Contributed reagents/materials/analysis tools

MF, CHS, BM, MBF, DY, SG, JC, EJT,EB, MJC, RD, JW

Wrote the manuscript

RCEB, TV, CM

Revised the manuscript

all

Competing interests

TDS is a consultant to Zoe Global Ltd (“Zoe”). SG, JC, EB, RD, JW are or have been employees of Zoe Global Limited. Other authors have no conflict of interest to declare.

Data Sharing

Anonymised research data will be shared with third parties via the centre for Health Data Research UK (HDRUK.ac.uk). Data updates can be found on https://covid.joinzoe.com

Ethics

The Ethics for the app has been approved by KCL ethics Committee and all users provided consent for non-commercial use. An informal consultation with TwinsUK members over email and social media prior to the app having been launched found that they were overwhelmingly supportive of the project.

Figure Legends

Figure S1.
  • Download figure
  • Open in new tab
Figure S1. Geographical Distribution of predicted COVID-19 severity across eight time points

Acknowledgements

This work was supported by Zoe Global. The Department of Twin Research is funded by the Wellcome Trust, Medical Research Council, European Union, Chronic Disease Research Foundation (CDRF), Zoe Global Ltd and the National Institute for Health Research (NIHR)-funded BioResource, Clinical Research Facility and Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London. CM is funded by the Chronic Disease Research Foundation and by the MRC Aim-Hy project grant. CHS is an Alzheimer’s Society Junior Fellowship AS-JF-17-011; SO and MJC are funded by the Wellcome/EPSRC Centre for Medical Engineering (WT203148/Z/16/Z), Wellcome Flagship Programme (WT213038/Z/18/Z).

We express our sincere thanks to all the participants of the COVID Symptom Tracker app. We thank the staff of Zoe Global Limited, the Department of Twin Research for their tireless work in contributing to the running of the study and data collection.

References

  1. 1.↵
    Stier A, Berman M, Bettencourt L. COVID-19 Attack Rate Increases with City Size. Preprint at https://papersssrncom/sol3/paperscfm?abstract_id=3564464 2020
  2. 2.↵
    Marmot M. Health equity in England: the Marmot review 10 years on. BMJ 2020;368:m693. doi: 10.1136/bmj.m693 [published Online First: 2020/02/26]
    OpenUrlFREE Full Text
  3. 3.↵
    Hungerford D, Ibarz-Pavon A, Cleary P, et al. Influenza-associated hospitalisation, vaccine uptake and socioeconomic deprivation in an English city region: an ecological study. BMJ Open 2018;8(12):e023275. doi: 10.1136/bmjopen-2018-023275 [published Online First: 2018/12/24]
    OpenUrlAbstract/FREE Full Text
  4. 4.↵
    Blumenshine P, Reingold A, Egerter S, et al. Pandemic influenza planning in the United States from a health disparities perspective. Emerg Infect Dis 2008;14(5):709–15. doi: 10.3201/eid1405.071301 [published Online First: 2008/04/29]
    OpenUrlCrossRefPubMedWeb of Science
  5. 5.↵
    Drew DA, Nguyen L, Steves CJ, et al. Rapid implementation of mobile technology for real-time epidemiology of COVID-19, 2020.
  6. 6.↵
    Menni C, Valdes AM, Freidin MB, et al. Loss of smell and taste in combination with other symptoms is a strong predictor of COVID-19 infection. Preprint at https://wwwmedrxivorg/content/101101/2020040520048421v1 2020
  7. 7.↵
    Rezaeian M, Dunn G, St Leger S, et al. Geographical epidemiology, spatial analysis and geographical information systems: a multidisciplinary glossary. J Epidemiol Community Health 2007;61(2):98–102. doi: 10.1136/jech.2005.043117 [published Online First: 2007/01/20]
    OpenUrlAbstract/FREE Full Text
  8. 8.↵
    Anselin L, Griffith DA. Do spatial effects really matter in regression analysis? Papers - Regional Science Association 1988
  9. 9.↵
    Khunti K, Singh AK, Pareek M, et al. Is ethnicity linked to incidence or outcomes of covid-19? BMJ 2020;369:m1548. doi: 10.1136/bmj.m1548 [published Online First: 2020/04/22]
    OpenUrlFREE Full Text
  10. 10.↵
    Bivand RS, Wong DWS. Comparing implementations of global and local indicators of spatial association. Test-Spain 2018;27(3):716–48. doi: 10.1007/s11749-018-0599-x
    OpenUrlCrossRef
Back to top
PreviousNext
Posted April 27, 2020.
Download PDF
Data/Code
Email

Thank you for your interest in spreading the word about medRxiv.

NOTE: Your email address is requested solely to identify you as the sender of this article.

Enter multiple addresses on separate lines or separate them with commas.
Geo-social gradients in predicted COVID-19 prevalence and severity in Great Britain: results from 2,266,235 users of the COVID-19 Symptoms Tracker app
(Your Name) has forwarded a page to you from medRxiv
(Your Name) thought you would like to see this page from the medRxiv website.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Share
Geo-social gradients in predicted COVID-19 prevalence and severity in Great Britain: results from 2,266,235 users of the COVID-19 Symptoms Tracker app
Ruth C E Bowyer, Thomas Varsavsky, Carole H. Sudre, Benjamin A K Murray, Maxim B Freidin, Darioush Yarand, Sajaysurya Ganesh, Joan Capdevila, Ellen J Thompson, Elco Bakker, M. Jorge Cardoso, Richard Davies, Jonathan Wolf, Tim D Spector, Sebastien Ourselin, Claire J Steves, Cristina Menni
medRxiv 2020.04.23.20076521; doi: https://doi.org/10.1101/2020.04.23.20076521
Reddit logo Twitter logo Facebook logo LinkedIn logo Mendeley logo
Citation Tools
Geo-social gradients in predicted COVID-19 prevalence and severity in Great Britain: results from 2,266,235 users of the COVID-19 Symptoms Tracker app
Ruth C E Bowyer, Thomas Varsavsky, Carole H. Sudre, Benjamin A K Murray, Maxim B Freidin, Darioush Yarand, Sajaysurya Ganesh, Joan Capdevila, Ellen J Thompson, Elco Bakker, M. Jorge Cardoso, Richard Davies, Jonathan Wolf, Tim D Spector, Sebastien Ourselin, Claire J Steves, Cristina Menni
medRxiv 2020.04.23.20076521; doi: https://doi.org/10.1101/2020.04.23.20076521

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
  • Tweet Widget
  • Facebook Like
  • Google Plus One

Subject Area

  • Epidemiology
Subject Areas
All Articles
  • Addiction Medicine (227)
  • Allergy and Immunology (501)
  • Anesthesia (110)
  • Cardiovascular Medicine (1233)
  • Dentistry and Oral Medicine (206)
  • Dermatology (147)
  • Emergency Medicine (282)
  • Endocrinology (including Diabetes Mellitus and Metabolic Disease) (529)
  • Epidemiology (10012)
  • Forensic Medicine (5)
  • Gastroenterology (498)
  • Genetic and Genomic Medicine (2448)
  • Geriatric Medicine (236)
  • Health Economics (479)
  • Health Informatics (1636)
  • Health Policy (751)
  • Health Systems and Quality Improvement (635)
  • Hematology (248)
  • HIV/AIDS (532)
  • Infectious Diseases (except HIV/AIDS) (11860)
  • Intensive Care and Critical Care Medicine (625)
  • Medical Education (252)
  • Medical Ethics (74)
  • Nephrology (268)
  • Neurology (2277)
  • Nursing (139)
  • Nutrition (350)
  • Obstetrics and Gynecology (452)
  • Occupational and Environmental Health (534)
  • Oncology (1245)
  • Ophthalmology (375)
  • Orthopedics (133)
  • Otolaryngology (226)
  • Pain Medicine (155)
  • Palliative Medicine (50)
  • Pathology (324)
  • Pediatrics (729)
  • Pharmacology and Therapeutics (311)
  • Primary Care Research (282)
  • Psychiatry and Clinical Psychology (2280)
  • Public and Global Health (4828)
  • Radiology and Imaging (834)
  • Rehabilitation Medicine and Physical Therapy (490)
  • Respiratory Medicine (650)
  • Rheumatology (283)
  • Sexual and Reproductive Health (237)
  • Sports Medicine (226)
  • Surgery (266)
  • Toxicology (44)
  • Transplantation (125)
  • Urology (99)