Household overcrowding and risk of SARS-CoV-2: analysis of the Virus Watch prospective community cohort study in England and Wales

Background: Household overcrowding is associated with increased risk of infectious diseases across cultures and countries. Limited data exist in England and Wales linking household overcrowding and risk of COVID-19. We used data collected from the Virus Watch cohort to examine the association between overcrowded households and infection to pandemic coronavirus SARS-CoV-2. Methods: The Virus Watch study is a household community cohort of acute respiratory infections in England & Wales that began recruitment in June 2020. We calculated the persons per room for each household and classified accommodation as overcrowded when the number of roomswas fewer than the number of people. We considered two primary outcomes - PCR-confirmed positive SARS-CoV-2 antigen tests and laboratory confirmed SARS-CoV-2 antibodies (Roche Elecsys anti-N total immunoglobulin assay). We used mixed effects logistic regression models that accounted for household structure to estimate the association between household overcrowding and SARS-CoV-2 infection. Results: The proportion of participants with a positive SARS-CoV-2 PCR result was highest in the overcrowded group (6.6%; 73/1,102) and lowest in the under-occupied group (2.9%; 682/23,219). In a mixed effects logistic regression model that included age, sex, ethnicity, household income and geographical region as fixed effects, and a household-level random effect, we found strong evidence of an increased odds of having a positive PCR SARS-CoV-2 antigen result (Odds Ratio 3.67; 95% CI: 1.91, 7.06; p-value < 0.001) and increased odds of having a positive SARS-CoV-2 antigen result in individuals living in overcrowded houses (2.99; 95% CI: 1.14, 7.81; p-value =0.03) compared to people living in under-occupied houses. Discussion: Public health interventions to prevent and stop the spread of SARS-CoV-2 should consider the much greater risk of infection for people living in overcrowded households and pay greater attention to reducing household transmission. There is an urgent need to better recognise housing as a leading determinant of health in the context of a pandemic and beyond.


Introduction
Household overcrowding is associated with increased risk of infectious diseases across cultures and countries. 1 The World Health Organization Housing and Health Guidelines emphasise the health risks of overcrowding and note its complex economic, social and political determinants. 2,3 Several definitions of overcrowding exist and it is also used as an indicator of material deprivation. 4 Measures of overcrowding assess whether there is adequate dwelling space for occupants' needs related to shelter, space and privacy. 1 According to the English Housing Survey, approximately 787,000 (3%) of English households are overcrowded with unequal distribution across regions and social groups. 5 Seven percent of the most deprived households were overcrowded compared with less than half a percent of the least deprived households.
Overcrowding was highest in London compared to all other English regions. White British households are less likely to be overcrowded than households from all other ethnic groups.
There is strong evidence that household overcrowding is associated with risk of infectious diseases such as tuberculosis, 1 and increasing evidence on the association between household overcrowding and COVID-19. In the USA, an ecological analysis found that COVID-19 death rates were higher in the counties with highest percentage of household crowding (16.8 per 100,000) compared to the least crowded areas (4.9 per 100,000). 6 A study using 2011 UK census data examined ethnicity, household composition and COVID-19 mortality and found that elderly adults living with younger people were at increased risk of COVID-19 mortality. 7 The study described the distribution of overcrowded housing within census data used, but it did not adjust for overcrowded housing in the causal mediation analysis as these were considered consequences of living in a multi-generational household rather than confounding factors. A study of UK Biobank data found household size was associated with COVID-19 infection after adjusting for age, sex, deprivation, ethnicity and Body Mass Index (BMI). 8 The REACT study found SARS-CoV-2 antibodies increased from 4.7% for people in single person households to 13% in households of seven or more. 9,10 The OpenSAFELY study found household size accounted for 10-risk of COVID-19 mortality in South Asian groups. 11 In Birmingham (UK), an analysis of data from 408 hospitalised COVID-19 patients found that people from areas of the city with low housing quality, overcrowding and high air pollution were more likely to be admitted to intensive care compared to other areas. 12 Additional evidence on the association between household overcrowding and COVID-19 risk can guide immediate public health mitigation measures to reduce the spread of the SARS-CoV-2 and inform long-term housing policy in the UK. We used data collected from the Virus Watch cohort to examine the association between overcrowded households and either Polymerase Chain Reaction (PCR) confirmed SARS-CoV-2 or antibodies to acquired through SARS-CoV-2 infection. 13

Methods:
The Virus Watch study is a household community cohort of acute respiratory infections in England & Wales that started recruitment in June 2020. 13 As of 3rd May 2021, there were 50,648 participants in Virus Watch that were recruited using a range of methods including post, social media, and SMS messages and letters from their General Practices (Table 1). Participants were followed-up weekly by email with a link to an illness survey which asks about the presence or absence of symptoms that could indicate COVID-19 disease including respiratory, gastrointestinal and general infection symptoms. The weekly survey is also used to capture SARS-CoV-2 test results received from outside the study (eg. via the UK Test-Trace-Isolate system).

Laboratory cohort
Nested within this larger study is a sub-cohort of 10,330 adults (aged over 18) participating in monthly antibody testing who completed at-home capillary blood sampling kits sent via post on a monthly basis, and provided self-reported vaccination data on a weekly basis, in addition to demographic and clinical data at baseline. Individuals were included in this analysis if they underwent antibody testing between 1st February 2021 and 3rd May 2021 and completed the February 2021 monthly survey.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

Monthly survey
The Virus Watch monthly survey includes demographic, psychosocial/behavioral, environmental and health-related questions beyond the scope of the weekly survey.
Data used in this analysis are taken from the Virus Watch third monthly survey that was conducted between 09 February and 16 February 2021 and occurred during the third national lockdown for both England and Wales. In this survey we included a series of questions about participants' housing status adapted from housing-related items in the 2011 England and Wales Census as this the most comprehensive study in terms of housing characteristics collected. 14 These comprised of: accommodation type; whether accommodation was self-contained; number of rooms available for exclusive use by the household, excluding bathrooms, toilets, halls or landings, or rooms that can be used only for storage (e.g., cupboards); number of bedrooms (built or converted for use as bedrooms, even if not currently used as a bedroom); housing tenure and, if rented, details of the rental arrangement; and whether the accommodation had central heating. Additionally, we presented binary (yes/no) questions about whether participants' accommodation had visible mould or fungus and/or damp spots on the walls or ceiling due to their association with respiratory tract infections. 15,16 Housing-related survey questions are presented in full in the supplementary appendix.
Overcrowding measure: To investigate overcrowding, we first calculated the persons per room 4 for each household, which is defined as the number of household occupants divided by the number of rooms, excluding kitchen or bathrooms. We chose this measure because the conceptual link between overcrowding and infectious risk likely relates more to the number of people in the house rather than the family and demographic (e.g. age and sex) relationships between them, such as the bedroom standard which often allows shared bedrooms for each pair of same-sex adolescents (10-20 years) and each pair of same or opposite sex children (under 10 years old). 4 We did not ask participants to exclude kitchens from their reporting of the number of rooms in their accommodation and therefore we have assumed that all accommodation had a kitchen and subtracted one from the total number of rooms reported by each household. The persons per room variable was then categorised as: 1) under-. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 11, 2021. ; occupied accommodation where the number of rooms were greater than the number of people; 2) balanced accommodation where the number of rooms was equal to the number of people; 3) overcrowded accommodation where the number of rooms was fewer than the number of people.

Outcomes:
We considered two primary outcomes. First, PCR-confirmed SARS-CoV-2; either self-reported or conducted as part of the Virus Watch laboratory cohort. This was assessed in an analysis including all Virus Watch participants who completed the February 2021 monthly survey. We assumed that participants not reporting either a positive or negative test result had never been a PCR-confirmed SARS-CoV-2 case.
Second, we examined laboratory confirmed SARS-CoV-2 antibodies acquired through infection, among participants who underwent antibody testing as part of the VirusWatch study and who completed the February 2021 monthly survey. The main outcome variable was evidence of prior infection defined as a cut off index of 0.1 or more on the Roche Elecsys anti-N total immunoglobulin assay that measures seropositive for the Nucleocapsid protein.

Covariates:
Our analysis strategy was informed by conceptual models 17,18 that have previously described the possible pathways between ethnicity and socio-economic status, overcrowding and risk of SARS-CoV-2 infection and we developed a Directed Acyclic Graph to inform covariate selection (Supplementary Appendix Figure S1).
We used this and the rules outlined by VanderWeele as principles of confounder selection. 19 In our primary analysis we considered age, sex, ethnicity, household income and geographical region as covariates (Model A). Age, sex, ethnicity and geographical region were derived from participants' responses to demographic questions at study baseline. Household income was derived from the February 2021 monthly survey.

Statistical Analyses
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 11, 2021. ; We undertook description of the characteristics of included participants. To model the association between the selected covariates and each outcome, we conducted univariable and multivariable analyses using mixed effects logistic regression models with a household-level random effect to account for household-level variation not explained by the covariates using the glmer function in R 4.0.3.

Sensitivity Analyses
We have examined our assumption regarding kitchens in our calculation of overcrowding, by only excluding one room from households with 2 or more rooms in a sensitivity analysis. In addition to the primary mixed effects model, we ran three other separate sensitivity analyses (all accounting for household-level clustering): first a minimal sufficient adjustment set informed by our Directed Acyclic Graph (Model B); second a model with age, sex, households with children (a binary variable representing households with and without children), and number of close contacts outside the household as covariates (repeating an analysis previously reported prior to having occupational category or household income data available (Model C); third a model with age, sex, ethnicity, income and occupation as covariates (Model D).

Results
On 16th February 2021, participants in the Virus Watch study were invited to take part in the monthly survey and we were able to include responses to the survey for 26,394 participants in our primary analyses. The median number of rooms per household was 6 (interquartile range 5-7) and the median number of householders There were 867 participants with a PCR confirmed SARS-CoV-2 antigen test and . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 11, 2021. ; data on their housing status (3.3%; 867/26,394). The proportion of participants with a positive SARS-CoV-2 antigen result was highest in the overcrowded group (6.6%; 73/1,102) and lowest in the under-occupied group (2.9%; 682/23,219; Table 2). In a mixed effects logistic regression model that included age, sex, ethnicity, household income and geographical region as fixed effects and a household-level random  Table 3). The proportion of variation at the household level was 9.91% (standard deviation 3.15).
As of 3rd May 2021 we had SARS-CoV-2 anti-N total immunoglobulin assay antibody results for 10,330 participants in the laboratory cohort with housing data ( Table 2). The proportion of participants with a positive SARS-CoV-2 antibody result was highest in the overcrowded group (20.7%; 41/198) and lowest in the underoccupied group (7.6%; 733/9,605; Table 2). In a mixed effects logistic regression model that included age, sex, ethnicity, household income and geographical region as fixed effects and a household-level random effect, we found evidence of an  (Table 4).
Sensitivity analyses were consistent with the finding that overcrowding was associated with increased risk of PCR and anti-N total immunoglobulin assay antibody confirmed SARS-CoV-2 (Table 5). In a sensitivity analysis where we did not subtract 1 from the number of rooms there was evidence of an increased odds of having a positive SARS-CoV-2 antigen result in balanced (OR: 3.37; 95% CI: 1.52, 7.45; p-value = 0.003) and overcrowded households (OR: 2.53; 95% CI: 0.96, 6.62; p-value = 0.059) compared to those in under-occupied houses.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021. ;

Discussion
We estimate that overcrowded households have between two to four times the odds of PCR confirmed SARS-CoV-2 and anti-N total immunoglobulin assay antibody positivity compared to under-occupied households. This elevated odds of confirmed SARS-CoV-2 remained after we adjusted for differences in demographic and other socio-economic factors between households that may increase risk of being infected.
We also found that people in accommodation considered balanced -where the number of rooms was equal to the number of people -were at increased risk of PCR confirmed SARS-CoV-2 compared to under-occupied houses.
Virus Watch is a large national household community cohort study of the occurrence and risk factors for SARS-CoV-2 infection and is designed to estimate incidence of PCR confirmed SARS-CoV-2 and measure effectiveness and impact of recommended COVID-19 control measures. Individuals in the study are geographically distributed across England and Wales and the cohort is diverse in terms of age, sex, ethnicity and socio-economic composition, and levels of overcrowding were comparable with national estimates for England and Wales.
There were more participants taking part in the February 2021 monthly survey that were over the age of 65 compared to the general population in England and Wales, more people of White British ethnicity, and more people of household size 2. Virus Watch is limited by the fact that only households with a lead householder able to speak English and access the internet were able to take part in the study. 13 An important additional limitation is that only households of up to six people were eligible for inclusion. By not including households of over six people we are likely underestimating the risk associated with overcrowding. Conversely, we cannot rule out residual confounding as a possible alternative explanation for some of the excess risk in the associations we describe.
Access to SARS-CoV-2 antigen testing is socio-economically patterned, with those in more deprived areas having less ability to access tests and less likely to be contact traced. 20 Our data may therefore under-estimate the true infection risk associated with overcrowded housing. To mitigate this bias in access to antigen . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021. ; testing, we analysed antibody data from the laboratory cohort where home test kits were provided to all adult participants.
Our findings highlight the importance of public health interventions to prevent and stop the spread of SARS-CoV-2 considering the much greater risk of being infected for people living in overcrowded households. The pathways between overcrowding and increased risk of infection are not fully understood. 10 Overcrowded households are more likely to be found in socioeconomically deprived and urban areas, and higher levels of COVID-19 in these communities include higher likelihood of working in public-facing and essential occupations, lower likelihood of working from home during lockdowns, constraints to self-isolation when ill or in contact with a case (both financial and due to lack of suitable space within an overcrowded house), increased number of household members sharing of common spaces and facilities.
Overcrowded households are also more to suffer from environmental exposures that have been linked to COVID-19 including air pollution. 12 Overcrowding is an important risk through which COVID-19 inequalities manifest. To address social, ethnic and regional inequalities in COVID-19 outcomes, public health responses should explicitly address overcrowded housing. Infectious cases in overcrowded houses are less likely to be able to isolate from other household members or avoid using shared spaces including bathrooms. However, existing advice on the importance of infectious cases using masks and ventilation through opening of windows should be supported by tailored communications strategies. 21 Schemes that provide hotel accommodation for cases with vulnerable residents in overcrowded households, such as those used internationally 22,23 and recently introduced in the London Borough of Newham 24 , should be considered a priority public health intervention. Such community quarantine and isolation options require clear, inclusive guidance with specific advice for large households and multigenerational families. 10 Although the UK does not target COVID-19 vaccine according to social factors, our results highlight the need to ensure high vaccine coverage in areas with high levels of overcrowding. There is currently evidence of lower COVID-19 vaccination rates in such areas, 25 highlighting the importance of working closely with local community groups to increase vaccination intention and ensuring vaccination is highly accessible.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021. ; Improving the standard and supply of housing in the UK is important for the COVID-19 recovery as 32% of all households are estimated to experience overcrowding, affordability challenges or non-decent housing that influence their risk. 26 In relation to these challenges, a paper by the ethnicity sub-group of the Scientific Advisory Group for Emergencies (SAGE) recommends the following measures to reduce the risk of transmission within households: provision of emergency grants for repair and maintenance of social and private rental housing, particularly to increase ventilation; removing the benefit cap and welfare restrictions, particularly in high housing cost areas and for immigrant families; reviewing implementation of the bedroom tax for those in multi-person households; and investment in affordable childcare and alternative community spaces for social connection, particularly for the elderly. 10 Addressing England's overcrowding challenge is complex and requires action from multiple government and private sector stakeholders. A recent House of Commons Library briefing report has highlighted actions related to the following: revising the statutory housing standard, increasing the housing supply and size of homes/rooms, and measures to reduce under-occupation in social housing. 27 Breaching the statutory overcrowding standard is a criminal offence, yet the threshold is considered to be too low such that very few homes are statutorily overcrowded, reducing local authorities' ability to take action on this problem. 27 Another significant challenge is that housing supply has fallen short of demand for decades. 28 There are a wide range of factors affecting housing supply (and therefore also influencing affordability and overcrowding in England) including: construction capacity, planning regulation, public opposition to new development, overseas buyers, overreliance on the private sector and underuse of social approaches to housing development, among other factors. 29 The size of English homes may also be an important factor as they are among the smallest in Europe 30 and the Nationally Described Space Standard 31 for new homes is only mandatory if implemented through local planning policies following tests for need and viability. Finally, although there are under-occupied and empty homes in England, these are not suitable to solve the housing supply problem due to their location, ownership and other factors. 32 Housing is an important determinant of health for many physical and mental health conditions and COVID-19 has exposed, in real-time, vulnerabilities in the English . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021. ; housing stock, particularly for low-income and minority ethnic populations. Looking forward, the country's health and resilience to future pandemics and other risks, such as climate change 33,34 , are dependent on the state of the housing stock. This analysis of overcrowding and COVID-19 underscores the need to improve housing for health. As part of the agenda to Build Back Better 35 and fairer [36][37][38] , investment in sustainable, high-quality and more affordable housing will support health, jobs and the wider economic recovery so that we are prepared for the key challenges of the 21st century.

Data availability
We aim to share aggregate data from this project on our website and via a "Findings so far" section on our website -https://ucl-virus-watch.net/. We will also be sharing individual record level data on a research data sharing service such as the Office of National Statistics Secure Research Service. In sharing the data we will work within the principles set out in the UKRI Guidance on best practice in the management of research data. Access to use of the data whilst research is being conducted will be managed by the Chief Investigators (ACH and RWA) in accordance with the principles set out in the UKRI guidance on best practice in the management of research data. We will put analysis code on publicly available repositories to enable their reuse.

Funding:
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021. ; . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021 is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021. ; Table 2. Description of housing characteristics for Virus Watch participants . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021. ; Table 3. Univariable and multivariable mixed effects logistic regression models examining the association between household overcrowding and SARS-CoV-2 PCR test results. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021. ;  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 11, 2021 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 11, 2021. ; https://doi.org/10.1101/2021.05.10.21256912 doi: medRxiv preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021.
Model C: age, sex, households with children, and number of close contacts outside the household as covariates (repeating a analysis previously reported prior to having occupational category or household income data available). Model D: sex, ethnicity income and occupation as covariates.
Model E: age, sex, ethnicity, household income and geographical region included as covariates with overcrowding exposure recalculated by only excluding one room from households with 2 or more room. .

CC-BY 4.0 International license
It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted May 11, 2021.