Abstract
Background Fit notes (“sick notes”) are issued by general practitioners (GPs) when a person can’t work for health reasons and is an indication of the public health and economic burden for people recovering from COVID-19.
Methods With NHS England approval, we used routine clinical data from >24 million patients to compare fit note incidence in people 18-64 years with and without evidence of COVID-19 in 2020, 2021 and 2022. We fit Cox regression models to estimate adjusted hazard ratios, overall and by time post-diagnosis and within demographic subgroups.
Results We identified 365,421, 1,206,555 and 1,321,313 people with evidence of COVID-19 in 2020, 2021 and 2022. The fit note rate was 4.88 per 100 person-months (95%CI 4.83-4.93) in 2020, 2.66 (95%CI 2.64-2.67) in 2021, and 1.73 (95%CI 1.72-1.73) in 2022. Compared with the age, sex and region matched general population, the hazard ratio (HR) adjusted for demographics and clinical characteristics over the follow-up period was 4.07 (95%CI 4.02-4.12) in 2020 decreasing to 1.57 (95%CI 1.56-1.58) in 2022. The HR was highest in the first 30 days post-diagnosis in all years.
Conclusions Despite likely underestimation of the fit note rate, we identified a considerable increase among people with COVID-19, even in an era when most people are vaccinated. Most fit notes are associated with the acute phase of the disease, but the increased risk several months post-diagnosis provides further evidence of the long-term impact.
Evidence before this study We searched Pubmed from 1 March 2020 to 30 June 2023 using the following search terms: (“COVID-19” OR “SARS-CoV-2” OR “coronavirus”) AND (“United Kingdom” OR “England” OR “Britain” OR “Scotland” OR “Wales”) AND (“fit note” OR “sick note” OR “sick leave” OR “sickness absence”). We also searched the reference list of relevant articles. We included both peer-reviewed research studies and grey literature that quantified receipt of fit notes or sick leave during the COVID-19 pandemic.
We found two peer-reviewed studies and one briefing by an independent think tank. A study of 959,356 National Health Service (NHS) employees in England quantified receipt of non-COVID-19 related fit notes during the first wave of the pandemic. They found that the overall fit note rate was lower in 2020 compared with 2019. However, increases in the number of people receiving fit notes were observed for respiratory, infectious disease, and mental health conditions. The second study of 15,931 domiciliary care workers in Wales between Mar 2020 and Nov 2021 found that 15% had been issued a fit note over the study period. Fit notes were more common among women, people ≥45 years, and those with comorbidities. The briefing found that the percentage of sickness absence days taken by NHS employees was higher in 2022 (5.6%) compared with 2019 (4.3%), with a particular increase in absences due to mental health and infectious diseases. In 2022, 18% of sickness absence days were attributable to COVID-19.
Added value of this study This study is the first to quantify changes in fit note rate since the start of the COVID-19 pandemic among people with a reported SARS-CoV-2 infection and how this compares with the general population in the UK. We found that people with evidence of SARS-CoV-2 infection had a higher fit note rate than the general population, even after adjusting for demographics and clinical characteristics. While this increased risk was greatest in 2020 (hazard ratio [HR] = 4.07, 95%CI 4.02-4.12), it continued to a lesser extent even into 2022 (HR = 1.57, 95%CI 1.56-1.58). The fit note rate was greatest in the first 30 days post-diagnosis, suggesting that most sick leave is associated with the acute phase. In subgroup analyses, the groups with the greatest relative increased risk changed over the years. People aged 18-24 years had a larger relative increased risk of fit notes (as measured by HR) in 2022 than 2021, when compared with the general population in each year. Additionally, while in 2020 and 2021 the HR increased along with lessening deprivation, this effect dissipated in 2022. In contrast, people hospitalised with COVID-19 were less likely to be issued a fit note than the pneumonia cohort, suggesting the long-term effects may be similar to comparable severe respiratory infections cases resulting in hospitalisation.
Implications of all the available evidence While we have likely underestimated the fit note rate due to overcounting of people in the workforce and misclassification of COVID-19 status, we still identified a substantial increased risk of receiving a fit note in people with COVID-19 compared with the general population over all years, even after adjusting for demographics and a wide range of clinical characteristics. The increased risk persisted into 2022, in an era where most people are vaccinated and the severity of COVID-19 illness is lessened. Given the high infection rates still occurring, these findings provide evidence for a substantial impact of COVID-19 on productivity and further evidence of the long-term impacts of COVID-19.
Background
In primary care, a doctor may issue a fit note (also called “sick note”) after the first seven days of sickness absence if the doctor assesses that the patient’s health affects their fitness for work. In 2021-22, over 11 million fit notes were issued in England(1). Long-term sickness absence from employment has negative consequences for the economy as well as individuals and can lead to widened health inequalities, financial insecurity, and reduced social participation.(2) Improving the health and productivity of the population and reducing welfare benefit claims are important policy objectives.(3,4)
The Office of National Statistics (ONS) estimated that 82% of the English population had been infected with SARS-CoV-2 by November 2022.(5,6) Additionally, an estimated 1.9 million people self-reported as experiencing long COVID symptoms in the UK in the four weeks prior to March 2023, with two-thirds experiencing symptoms for at least one year.(7) The prevalence of long-term symptoms was greatest in people aged 35-69 years, women, and people living in more deprived areas.(7) Previous UK research has also shown that receipt of fit notes varies by sociodemographics, with women and people in manual and service occupations more likely to be issued a fit note.(4) The most common reasons for receiving a fit note prior to the COVID-19 pandemic included mental health and musculoskeletal disorders.(1,8) To date, there has been limited research on fit notes issued to patients recovering from COVID-19 in England.
Given the risk of long-term symptoms following COVID-19 (“long COVID”), it is of public health interest to quantify the impact of SARS-CoV-2 infection and recovery on the workforce. However, long COVID is heterogeneous and difficult to define.(9,10) Furthermore, coding of long COVID in primary care is very low,(11) and cannot be relied on to identify people suffering from persistent symptoms. Therefore, our objectives were to: 1) describe the demographic and clinical characteristics of people given a fit note following a documented SARS-CoV-2 infection or COVID-19 diagnosis; 2) determine how the fit note rate varies over time post-diagnosis; and 3) quantify the difference in fit note rate in people with SARS-CoV-2 infection or COVID-19 diagnosis compared with the general population and people hospitalised with pneumonia.
Methods
Study design
We conducted an observational cohort study using general practice primary care electronic health record (EHR) data from primary care practices in England.
Data source and data sharing
We used primary care data from approximately 40% of the English population currently registered with GP surgeries using TPP SystmOne software. All data were linked, stored and analysed securely using the OpenSAFELY platform, https://www.opensafely.org/, as part of the NHS England OpenSAFELY COVID-19 service. Data include pseudonymised data such as coded diagnoses, medications and physiological parameters. No free text data are included. All code is shared openly for review and re-use under MIT open license (https://github.com/opensafely/long-covid-sick-notes). Similarly, pseudonymised datasets including ONS registered deaths, hospital episode statistics (HES), and Second Generation Surveillance System (SGSS) COVID-19 test results are securely provided to TPP and linked to primary care data. Detailed pseudonymised patient data are potentially re-identifiable and not shared.
Study population
We included adults 18-64 years registered with one GP for at least one year prior to their index date with information on age, sex, index of multiple deprivation (IMD) and the Sustainability and Transformation Partnership region (STP, an NHS administrative region). This age range was selected to represent people most likely to be in the workforce. From this source population, we identified three cohorts with recorded SARS-CoV-2 infection between 1 February and 30 November in each of 2020, 2021 and 2022 (hereafter referred to as the “COVID-19 cohorts”). They were identified through three routes: having a recorded positive test for SARS-CoV-2 based on SGSS data, which captures both polymerase chain reaction (PCR) and lateral flow tests (LFT); having a probable diagnostic code for COVID-19 in primary care records; or being hospitalised with a primary or secondary diagnosis for COVID-19 (ICD-10 codes U07.1 or U07.2) identified from HES. The index date was the earliest of these events. As comparators, we identified contemporary (2020, 2021, 2022) and one historical (2019) general population cohort who were frequency-matched on age, sex and STP to the COVID-19 cohorts. The 2019 general population cohort was matched with the 2020 COVID-19 cohort. For the comparator cohorts, the index date was randomly assigned and randomly distributed over the study period.
We also performed a secondary analysis among the subset of people hospitalised with COVID-19. Here, the index date was the date of admission. We compared these individuals with people hospitalised with pneumonia between 1 February 2019 and 30 November 2019. The pneumonia cohort was identified using the following ICD-10 codes identified in any diagnosis position: B01.2, B05.2, B20.6, B25.0, J10-J18, J85.1, U04. No contemporary pneumonia comparator cohort was included due to potential for misclassification with COVID-19. People could be included in multiple cohorts if they met the inclusion criteria.
Study measures
Primary outcome
We identified the first recorded fit note over patient follow-up. Fit notes were identified using Clinical Terms Version 3 (CTV3) codes. People who were required to isolate beyond 7 days due to having or living with someone with symptoms of COVID-19 would be issued an isolation note, rather than a fit note. Isolation notes did not require contact with a GP, and are not counted. Follow-up was censored at the end of the study period (30 November), death, or deregistration from their GP practice. The comparator cohorts were also censored if they had a recorded positive SARS-CoV-2 test or COVID-19 diagnosis.
Demographics and clinical characteristics
We characterised the demographic and clinical characteristics of patients who received a fit note. Measured demographic variables included age (18-24, 25-34, 35-44, 45-54, 55-64 years), sex (male, female), GP practice region (East Midlands, East, London, North East, North West, South East, South West, West Midlands, Yorkshire and The Humber), ethnicity (White, Asian and British Asian, Black, Other, Unknown), and Indices of Multiple Deprivation (IMD) quintiles.
Clinical variables included indicators for pre-existing conditions that may impact the severity of COVID-19, specifically: asthma, cancer (haematological, lung, other), chronic cardiac disease, chronic liver disease, chronic respiratory disease (excluding asthma), diabetes, asplenia, human immunodeficiency virus (HIV) infection, hypertension, obesity, organ transplant, neurological conditions, other permanent immunodeficiency (excluding HIV), smoking status (current, former or never), and autoimmune conditions (rheumatoid arthritis, systemic lupus erythematosus (SLE), psoriasis). Obesity and smoking status were identified using the most recent recorded information, while the remainder were identified at any time prior to the index date.
Statistical methods
Rate of first fit note
For each of the eleven cohorts (three COVID-19 cohorts [2020, 2021, 2022], four matched general population cohorts [2019, 2020, 2021, 2022], three hospitalised COVID-19 cohorts [2020, 2021, 2022] and one hospitalised pneumonia cohort [2019]), we reported the first fit note rate per 100 person months overall and stratified by demographics (age, sex, ethnicity, region, IMD quintile). To prevent disclosure, all counts in this manuscript are rounded to the nearest 7.
Cox regression
We used Cox regression models to estimate hazard ratios (HRs) and 95%Confidence intervals (CIs) comparing the first fit note rate between the COVID-19 cohorts and the comparator cohorts. All of the COVID-19 cohorts were compared with their contemporary general population cohort as well as the historical 2019 general population cohort. The hospitalised COVID-19 cohorts were compared with the hospitalised pneumonia cohort (2019) only.
We investigated crude univariable and two covariate-adjusted models: (1) adjusted for age (cubic splines with 4 knots) and sex; (2) adjusted for demographics and all clinical characteristics described above. While the same people can contribute person-time to multiple exposure groups, these periods are non-overlapping, so we applied robust standard errors.
To determine if the relative differences in fit note occurrence changed over time, we re-estimated crude and adjusted HRs and 95%CIs censoring follow-up at 30, 90, and 150 days, and examined whether HRs changed according to the duration of follow-up. We chose to use this approach instead of estimating period-specific HRs in order to avoid selection bias due to depletion of susceptibles.(12) Similarly, to explore the impact within demographic categories, we estimated crude and adjusted HRs and 95%CIs stratified by age group, sex, ethnicity, IMD quintile, and region.
Software and reproducibility
Data management was performed using Python 3.8, with analyses carried out using Stata 16.1, R 4.3.0, and Python. Code for data management and analysis as well as all codelists used in this study are available online: https://github.com/opensafely/long-covid-sick-notes.
Patient and public involvement
We have a publicly available website https://opensafely.org/ through which we invite any patient or member of the public to contact us regarding this study or the broader OpenSAFELY project.
Results
Study population description
We identified 365,421 people with a recorded positive SARS-CoV-2 test or COVID-19 diagnosis in 2020, of which 22,015 (6.0%) were hospitalised; 1,206,555 people in 2021 (30,205 [2.5%] hospitalised); and 1,321,313 people in 2022 (34,692 [2.6%] hospitalised) (Table 1). The matched general population cohorts included 3.1 million people in 2019, 3.4 million in 2020, 4.6 million in 2021, and 4.8 million in 2022. For comparison with COVID-19 patients who were hospitalised, we identified 29,673 patients hospitalised with pneumonia in 2019. The majority of people in the COVID-19 cohorts were identified via SGSS testing (88.8% in 2020, 97.8% in 2021 and 96.5% in 2022) (Supplementary Figure 1).
Compared with 2020 and 2021, the 2022 COVID-19 cohort was older and more likely to be female, White ethnicity, and in less deprived IMD quintiles. Hospitalised COVID-19 cohorts tended to be older, and were more likely to be male, non-White ethnicity, and in the most deprived IMD quintile compared with the overall COVID-19 cohorts (Supplementary Table 1). When compared with the pneumonia cohort, hospitalised COVID-19 patients were more likely to be obese,less likely to be current smokers, and had lower rates of many chronic health conditions.
Overall fit note rates
A total of 34,377 (9.4%), 102,949 (8.5%) and 152,859 (11.6%) of people in the COVID-19 cohorts were issued fit notes during follow-up in 2020, 2021 and 2022 (Supplementary Table 2). However, after taking into account differences in follow-up time (Supplementary Figure 2), the fit note rate among the COVID-19 cohorts decreased over time, from 4.88 per 100 person-months in 2020 (95%CI 4.83-4.93), to 2.66 (95%CI 2.64-2.67) in 2021 and 1.73 (95%CI 1.72-1.73) in 2022 (Table 2).
The fit note rate was higher in the hospitalised cohorts, and was 6.78 per 100 person-months (95%CI 6.59-6.98) in 2020, 7.19 (95%CI 7.03 - 7.36) in 2021 and 4.13 (95%CI 4.03-4.22) in 2022. In contrast, the fit note rate in the pneumonia cohort was 7.17 (95%CI 7.01-7.34) (Supplementary Table 3).
Fit note rate by demographics
Among people in the COVID-19 cohorts, the first fit note rate was higher in 2020 and lower in 2022 for all demographic groups except people 18-24 years (Table 2). Generally, the fit note rate was higher in people aged ≥45 years, women, people of Asian or Asian British ethnicity, and lower in people 18-24 years and living in London. The fit note rate increased with greater deprivation as defined by the IMD quintile. In most cases these patterns reflected those in the general population.
A slightly different pattern was observed for the hospitalised cohorts (Supplementary Table 3). People 45-54 years were most likely to receive a fit note, which was older than the pneumonia cohort. Among hospitalised patients the relationship between sex and receiving a fit note differed by year, with women more likely to receive a fit note in 2020 and men more likely in 2021 and 2022. A linear relationship between lesser deprivation and a higher fit note rate was seen for the pneumonia cohort, but not the hospitalised COVID-19 cohorts.
Cox regression by follow-up period
The overall fit note rate was higher in all COVID-19 cohorts than either their contemporary or 2019 cohorts. Most people in the COVID-19 cohorts who were issued a first fit note received it in the first 30 days post-index date (Table 3). The fully adjusted HR representing the average effect over the entire study period was highest comparing the 2020 COVID-19 cohort to the 2020 general population (4.07, 95%CI 4.02-4.12) and lowest comparing the 2022 COVID-19 cohort to the 2022 general population (1.57, 95%CI 1.56-1.58) (Table 3). For all comparisons of the COVID-19 population with the general population, the HR was greatest when we restricted the analysis to the first 30 days from the index date, with fully adjusted HRs in this early follow-up period ranging from 5.94 (95%CI 5.85-6.03) comparing the 2020 COVID-19 cohort to the 2020 general population to 2.37 (95%CI 2.34-2.39) comparing the 2022 COVID-19 cohort to the 2022 general population. With longer follow-up periods, the HR attenuated but remained high.
For the hospitalised cohorts, the overall fit note rate was lower in the COVID-19 cohorts than the pneumonia cohort with the highest rates in the first 30 days. The average HR over all follow-up time ranged from 0.62 (95%CI 0.59-0.65) for the 2022 COVID-19 cohort to 0.81 (95%CI 0.77-0.86) for the 2021 cohort (Supplementary Table 4). The HR was relatively stable regardless of follow-up period used.
Cox regression stratified by demographic categories
For all demographics groups, the HR was greatest in 2020 when compared with either a contemporary or 2019 historical comparator (Figure 1, Supplementary Tables 5-9). In 2020, large relative increases in fit note rate were observed for people ≥35 years, women, and people of Black or Other ethnicity. The HR was also higher in people in less deprived IMD quintiles; when comparing the 2020 COVID-19 cohort to the 2020 general population, the fully adjusted HR ranged from 3.41 (95%CI 3.33-3.48) in the most deprived quintile to 5.15 (95%CI 4.98-5.32) in the least deprived quintile.
These HRs were lower in 2021 compared with 2020, and more so in 2022. The one exception was people 18-24 years, where the HR compared with the general population was 1.47 (95%CI 1.43-1.51) in 2022, compared with 1.24 (95%CI 1.22-1.27) in 2021. Other patterns that differed in 2022 compared with other years was that no meaningful variation by IMD quintile was observed, and a higher HR was seen in men compared with women. The findings were similar comparing the COVID-19 cohorts with the 2019 comparator.
Among the hospitalised cohorts, the HRs comparing them with the pneumonia cohort were similar in 2020 and 2021, but lower in 2022 (Supplementary Figure 3, Supplementary Tables 4-8). A lower fit note rate in the COVID-19 cohorts compared with the pneumonia cohort was observed for most demographic subgroups
Discussion
Summary
We found that people with a recorded positive SARS-CoV-2 test or COVID-19 diagnosis had a higher fit note rate than the general population, even after adjusting for demographics and a wide range of clinical characteristics. This increase was greatest in 2020 but continued even into 2022 despite the introduction of vaccines and other effective outpatient COVID-19 treatments. The fit note rate was highest in the first 30 days post-diagnosis, but an increased risk persisted over all follow-up in all years. These findings provide further evidence for the long-term health and economic impact of COVID-19. In contrast, people hospitalised with COVID-19 were less likely to be issued a fit note than people with pneumonia, especially in 2022, suggesting the long-term effects are not worse than comparable serious respiratory infections requiring hospitalisation.
Strengths and weaknesses
Our data come from a representative sample of 40% of the English population.(13) To better understand fit note patterns, we used COVID-19 cohorts from three different pandemic time periods, with both contemporary and historical comparators, to account for changes in preventive measures (e.g. lockdowns, vaccines) and other disruptions to working patterns (e.g. furlough) that would have impacted on receipt of fit notes over time.
However, there is bias in who gets tested or seeks care for COVID-19 which will have changed as the pandemic evolved,(14) especially after cessation of free testing in April 2022. It is unclear how this would correlate with the likelihood of requesting or requiring a fit note. To mitigate this, we relied on three different methods for identifying COVID-19 cases (positive PCR or LFT test, primary care diagnosis, hospitalisation), but many mild or asymptomatic cases, and cases among people choosing not to get tested or not to record their LFT result will be missed.
We also could not identify whether people were participating in the workforce and included everyone of working age. From 2019 to 2022, the employment rate was 62% in people 18-24 yrs, 85% in people 25-49 years, and 72% in people 50-64 years.(15) It is therefore likely that the true absolute fit note rate will be substantially attenuated, especially among the younger age groups. However, this will only impact on the relative effects if the employment rate differs substantially between the COVID-19 cohorts and general population.
Findings in context
The pandemic has evolved over time, both in the number of cases, disease severity, and demographic groups most affected. The introduction of vaccination programmes and differences in circulating variants lead to changes in severity and transmissibility of SARS-CoV-2.(16) Additionally, the ending of lockdowns and other preventive measures, cessation of free testing, and differential uptake of vaccines by demographic and clinical subgroups will have impacted on the characteristics of people testing positive for SARS-CoV-2.(17) Consistent with previously reported data,(5,18) we observed that the number of people with evidence of a positive SARS-CoV-2 test was much greater in 2021 and 2022 compared with 2020. Thus, while the fit note rate was nearly twice as high in 2020, the actual number of people issued a fit note was 4.5 times higher in 2022.
In contrast, the fit note rate was lower in the general population in 2020, compared with other years. This is consistent with a previous study of NHS workers which identified a decrease in fit notes for non-COVID related episodes in 2020.(19) There are likely to be several explanations such as changes to working patterns, including furlough and remote working, which would have reduced the need for fit notes. There were also fewer circulating non-COVID-19 respiratory infections in 2020 due to public health measures.(20) It is also important to note that people who were required to isolate due to COVID-19 would be issued an isolation note, rather than a fit note and thus are not counted.
Policy implications and interpretation
Now several years into the pandemic, vaccines have led to reduced severity of COVID-19, but the number of daily cases remains high,(17) potentially putting many people at risk of long-term symptoms. The incidence of repeat infections is also increasing;(21) while a repeat infection appears to lead to milder acute symptoms,(22,23) other studies have found that the long-term symptom burden increases with the number of reinfections.(24,25) In our study, we did not try and link the reason for the fit note to COVID-19. However, most studies of long-term effects have focussed on surveys and self-report, which can be unrepresentative(26–28) and long COVID is not well coded in electronic health records.(11) Thus, our findings provide a different perspective on the potential long-term consequences of COVID-19 that does not rely on accurate coding of COVID-19 related symptoms.
Future research
A previous study identified that while the fit note rate decreased overall in 2020, fit notes associated with certain diagnoses, specifically asthma, respiratory conditions, and mental health increased during the COVID-19 pandemic.(19) Similarly, a study of NHS workers found that sickness absences due to psychiatric illness was higher in 2022 compared with pre-pandemic.(29) Further investigation into the duration of fit notes, as well as the diagnoses associated with the issued fit note, whether these changed over time from diagnosis, and how these compare to fit notes issued to people without COVID-19 will help shed light on the most common long-term symptoms experienced.
Conclusion
Despite undercapture of people with COVID-19 and overestimation of the number of people in the workforce, we have identified a considerable increased risk of fit notes among people with COVID-19 compared with the general population. This extends into an era of high vaccination rates and when the severity of illness from COVID-19 is decreasing. The majority of fit notes were issued within the first 30 days post-diagnosis, suggesting that most COVID-19 related sick leave is associated with the acute phase of the disease. However, a persistent increased risk up to 10 months after the illness demonstrates the ongoing health and economic impact of COVID-19.
Data Availability
All data were linked, stored and analysed securely using the OpenSAFELY platform, https://www.opensafely.org/, as part of the NHS England OpenSAFELY COVID-19 service. Data include pseudonymised data such as coded diagnoses, medications and physiological parameters. No free text data are included. All code is shared openly for review and re-use under MIT open license (https://github.com/opensafely/long-covid-sick-notes). Detailed pseudonymised patient data are potentially re-identifiable and therefore not shared.
Ethics approval
This study was approved by the Health Research Authority (Research Ethics Committee reference 20/LO/0651) and by the London School of Hygiene and Tropical Medicine Ethics Board (reference 21863).
Members of OpenSAFELY Collaborative
Sebastian CJ Bacon, Lucy Bridges, Benjamin FC Butler-Cole, Simon Davy, Iain Dillingham, David Evans, Ben Goldacre, Liam Hart, George Hickman, Peter Inglesby, Steven Maude, Jessica Morley, Amir Mehrkar, Thomas O’Dwyer, Rebecca M Smith, Tom Ward, Jon Massey, Milan Wiedemann, Christopher Bates, Jonathan Cockburn, Sam Harper, Frank Hester, John Parry
Members of National Core Studies Collaborative
Nishi Chaturvedi, Chloe Park, Alisia Carnemolla, Dylan Williams, Anika Knueppel, Andy Boyd, Emma L Turner, Katharine M Evans, Richard Thomas, Samantha Berman, Stela McLachlan, Matthew Crane, Rebecca Whitehorn, Jacqui Oakley, Diane Foster, Hannah Woodward, Kirsteen C Campbell, Nicholas Timpson, Alex Kwong, Ana Goncalves Soares, Gareth Griffith, Renin Toms, Louise Jones, Annie Herbert, Ruth Mitchell, Tom Palmer, Jonathan Sterne, Venexia Walker, Lizzie Huntley, Laura Fox, Rachel Denholm, Rochelle Knight, Kate, Northstone, Arun, Kanagaratnam, Elsie Horne, Harriet Forbes, Teri North, Kurt, Taylor, Marwa AL Arab, Scott Walker, Jose IC Coronado, Arun S Karthikeyan, George, Ploubidis, Bettina Moltrecht, Charlotte Booth, Sam Parsons, Bozena Wielgoszewska, Charis Bridger-Staatz, Claire Steves, Ellen Thompson, Paz Garcia, Nathan Cheetham, Ruth Bowyer, Maxim Freydin, Amy Roberts, Ben Goldacre, Alex Walker, Jess Morley, William Hulme, Linda Nab, Louis Fisher, Brian MacKenna, Colm Andrews, Helen Curtis, Lisa Hopcroft, Amelia Green, Praveetha Patalay, Jane Maddock, Kishan Patel, Jean Stafford, Wels Jacques, Kate Tilling, John Macleod, Eoin McElroy, Anoop Shah Richard Silverwood, Spiros Denaxas, Robin Flaig, Daniel McCartney, Archie Campbell, Laurie Tomlinson, John Tazare, Bang Zheng, Liam Smeeth, Emily Herrett, Thomas Cowling, Kate Mansfield, Ruth E Costello, Kevin Wang, Kathryn Mansfield, Viyaasan Mahalingasivam, Ian Douglas, Sinead Langan, Sinead Brophy, Michael Parker, Jonathan Kennedy, Rosie McEachan, John Wright, Kathryn Willan, Ellena Badrick, Gillian Santorelli, Tiffany Yang, Bo Hou, Andrew Steptoe, Giorgio Di Gessa, Jingmin Zhu, Paola Zaninotto, Angela Wood, Genevieve Cezard, Samantha Ip, Tom Bolton, Alexia Sampri, Elena Rafeti, Fatima Almaghrabi, Aziz Sheikh, Syed A Shah, Vittal Katikireddi, Richard Shaw, Olivia Hamilton, Michael Green, Theocharis Kromydas, Daniel Kopasker, Felix Greaves, Robert Willans, Fiona Glen, Steve Sharp, Alun Hughes, Andrew Wong, Lee Hamill Howes, Alicja Rapala, Lidia Nigrelli, Fintan McArdle, Chelsea Beckford, Betty Raman, Richard Dobson, Amos Folarin, Callum Stewart, Yatharth Ranjan, Jd Carpentieri, Laura Sheard, Chao Fang, Sarah Baz, Andy Gibson, John Kellas, Stefan Neubauer, Stefan Piechnik, Elena Lukaschuk, Laura C Saunders, James M Wild, Stephen Smith, Peter Jezzard, Elizabeth Tunnicliffe, Zeena-Britt Sanders, Lucy Finnigan, Vanessa Ferreira, Mark Green, Rebecca Rhead, Milla Kibble, Yinghui Wei, Agnieszka Lemanska, Francisco Perez-Reche, Dominik Piehlmaier, Lucy Teece, Edward Parker
Administrative
Conflicts of interest
Over the past five years BG has received research funding from the Laura and John Arnold Foundation, the NHS National Institute for Health Research (NIHR), the NIHR School of Primary Care Research, the NIHR Oxford Biomedical Research Centre, the Mohn-Westlake Foundation, NIHR Applied Research Collaboration Oxford and Thames Valley, the Wellcome Trust, the Good Thinking Foundation, Health Data Research UK (HDRUK), the Health Foundation, and the World Health Organisation; he also receives personal income from speaking and writing for lay audiences on the misuse of science. CB is an employee of TPP. BMK is also employed by NHS England working on medicines policy and clinical lead for primary care medicines data.
Funding
The OpenSAFELY Platform is supported by grants from the Wellcome Trust (222097/Z/20/Z) and MRC (MR/V015757/1, MC_PC-20059, MR/W016729/1). In addition, development of OpenSAFELY has been funded by the Longitudinal Health and Wellbeing strand of the National Core Studies programme (MC_PC_20030: MC_PC_20059), The NIHR funded CONVALESCENCE programme (COV-LT-0009), NIHR (NIHR135559, COV-LT2-0073), and the Data and Connectivity National Core Study funded by UK Research and Innovation (MC_PC_20058), and Health Data Research UK (HDRUK2021.000, 2021.0157).
The views expressed are those of the authors and not necessarily those of the NIHR, NHS England, Public Health England or the Department of Health and Social Care. Funders had no role in the study design, collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the article for publication.
Information governance and ethical approval
NHS England is the data controller of the NHS England OpenSAFELY COVID-19 Service; TPP is the data processor; all study authors using OpenSAFELY have the approval of NHS England.(1) This implementation of OpenSAFELY is hosted within the TPP environment which is accredited to the ISO 27001 information security standard and is NHS IG Toolkit compliant.(2)
Patient data has been pseudonymised for analysis and linkage using industry standard cryptographic hashing techniques; all pseudonymised datasets transmitted for linkage onto OpenSAFELY are encrypted; access to the NHS England OpenSAFELY COVID-19 service is via a virtual private network (VPN) connection; the researchers hold contracts with NHS England and only access the platform to initiate database queries and statistical models; all database activity is logged; only aggregate statistical outputs leave the platform environment following best practice for anonymisation of results such as statistical disclosure control for low cell counts.(3)
The service adheres to the obligations of the UK General Data Protection Regulation (UK GDPR) and the Data Protection Act 2018. The service previously operated under notices initially issued in February 2020 by the the Secretary of State under Regulation 3(4) of the Health Service (Control of Patient Information) Regulations 2002 (COPI Regulations), which required organisations to process confidential patient information for COVID-19 purposes; this set aside the requirement for patient consent.(4) As of 1 July 2023, the Secretary of State has requested that NHS England continue to operate the Service under the COVID-19 Directions 2020.(5) In some cases of data sharing, the common law duty of confidence is met using, for example, patient consent or support from the Health Research Authority Confidentiality Advisory Group.(6)
Taken together, these provide the legal bases to link patient datasets using the service. GP practices, which provide access to the primary care data, are required to share relevant health information to support the public health response to the pandemic, and have been informed of how the service operates.
NHS Digital. The NHS England OpenSAFELY COVID-19 service - privacy notice [Internet]. 2023 [cited 2023 Jul 5]. Available from: https://digital.nhs.uk/coronavirus/coronavirus-covid-19-response-information-governance-hub/the-nhs-england-opensafely-covid-19-service-privacy-notice
NHS Digital. NHS Digital. 2023 [cited 2023 Jul 5]. Data Security and Protection Toolkit. Available from: https://digital.nhs.uk/data-and-information/looking-after-information/data-security-and-information-governance/data-security-and-protection-toolkit
NHS Digital [Internet]. [cited 2023 Mar 6]. ISB1523: Anonymisation Standard for Publishing Health and Social Care Data. Available from: https://digital.nhs.uk/data-and-information/information-standards/information-standards-and-data-collections-including-extractions/publications-and-notifications/standards-and-collections/isb1523-anonymisation-standard-for-publishing-health-and-social-care-data
UK Department of Health and Social Care. GOV.UK. 2022 [cited 2023 Jul 5]. [Withdrawn] Coronavirus (COVID-19): notice under regulation 3(4) of the Health Service (Control of Patient Information) Regulations 2002 – general. Available from: https://www.gov.uk/government/publications/coronavirus-covid-19-notification-of-data-controllers-to-share-information/coronavirus-covid-19-notice-under-regulation-34-of-the-health-service-control-of-patient-information-regulations-2002-general--2
NHS Digital. NHS Digital. 2022 [cited 2023 Jul 5]. Secretary of State for Health and Social Care: COVID-19 Public Health Directions 2020. Available from: https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/directions-and-data-provision-notices/secretary-of-state-directions/covid-19-public-health-directions-2020
NHS Health Research Authority. Health Research Authority. [cited 2023 Jul 5]. Confidentiality Advisory Group. Available from: https://www.hra.nhs.uk/about-us/committees-and-services/confidentiality-advisory-group/
Data access and verification
Access to the underlying identifiable and potentially re-identifiable pseudonymised electronic health record data is tightly governed by various legislative and regulatory frameworks, and restricted by best practice. The data in OpenSAFELY is drawn from General Practice data across England where TPP is the Data Processor. TPP developers (CB, JC, JP, FH, and SH) initiate an automated process to create pseudonymised records in the core OpenSAFELY database, which are copies of key structured data tables in the identifiable records. These are linked onto key external data resources that have also been pseudonymised via SHA-512 one-way hashing of NHS numbers using a shared salt. DataLab developers and PIs holding contracts with NHS England have access to the OpenSAFELY pseudonymised data tables as needed to develop the OpenSAFELY tools. These tools in turn enable researchers with OpenSAFELY Data Access Agreements to write and execute code for data management and data analysis without direct access to the underlying raw pseudonymised patient data, and to review the outputs of this code. All code for the full data management pipeline—from raw data to completed results for this analysis—and for the OpenSAFELY platform as a whole is available for review at github.com/OpenSAFELY.
Author contributions
Conceptualization: AJW, LAT, JM, FG, SD; Data curation: ALS, RYP, AJW, SCJB, ID, CB, BG; Formal analysis: ALS, RYP, AJW, JT; Funding acquisition: BG; Investigation: ALS, AJW, RYP; Methodology: ALS, AJW, LAT, RYP, JM, FG, JT, KB; Project administration: AJW, LAT, AM, BG; Resources: AJW, ID, SCJB, BG; Software: ALS, RYP, AJW, JT, SCJB, ID, CB; Supervision: AJW, LAT, BMK, AM, BG; Validation: ALS, RYP, AJW; Visualisation: ALS, RYP, AJW; Writing - original draft: ALS, RYP, AJW; Writing - review & editing: All authors. All authors gave final approval of the version to be published and agree to be accountable for all aspects of the work. All authors confirm that they had full access to all the data in the study and accept responsibility to submit for publication.
Supplementary Files
Acknowledgements
We are very grateful for all the support received from the TPP Technical Operations team throughout this work, and for generous assistance from the information governance and database teams at NHS England and the NHS England Transformation Directorate
Footnotes
↵^Members of the collaboratives are listed in the Acknowledgements
- Reorder author list