Abstract
Seasonal variation in the age distribution of influenza A cases suggests that factors other than age shape susceptibility to infection. We ask whether these differences can be explained in part by protection conferred by childhood influenza infection, which has lasting impacts on immune responses to influenza and protection against novel influenza A subtypes (phenomena known as original antigenic sin and immune imprinting). Fitting a statistical model to data from studies of influenza vaccine effectiveness (VE), we find that primary infection appears to reduce the risk of medically attended infection with that subtype throughout life. This effect is stronger for H1N1 compared to H3N2. Additionally, we find evidence that influenza VE varies with both age and birth year, but not with the imprinting subtype, indicating that VE may be sensitive to particular exposure histories. The ability to predict age-specific risk might improve forecasting models and vaccination strategies to combat seasonal influenza.
Introduction
Seasonal influenza is a serious public health concern, resulting in over 100,000 hospitalizations and 4,000 deaths per year in the United States despite extensive annual vaccination campaigns (Reed et al., 2015). The rapid evolution of the virus to escape preexisting immunity contributes to the relatively high incidence of influenza, including in previously infected older children and adults. How susceptibility arises and changes over time in the host population has been difficult to quantify.
A pathogen’s rate of antigenic evolution should affect the mean age of the hosts it infects, and differences in the rate of antigenic evolution have been proposed to explain differences in the age distributions of the two subtypes of influenza A. Compared to H3N2, H1N1 disproportionately infects children (Caini et al., 2018; Khiabanian et al., 2009). It also evolves antigenically more slowly (Bedford et al., 2015). Thus, compared to H3N2, H1N1 is slower to escape immunity in individuals who have experienced prior infection (namely older children and adults), making them less susceptible to reinfection (Bedford et al., 2015; Beauté et al., 2015; Caini et al., 2018; Khiabanian et al., 2009). H3N2, in contrast, exhibits well known changes in antigenic phenotype that are expected to drive cases toward adults (Smith et al., 2004; Cobey and Koelle, 2008). Under this simple model, hosts previously infected with a subtype face equal risk of reinfection (on challenge) with an antigenic variant of that subtype.
The age distributions of influenza infections in exceptional circumstances—pandemics and spillovers of avian influenza— have shown unexpected variation that suggests potentially complex effects of prior infection. Excess mortality in some adult cohorts during the 1918 and 2009 H1N1 pandemics has been linked to childhood infection with particular subtypes (Gagnon et al., 2013; Worobey et al., 2014; Gagnon et al., 2018). In the post-2009 pandemic period, excess mortality and hospitalization in these cohorts was observed during H1N1-dominated seasons (Budd et al., 2019). Similarly, the subtypes circulating in childhood predict individuals’ susceptibility to severe zoonotic infections with avian H5N1 and H7N9, regardless of later exposure to other seasonal subtypes (Gostic et al., 2016). These patterns suggest that early influenza infections, and not prior infection per se, strongly shape susceptibility.
Early infections might also affect the protection conferred by influenza vaccination. Foundational work on the theory of original antigenic sin demonstrated that an individual’s immune response to influenza vaccination is biased toward antigens similar to those encountered in childhood (Davenport and Hennessy, 1956, 1957). In some cases, this may result in an extremely narrow antibody response focused on a single epitope (Davis et al., 2018). This phenomenon has been suggested to explain an unexpected decrease in vaccine effectiveness (VE) in the middle-aged in the 2015-2016 influenza season (Skowronski et al., 2017b; Flannery et al., 2018). More generally, it has been hypothesized that biases in immune memory can arise from both past infections and vaccinations, leading to variation in vaccine effectiveness that is sensitive to the precise history of exposures (Smith et al., 1999; Skowronski et al., 2017a).
To measure the effect of early exposures on infection risk and VE, we fitted statistical models to 3493 influenza cases identified through seasonal studies of influenza VE from the 2007-2008 to 2017-2018 seasons in Marshfield, Wisconsin (Belongia et al., 2009, 2011; Griffin et al., 2011; Treanor et al., 2012; Ohmit et al., 2014; McLean et al., 2014b; Gaglani et al., 2016; Zimmerman et al., 2016; Jackson et al., 2017; Flannery et al., 2018). Each season, individuals in a defined community cohort were recruited and tested for influenza when seeking outpatient care for acute respiratory infection. Eligibility was restricted to individuals >6 months of age living near Marshfield who received routine care from the Marshfield Clinic. After obtaining informed consent, a mid-turbinate swab was obtained for influenza detection. RT-PCR was performed using CDC primers and probes to identify influenza cases, including type and subtype.
We sought to explain the variation in the age distribution of these cases by subtype and over time. Our model predicted the relative number of cases of influenza in each birth year each season as a function of the age structure of the population, age-specific differences in the risk of medically attended influenza A infection, early influenza infection, and vaccination. Despite the extensive antigenic evolution in both subtypes over the study period, we found strong evidence of protection from the subtype to which a birth cohort was likely first infected (the imprinting subtype) and variation in VE by birth cohort.
Results
The age distribution of cases varies significantly between seasons and subtypes
We examined the age distribution of cases of the dominant (most common) subtype in each season between 2007-2008 and 2017-2018 among enrolled patients. We excluded the subdominant subtype in each season due to concerns that short-term interference between the subtypes (Laurie et al., 2015; Goldstein et al., 2011) would disproportionately affect the age distribution of the rarer subtype. Differences between all pairs of seasons were evaluated by the G-test of independence and corrected for multiple tests (Materials and Methods: “Calculating differences in the age distribution between seasons”).
The age distribution of cases varies significantly between subtypes. The relative burden of cases is consistently higher in people over 65 years old during H3N2-dominated seasons compared to H1N1-dominated seasons (Figure 1), and nearly all H1N1-dominated seasons have significantly different age distributions from all H3N2-dominated seasons (Figure 1-Supplement 1, off-diagonal quadrants).
The age distribution also varies significantly within subtypes over time (Figure 1-Supplement 1, diagonal quadrants). The seven H3N2-dominated seasons display three types of age distributions (Figure 1-Supplement 1, white patches in upper left-hand quadrant), and two correspond to major antigenic clusters (2007-2008 Fonville et al., 2015, 2010-2012 Ann et al., 2012). These differences sometimes coincide with significant shifts in the age distribution between seasons. For instance, the highest fraction of H3N2 cases occurs in 20-29 year olds in the 2007-2008 season, but this age group has the lowest fraction of cases in the next H3N2-dominated season (2010-2011, Figure 1C). In H1N1, the shift from seasonal to pandemic strains is associated with a significant change in the age distribution (Figure 1-Supplement 1, lower right-hand quadrant). The high fraction of cases among 40-64 year-olds in the 2013-2014 season (Figure 1B) has been attributed to the emergence of strains to which this group was especially susceptible (Linderman et al., 2014; Petrie et al., 2016).
We found further evidence that age groups differed in their susceptibility across seasons by examining the relative risk of infection during the first versus second half of each epidemic period (Materials and Methods: “Calculating relative risk”). Because more susceptible populations experience higher attack rates, individuals in these populations should be infected disproportionately early rather than late in an epidemic (Worby et al., 2015). We confirmed that an age group’s relative risk of infection in the first versus the second half of each epidemic correlates with the total fraction of cases in that age group that season (Spearman’s p=0.41, p=0.001, Figure 1-Supplement 2A). This trend is significant for H1N1 (Spearman’s p=0.47, p=0.02, Figure 1-Supplement 2A) and H3N2 seasons separately (Spearman’s p=0.35, p=0.05, Figure 1-Supplement 2A). The positive correlation in all seasons is robust to undersampling of cases at the start or end of specific seasons (Materials and Methods: “Sensitivity to sampling effort”, Figure 1-Supplement 2B). This provides supporting evidence that the different numbers of cases in each age group reflect underlying differences in susceptibility.
Just as the age distribution of cases varies over time, the age groups with the highest relative risk of infection, and by implication susceptibility, also change across seasons. For instance, 5-17 year olds had the highest relative risk of early infection in the 2008-2009 season, whereas 50-64 year-olds had the highest relative risk in the 2013-2014 season (Figure 1-Supplement 3). Relative risk in Marshfield is considerably more variable than national estimates, which showed that 5-17 year-olds had the highest relative risk in all but one season from the 2009 pandemic to 2013-2014 (Worby et al., 2015). These differences may be due in part to the fact that our measurements of relative risk used outpatient visits, whereas the national estimates used hospitalizations.
Taken together, these findings suggest that the risk of influenza infection is not a simple function of age alone. Other factors, such as past influenza infections and vaccination, might explain the changing age distributions of cases in time.
Imprinting probabilities of age groups change over time
We hypothesized that variation in the age distribution of cases could be explained by the aging of birth cohorts with similar early exposure histories. This would cause the early exposure history of an age group to change in time. To calculate the probability that an individual in a particular age group had their first influenza A infection with a particular subtype, we adapted the approach from Gostic et al., 2016. Briefly, we calculated the probability that an individual born in a specific year had their first infection with H1N1, H2N2, or H3N2 using data on relative epidemic sizes and the frequencies of circulating subtypes (Figure 2-Supplement 1).
As expected, age groups’ early exposures are not static and change over time (Figure 2). Older people nonetheless tend to be imprinted to H1N1 or H2N2, whereas younger people have higher probabilities of imprinting to H3N2. The effects of the 2009 H1N1 pandemic are evident in the three youngest age groups as a transient increase (from 2009 to approximately 2013) in their H1N1 imprinting probability.
Modeling approach
We fitted a set of models to estimate the effects of demography, age, imprinting, and vaccination on the age distribution of influenza cases. The number of observed cases in influenza season t among people born in year y is proportional to a combination of these factors:
Demography. The age distribution of our study cohort is not static over the study period. All models adjust for the changing fractions of the population in each birth cohort and season (Figure 3-Supplement 1, Materials and Methods: “Demography”).
Age-specific effects. We consider that age itself may be associated with differences in influenza A infection risk stemming from differences in susceptibility and/or rates of contact with infectious people. Additionally, we expect that age groups may intrinsically differ in their healthcare-seeking behaviors. These factors are inseparable in our data, and all models represent their combined effects with a static age-specific parameter shared by both subtypes that describes the risk of age-specific medically attended influenza A infection (Materials and Methods: “Age-specific factors”). Thus, we assume no intrinsic differences in the age-specific virulence of the two subtypes. These age-specific parameters are fitted. We also adjust for other potential sources of age-specific bias, including age-specific differences in study approachment and enrollment rates (Materials and Methods: “Age-specific factors”).
Imprinting. We tested several hypotheses of how primary exposures could affect the risk of infection with H1N1 and H3N2. In each version, we estimated fractional reductions in risk to H1N1 and H3N2 due to primary (i.e., imprinting) exposure to the same type:
Subtype-specific imprinting: Influenza has two main antigens, hemagglutinin (HA) and neuraminadase (NA). Imprinting could in theory derive from responses to either or both antigens. Because H1N1 is the only seasonal subtype of influenza with N1, we cannot separate the effects of initial N1 exposure from initial H1 exposure. However, since N2 appears in both H3N2 and H2N2 viruses, we can estimate protection against H3N2 infection from initial N2 exposure separately from protection from initial H3 exposure (Materials and Methods: “HA subtype imprinting” and “N2 imprinting”).
Group-level imprinting: Influenza A viruses fall into two groups (I and II) corresponding to the two phylogenetic clades of HA. Gostic et al., 2016 found that primary infection by a virus belonging to one group protected against severe infection by another subtype in the same group. If group-level imprinting were influential, we would see primary infection with H2N2 conferring protection against H1N1, another group I virus, as well as H1N1 protecting against H1N1 and H3N2 against H3N2. We considered a separate class of models that assumes group-level protection instead of subtype-specific protection (Materials and Methods: “HA group imprinting”).
Vaccination. Approximately 45% of the population of Marshfield is vaccinated against influenza each year. We estimated cases in vaccinated and unvaccinated individuals of each birth year separately. Naively, we expect that vaccinated individuals should seek medical attention for acute respiratory infection (ARI) proportionally to the fraction of their cohort vaccinated that season. However, vaccinated individuals may seek medical attention for ARI more frequently than expected due to positive associations between the decision to vaccinate, healthcare-seeking behavior, and underlying medical conditions (Jackson et al., 2005a,b; Belongia et al., 2009). We attempted to adjust for this by calculating the fraction of vaccinated people among those who had a medically attended acute respiratory infection (MAARI) and tested negative for influenza (i.e., the test-negative controls, Materials and Methods: “Vaccination”). We find that this correlates with but exceeds vaccination coverage for most age groups, suggesting vaccinated individuals are overrepresented among cases for reasons unrelated to influenza (Figure 3-Supplement 2). We also assume that vaccination is not perfectly effective, defining VE as the fractional reduction in cases expected in vaccinated compared to unvaccinated individuals after controlling for the effects described above. We estimated subtype-specific VE under five scenarios: (i) constant across age groups and seasons; (ii) season-specific and constant across age groups; (iii) age-specific and constant across seasons; (iv) imprinting-specific; and (v) birth-cohort-specific. We assumed that vaccination affects risk only in the current season, i.e., there are no residual effects from prior vaccination (Materials and Methods: “Vaccination”).
With these considerations, we evaluated the models by maximum likelihood and compared their performance using the corrected Akaike information criterion (cAIC, Figure 3).
Age-specific differences in medically attended influenza A infection risk affect epidemic patterns
As expected, the cases reveal age-specific differences in the risk of medically attended influenza A infection. (Figure 4; Figure 4-Supplement 1; Appendix 1 Table 1). The risk of medically attended influenza A infection is roughly threefold higher among children less than four years old compared to adults 20-29 years old, after adjusting for other effects (Figure 4). This decline in risk with age is consistent with findings that attack rates decrease with age (Monto et al., 1985; Bodewes et al., 2011; Wu et al., 2010, 2017; Huang et al., 2019). Additionally, rates of healthcare-seeking behavior have been shown to decline with age before rising in adults over 65 years old (Biggerstaff et al., 2014; Brooks-Pollock et al., 2011; Van Cauteren et al., 2012), consistent with our results. Finally, the increased risk of medically attended influenza A infection among people ≥. 65 years old relative to other adults may be related to the increasing prevalence of high-risk medical conditions with age (Figure 4-Supplement 2).
Initial infection confers long-lasting, subtype-specific protection against future infection
Our best-fitting model supports subtype-specific imprinting for H1N1 and H3N2 (Figure 5, top row; Appendix 1 Table 1). The risk of future medically attended infection by H1N1 is reduced by 66% (95% CI 53-77%) in people imprinted to H1N1, whereas the risk of future medically attended infection by H3N2 is reduced by 33% (95% CI 17-46%) in people imprinted to H3N2. We found no evidence of a protective effect from imprinting to N2 (0%, 95% CI 0-7%). Our estimates of imprinting protection are insensitive to our choice of age groups for medically attended influenza A infection risk and VE (Appendix 1 Table 3) as well as undersampling of influenza cases in some seasons (Figure 5-Supplement 1, Figure 5-Supplement 2, Materials and Methods, “Sensitivity to sampling effort”).
We also tested whether vaccination is a plausible mechanism of imprinting (Figure 5-Supplement 3, Materials and Methods, “Calculating imprinting probabilities”) and found that primary exposure via vaccination provided similar protection as imprinting from primary infection (100% of the effect of primary infection, 95% CI 74-100%, Figure 5-Supplement 4, Materials and Methods, “Vaccine imprinting”). However, the estimated protection could be confounded with residual protection from prior season vaccination (Ohmit et al., 2014; McLean et al., 2014a,b) which the model excludes.
In theory, the protective effects of imprinting that we measured could be influenced by cross-protection rather than the impact of first infection per se. Because first infections are also recent infections in children, we reasoned that the observed imprinting effects might arise from confounding with recent infections in these ages. When we excluded the youngest age groups, our estimates of H1N1 imprinting protection decreased while H3N2 imprinting protection increased (Figure 5, second row). However, initial infection by H1N1 was still more protective than initial infection by H3N2, both imprinting effects remained significantly positive, and there was no significant change in the values of other estimated parameters (Appendix 1 Table 1 and Table 2).
We expect that confounding with recent infection should also manifest in the difference between the observed and estimated number of cases (i.e., the excess cases, Materials and Methods: “Calculating excess cases”), since our model does not take prior season infections into account when estimating cases for the current season. More infections within a population in one season should reduce susceptibility in that population at the start of the next season. We thus expect that a large number of excess cases in one season will be followed by a small number of excess cases in the next season with the same dominant subtype (i.e., a negative correlation). Instead, we observed that excess cases for each birth cohort have a weak positive correlation from season to season, suggesting that immunity from recent infections is not a primary driver of variation in the age distribution of cases (Figure 5-Supplement 5).
Since older adults have the highest probability of primary infection with H1N1, we also reasoned that older adults might disproportionately drive the strong protection from H1N1 imprinting we observe. People born before 1947 were likely exposed to H1N1 strains that are antigenically similar to the post-pandemic H1N1 strains that comprise most of our H1N1 infection data (Manicassamy et al., 2010; O’Donnell et al., 2012), creating the possibility that strain-specific cross-immunity drives the pattern we attribute to subtype-specific imprinting. Excluding the oldest adults, however, does not significantly change our estimates of imprinting protection or other parameters (Figure 5, third row, Appendix 1 Table 1, Table 2). When we exclude both the youngest and oldest age groups, initial infections by H1N1 and H3N2 have similar protective effects (Figure 5, bottom row). This shows that the combined effects of cross-protection in both the youngest and oldest individuals contribute to the signal of imprinting protection we observe, but they are not its sole drivers.
VE varies by birth cohort in older children and adults
The best-fitting model includes age-specific VE (Figure 4-Supplement 1, Appendix 1 Table 2). While serological responses to influenza vaccination are weakest in the young (Englund et al., 2005; Neuzil et al., 2006) and old (Lee et al., 2018; DiazGranados et al., 2014), it is unclear what age-related factors would drive variation in VE in other age groups. Wehypothesized that VE in these ages is specific to exposure history, which correlates with birth year, rather than age.
To test this hypothesis, we fitted a model with birth-cohort-specific VE to data excluding either children <10 years old or adults ≥. 65 years old. We chose birth cohorts that corresponded to the age groups of the original model in 2017-2018 (Materials and Methods: “Vaccination”), keeping the number of parameters the same (e.g., VE in the 20-29 age group became VE in the 1988-1997 birth year cohort). We find that age-specific VE still outperforms all other models after we exclude the oldest age group (≥.65 years old). In contrast, birth-cohort-specific VE performs better when we exclude children <10 years old (Figure 6-Supplement 1). Estimates of imprinting protection and age-specific risk of medically attended influenza in the birth-cohort-specific VE models are not significantly different from estimates from the best-fitting model fitted to all ages (Appendix 1 Table 1). Taken together, these results suggest that birth-cohort-specific VE best explains the case distribution in older children and adults, who have likely experienced their first influenza infection, whereas age-specific VE best explains cases in younger children, who have less influenza exposure.
VE differs between birth cohorts that have similar imprinting by subtype (Figure 6, Appendix 1 Table 4), suggesting that specific infection history (beyond imprinting subtype) is important. For example, the 1968-1977 and 1988-1997 cohorts have similar probabilities of primary exposure to H1N1 and H3N2, but they differ substantially in their VE to both subtypes (Figure 6). The 1988-1997 and 1998-2002 cohorts also have similar probabilities of primary exposure to each subtype and have similar H1N1 VEs, but have significantly different H3N2 VEs (Figure 6). Antigenic differences within each subtype might explain this variation.
Our results support the idea that biases in immune memory from early exposures (i.e., original antigenic sin; Davenport and Hennessy, 1957; Francis, 1960; Groth and Webster, 1966) influence VE. The model with birth-cohort-specific VE better estimates cases among vaccinated 50-64 year-olds (born 1953-1967) in the 2015-2016 season than the model with age-specific VE (Figure 6-Supplement 2, Materials and Methods: “Calculating excess cases”). Reduced VE in this age group has been attributed to the exacerbation of antigenic mismatch by the vaccine in adults whose antibody responses were focused on a non-protective site (Skowronski et al., 2017b; Flannery et al., 2018). The improved performance of birth-cohort-specific VE relative to age-specific VE suggests other seasons and age groups where original antigenic sin might have influenced VE, such as 20-29 year-olds in the 2007-2008 influenza season.
Discrepancies partly explained by antigenic evolution
The best-fitting model accurately reproduces the age distributions of vaccinated and unvaccinated cases of each subtype, aggregated across seasons (Figure 7A). The only exception is that it underestimates H1N1 cases in unvaccinated 5-9 year-olds. By examining the differences between predicted and observed cases for each season, we see that this is largely driven by infection during the 2009 H1N1 pandemic (Figure 7B). Such a large antigenic change may have negated any protection from previous infection in 5-9 year-olds and made them particularly susceptible to pandemic infection.
The model underestimates cases in unvaccinated individuals >30 years old in the 2013-2014 season. This is further evidence that subtype-specific imprinting cannot explain all age variation. As mentioned before, this season provided one of the first examples that original antigenic sin could affect protection: middle-aged adults had been targeting a familiar site on the pandemic strain that then mutated; other age groups were effectively blind to these changes, owing to their different exposure histories (Linderman et al., 2014; Huang et al., 2015; Arriola et al., 2014; Dávila et al., 2014; Petrie et al., 2016).
Discussion
The distribution of influenza cases by birth year is consistent with subtype-level imprinting, whereby initial infection with a subtype protects against future infections by the same subtype. The stronger protective effect observed for primary H1N1 infection compared to primary H3N2 infection may be caused by greater cross-protective responses to conserved epitopes. This is in line with previous work modeling antibody titer dynamics that showed that protection conferred by H1N1 infection is longer-lasting than protection conferred by H3N2 infection (Ranjeva et al., 2019). Subtype-specific protection is more specific than the previously reported group-level imprinting (Gostic et al., 2016) but clearly arises from primary infection rather than any prior exposure.
In contrast to the clear role of the imprinting subtype in protection from infection, the model implicates the imprinting strain or other attributes of exposure history in VE. Birth-cohort-specific VE predicts the distribution of cases in older children and adults better than age-specific or imprinting-subtype-specific VE. Although seasonal estimates of VE routinely stratify by age, shifts in VE from one season to the next might be easier to interpret in light of infection history (e.g., Skowronski et al., 2017b; Flannery et al., 2018). The results suggest this effect may be complex, i.e., influenced by strains’ specific identities rather than merely their subtype. Our model cannot distinguish between the possibility that the precise identity of the imprinting strain primarily determines later VE, or if individuals’ responses to vaccination are shaped by a particular succession of exposures, which will be common to others in the same birth cohort. Regardless, variation in VE between birth cohorts appears substantial and suggests a role for past exposure in the effectiveness of vaccination. This presents a challenge for the improvement of vaccination strategies (Erbelding et al., 2018).
Biases associated with our methodology and the vaccination history of our study population may confound our estimates of VE. Potential selection and misclassification biases are associated with studies that use influenza test-negative controls to control for differences in healthcare-seeking behavior (Lewnard et al., 2018; Sullivan et al., 2016). Because we also use test-negative controls to set our null expectation for the distribution of cases among birth cohorts, our VE estimates are subject to these biases as well. Moreover, our study population is heavily vaccinated, and the most participants are frequent vaccinees (Figure 3-Supplement 3). Frequent vaccination has been associated with reduced VE (McLean et al., 2014b; Saito et al., 2018; Skowronski et al., 2016). Therefore, the model may underestimate VE in less vaccinated populations. We observed an unusually high H1N1 VE in the 2003-2006 birth cohort. Because we restricted cases in this analysis to people ≥.10 years old, this VE estimate included data from only the 2013-2014 and 2015-2016 influenza seasons. No H1N1 cases among vaccinated or unvaccinated individuals were observed in this birth cohort for those seasons, which in turn led to this high estimate of H1N1 VE. To reduce stochastic effects, our estimates are worth repeating in a larger population.
Incorporating differences in susceptibility based on exposure history might improve methods to forecast influenza seasons. Our analysis of the relative risk of infection during the first half of each season shows more variation in the most susceptible age groups from season to season than previously estimated (Worby et al., 2015). While the smaller sample sizes in Marshfield compared to national data create uncertainty in our estimates, the correlation between the relative risk and total fraction of cases indicates that the age groups driving epidemics change from season to season. As our results show, these differences in susceptibility may derive from differences in exposure history. Therefore, incorporating information on exposure history into epidemic models may allow for more accurate identification of at-risk populations.
While the rate of antigenic evolution affects the rate at which different populations become susceptible to infection, the heterogeneity in susceptibility we observe here may also drive antigenic evolution. This heterogeneity in susceptibility implies that influenza viruses face different selective pressures in groups with different exposure histories (Cobey and Hensley, 2017). Recent research consistent with this hypothesis has shown that sera isolated from different individuals can select for distinct influenza escape mutants (Lee et al., 2019). More careful study of how immune memory to influenza evolves from infection and vaccination might improve understanding of influenza’s evolution.
Materials and Methods
Study cohort
Cases of PCR-confirmed, medically attended influenza were identified from annual community cohorts based on residency in the Marshfield Epidemiologic Study Area (MESA) in central Wisconsin. MESA is a 14-ZIP-code geographic area surrounding Marshfield, Wisconsin, where nearly all residents receive outpatient and inpatient care from the Marshfield Clinic Health System. For each influenza season from 2007-2008 through 2017-2018, we identified a subset of MESA residents >6 months of age who received routine care from the Marshfield Clinic. These individuals were eligible for recruitment into a VE study if they sought care for acute respiratory illness during each influenza season. Most patients with MAARI were recruited in the outpatient setting, but inpatient recruitment also occurred in 2007-08 and 2008-09. Recruitment occurred in primary care departments, including urgent care, pediatrics, combined internal medicine and pediatrics, internal medicine, and family practice. The proportion of patients with MAARI who were screened for enrollment varied by season. We excluded patients recruited in an inpatient (hospital) setting.
Each season, recruitment began when influenza activity was detected in the community and usually continued for 12-15 weeks. Symptom eligibility criteria varied by season but included fever/feverishness or cough during most seasons. We retroactively standardized symptom eligibility criteria to only require cough as a symptom. Individuals with illness duration >7 days were excluded. After obtaining informed consent, a mid-turbinate swab was obtained for influenza detection. RT-PCR was performed using CDC primers and probes to identify influenza cases, including type and subtype.
The Marshfield Clinic generally does not capture MAARI in nursing facilities with dedicated medical staff, causing undersampling of the oldest age groups. We adjusted for this (“Age-specific factors” below).
We considered subjects vaccinated if they received that season’s influenza vaccine ≥.14 days before enrollment. For the 2009-2010 season, we only considered receipt of the 2009 monovalent vaccine.
Calculating differences in the age distribution between seasons
We defined the age distribution of each season as the number of cases of the dominant subtype in each of nine age groups (0-4 year-olds, 5-9 year-olds, 10-14 year-olds, 15-19 year-olds, 20-29 year-olds, 30-39 year-olds, 40-49 year-olds, 50-64 year-olds, and >64 years old). The G-test of independence was used to determine whether each pair of seasons had significantly different age distributions. We considered differences significant if the Bonferroni-corrected p-value was <0.05.
Calculating relative risk
We used an approach similar to Worby et al., 2015 in calculating relative risk. We defined the midpoint of each season as the week in which the cumulative number of cases of the dominant subtype exceeded 50% of the total for that season. Weeks before and after this point were assigned to the first and second half of the season, respectively. We assigned each case to one of the five age groups used by Worby et al., 2015 (0-4 year-olds, 5-17 year-olds, 18-49 year-olds, 50-64 year olds, and >64 years old). For each age group g, we defined relative risk as where Cfirst,t,g and Csecond,t,g are the fraction of cases of the dominant subtype in age group g during influenza season t that occurred during the first or second half of the season, respectively. A relative risk >1 indicates that cases in an age group were more likely to occur during the first half of the season.
Calculating imprinting probabilities
Seasonal intensity
We define the intensity of an influenza season as the product of the mean fraction of patients with influenza-like illness (ILI) and the percentage of specimens testing positive for influenza A that season, where ILIt is the mean fraction of all patients with ILI in season t adjusted for differences in state population size (CDC, 2018), Ft is the number of respiratory specimens testing positive for influenza A in season t, and Nt is the total number of respiratory specimens tested in season t. For seasons 1997-1998 through 2017-2018, these data were obtained from the U.S. Outpatient Influenza-like Illness Surveillance Network (ILINet) and the World Health Organization/National Respiratory and Enteric Virus Surveillance System (WHO/NREVSS) Collaborating Labs (CDC, 2018). For seasons 1976-1977 through 1996-1997, we assumed that the mean ILI was equal to the mean of mean ILI for seasons 1997-1998 through 2017-2018. We obtained data on Ft and Nt for these seasons from Thompson et al., 2003. We then normalized the intensity of each season by dividing It by the mean of It from the 1976-1977 through 2017-2018 seasons. For all seasons before 1976-1977, we assumed that the intensity of influenza A equalled the mean intensity of seasons 1976-1977 through 2017-2018.
Fraction of season experienced
We define the fraction of a given influenza season fw,t occurring in week w of season t as where ILIw,t is the weighted fraction of all patients with ILI in week w of season t, Fw,t is the number of respiratory specimens testing positive for influenza A in week w of season t, and Nw,t is the number of specimens tested in week w of season t. is the product of ILI and the fraction of positive influenza A specimens summed over all weeks of the influenza season t, where w0 is the first week of the season and wf is the final week of the season. We define the start of the influenza season as week 40 of the calendar year, which usually falls at the beginning of October. For seasons before 1997-1998, where weekly data is unavailable, we assume that the fraction of the influenza season experienced in week w is where is the mean fraction of the influenza season experienced at week w for all seasons after 1997-1998.
We use fw,t to calculate the fraction of an influenza season experienced by an individual born in year y. We assume that people born in year y are born randomly throughout the year. We also assume that due to maternal immunity, infants do not experience immunizing exposure to influenza until they are at least 180 days old. Let py,w,t be the proportion of individuals born in year y that are over 180 days old in week w of season t and ey,t be the fraction of individuals born in year y exposed to influenza season t. Then
Imprinting probability
We emulate the approach of Gostic et al., 2016 in calculating the probability that people born in a particular year had their initial influenza exposure to a particular subtype.
To obtain imprinting probabilities, we calculate the probability that an individual born in year y receives their first influenza A exposure in influenza season t. Specifically, we consider two possible scenarios. First, we assume that only infections result in an imprinting exposure. Second, we modify our calculation to include the possibility that both vaccination and infection result in an imprinting exposure.
We set the probability of infection for naive individuals at 0.28 (Bodewes et al., 2011; Gostic et al., 2016). Using this probability, we can calculate a per-season attack rate a assuming an exponential hazard: We then scale this attack rate by the intensity of influenza season t (It) and the fraction of influenza season t experienced by an individual born in year y (ey,t, “Seasonal intensity” above). The probability that a naive individual born in year y is infected in influenza season t is Considering only infection, We calculate subtype-specific imprinting probabilities by multiplying py,tN(t) by the subtype frequencies for each season (Figure 2-Supplement 1).
To incorporate vaccination, we make a simplifying assumption that for all seasons except the 2009 pandemic and the 2009-2010 influenza seasons (discussed below), vaccination occurs before infection. We also consider that given vaccination coverage for a particular birth cohort and season (cy,t), only a fraction of those individuals will be receiving their first vaccination because people who get vaccinated are more likely to get vaccinated again. We calculated this probability of first vaccination by age (fa,t) using the vaccination status of children enrolled in our study (Figure 5-Supplement 7).
We track the fraction of a birth cohort naive to any exposure (N(t) as above) and the fraction of a birth cohort naive to vaccination (Nv(t)). Therefore, to calculate imprinting probabilities, we first consider vaccination: Then, we update N(t) to reflect vaccination and use this new value of N(t) to calculate the fraction of people infected: During the 2009 H1N1 pandemic and 2009-2010 seasons, infection, vaccination with the seasonal vaccine, and vaccination with the monovalent vaccine occurred simultaneously. Therefore, we used weekly rates of vaccination and infection to estimate the probability that an individual’s first exposure for that season was infection, seasonal vaccination, or monovalent vaccination.
Vaccination coverage
Seasonal influenza vaccination coverage for MESA Central was collected by age in the 2007-2008 through 2017-2018 seasons using a real-time immunization registry (Irving et al., 2009). Monovalent vaccination coverage for the 2009-2010 season was obtained by directly measuring monovalent vaccination coverage in enrolled individuals and fitting a smoothing spline to the data (Figure 5-Supplement 6).
For seasons before 2007-2008, we used U.S. national data on vaccination coverage in children (2002-2003 through 2003-2004; Santibanez et al., 2006, 2004-2005 through 2006-2007; Santibanez et al., 2014). We assumed that vaccination coverage in children (i.e., potentially imprinting vaccination) was 0 before the 2002-2003 season, since the that was the first season in which the Advisory Committee on Immunization Practices encouraged children 6-23 months old to receive influenza vaccination (Bridges et al., 2002).
Model components
We aim to infer ps,t,y,v, the predicted fraction of all PCR-confirmed influenza cases of dominant subtype s in influenza season t among people born in year y with vaccine status v.
We normalize all models such that for each season t, . Let be the unnormalized proportions. Then for season t, For convenience, let kM,t, the normalizing constant for season t in model M, be
Demography
We used Marshfield-specific data on the age distribution for each season (Kieke et al., 2015). Individuals ≥.90 years old were grouped into a single age class. We therefore estimated the number of people in each age by assuming a geometric decline in the age distribution. We converted the raw age distribution for each season into a distribution by birth year by distributing people of a specific age into the two possible birth years of that age in a specific season. Specifically, we assumed that people were born uniformly throughout the year. We defined a breakpoint date prior to the start of the enrollment period based on when the the 6 month-old age limit cutoff was set (e.g., if the breakpoint date was Ocotober 1, then infants had to be 6 months old by that date to be eligible for enrollment). We used this date to calculate the fraction of people of age a in season t who were born in year t − y (f1,a,t) or year t − y − 1 (f2,a,t). A fraction f1,a,t of the total population of age a in season t was assigned to birth year t − y and f2,a,t to t − y − 1. Breakpoint dates ranged from September 1 through January 1 with the exception of the pandemic season which had a breakpoint date of May 1, 2009. The start of the enrollment period ranged from December to January with the exception of the 2009 pandemic season, when enrollment began in May 2009. For the 2009 pandemic season, we assumed that the age distribution was the same as the 2008-2009 season. The above procedure allows us to calculate Dt,y, the fraction of people born in year y during influenza season t. Therefore,
Age-specific factors
We modeled age-specific differences in influenza infection risk and healthcare-seeking behavior by using parameters that represent the relative risk of medically attended influenza A infection in each age group. These parameters combine the effects of underlying age-specific differences in influenza A infection risk as well as age-specific differences in healthcare-seeking behavior. We consider the same age groups as before (0-4 year-olds, 5-9 year-olds, 10-14 year-olds, 15-19 year-olds, 20-29 year-olds, 30-39 year-olds, 40-49 year-olds, 50-64 year-olds, and >64 years old). We choose 20-29 year-olds as our reference age group. All age groups aside from 20-29 year-olds have an associated parameter that models their risk of medically attended influenza A infection relative to 20-29 year-olds. These parameters can take on any positive value. To map these age-specific parameters to birth cohorts, we consider that each birth cohort has two possible ages in each season (a1 and a2). Let (a) be a function that specifies the age group g of a given age a. Then t,y the age-specific risk of medically attended influenza A infection for a person born in year y in season t is where fa1,t,y and fa2,t,y are the fractions of birth cohort y who are age a1 or a2 in influenza season t, and (a1) and (a2) are the age-group-specific parameters for a1 and a2. With this, we model age-specific effects as The relative rates at which different age groups were approached for study enrollment (the approachment rate, papproach) varied between seasons. Similarly, the relative rates at which different age groups enrolled in the study after being approached (the enrollment rate, penroll) also varied between seasons. Enrollment rates also varied between vaccinated and unvaccinated individuals.
We defined the approachment rate of an age group g in season t as where Napproached,t,g is the number of people in age group g during season t who were approached for enrollment, and NMAARI,t,g is the total number of people in the Marshfield cohort who presented with MAARI regardless of whether they were approached for enrollment.
We defined the enrollment rate of age group g in season t with vaccination status v as where Nenrolled,t,g,v is the number of people in age group g with vaccination status v who enrolled in the study in season t, and Napproached,t,g,v is the number of people in age group g with vaccination status v who were approached for enrollment in season t. Due to differences in data collection for the 2007-2008 and 2008-2009 seasons, complete vaccination records for eligible unenrolled individuals were not available, so we assumed that the enrollment rates by age group and vaccination status in those seasons were equal to the mean enrollment rate for each age group and vaccination status across all other seasons.
We normalized papproach,t,g by the value of papproach,t,g for the reference age group (i.e., 20-29 year-olds) in each season. Similarly, we normalized penroll,,t,g,v to the value of penroll,,t,g,v for unvaccinated members of the reference age group for each season. This yielded the relative approachment and enrollment rates and . We converted both and to birth-year specific covariates (i.e. covariates by y instead of g) using the same procedure described above for the estimated age-specific parameters.
Finally, the study did not enroll residents of skilled nursing facilities with dedicated medical staff. To account for this, we estimated the proportion of the population in nursing facilities within the study area. We obtained the total number of beds in nursing facilities within the Marshfield study area in 2018 from the Wisconsin Department of Health Services (WDHS, 2018). We assumed that the total number of beds did not change between 2007-2008 and 2017-2018. We also used data from the Centers for Medicare and Medicaid Services (CMS, 2015) to calculate the percent of beds occupied in Wisconsin nursing facilities by age for 2011 through 2014 and the fraction of people in a nursing facility by age group. We used a smoothing spline to obtain the fraction of people of a given age in a nursing facility. For seasons before 2010-2011 and after 2013-2014, we assumed that the fraction of people of a given age in a nursing facility was the average value for 2011-2014. Given the total population of the study area by age and season, we could then calculate the fraction of people in a given age a and season t who are in nursing facilities (st,a). We convert this to a covariate by birth year (st,y) using the same procedure described above for the age-specific parameters.
Thus, the combination of estimated age-specific effects and age-specific covariates is modeled as
Vaccination
Vaccinated individuals may seek healthcare for symptomatic influenza at a different rate than unvaccinated individuals. Moreover, because vaccines are routinely recommended for individuals with underlying health conditions, pre-existing susceptibility to acute respiratory infection among vaccinated individuals may also differ from unvaccinated individuals. Let Rt,g represent the fraction of vaccinated individuals in age group g in season t that present with MAARI. We use test-negative controls to estimate this as where and are the number of vaccinated or unvaccinated individuals born in year g presenting with MAARI and testing negative for influenza in season t. We compared this quantity to the vaccination coverage of age group g in season t, ct,g (Figure 3-Supplement 2).
We converted Rt,g to Rt,y (i.e., to a birth cohort-indexed covariate) using the same procedure described above to convert age group-specific parameters to birth-cohort-specific parameters.
We tested five different VE schemes: subtype-specific VE that remained constant across seasons and cohorts (2 parameters), subtype-specific VE that varied between the age groups described above (18 parameters), VE that varied between seasons (12 parameters), VE for each possible imprinting subtype (6 parameters), and birth-cohort-specific VE (18 parameters). These VE parameters (V) reduce the probability of medically attended influenza A infection among vaccinated individuals within a birth cohort, i.e., where V depends on the specific implementation of VE used.
For constant VE, V = Vs = 1 − vs.
For season-specific VE, V = Vs,t = 1 − vs,t.
For age-specific VE, we use a similar approach as described above for the age-specific parameters. We use the same age classes but do not consider a reference age class, so that each age group has an associated VE parameter for each subtype. Therefore, where vG(a1),s and vG(a2),s are age-specific VE parameters for a1 and a2. Recall that the function G specifies an age group for a given age.
For imprinting-specific VE, we use the imprinting probabilities for each birth cohort described above such that where vs,z is the VE among people imprinted to subtype z against infection by dominant subtype s, and mz is the imprinting probability for subtype z in season t for birth cohort y.
For birth-cohort-specific VE, we defined nine birth cohorts corresponding to the nine age groups we used for the 2017-2018 season: 1918-1952, 1953-1967, 1968-1977, 1978-1987, 1988-1997, 1998-2002, 2003-2007, 2008-2012, and 2013-2017. Let Q(y) be the birth cohort of people born in year y. Then where vQ(y),s is the VE among people in cohort Q(y) against infection by dominant subtype s.
N2 imprinting
We consider that imprinting to N2 reduces a birth cohort’s risk of H3N2 infection. Therefore, where nm is the strength of N2 imprinting, and mH3N2,t,y and mH2N2,t,y are the imprinting probabilities of birth cohort y in season t to H3N2 and H2N2.
HA subtype imprinting
We consider that imprinting to HA reduces a birth cohort’s risk of future infection from the same HA subtype. Therefore, where hs is the strength of HA imprinting for subtype s. and ms,t,y is the imprinting probability of birth cohort y in season t to subtype s.
HA group imprinting
We consider that imprinting to HA reduces a birth cohort’s risk of future infection from the viruses within the same HA group. Therefore, where g1 is the strength of HA imprinting for group 1 viruses and g2 is the strength of HA imprinting for group 2 viruses.
Vaccine imprinting
We consider that imprinting via vaccination confers a fraction (x) of the protection conferred by infection. If x = 0, vaccination prevents imprinting via infection without protecting against infection in future seasons. If x = 1, vaccination imprints as well as infection. Because seasonal vaccines are polyvalent, we assume that imprinting via vaccination protects against both H1N1 and H3N2 infections. Imprinting via vaccination by the monovalent pandemic vaccine only protects against H1N1 infections. Therefore, for subtype-specific imprinting, where mv,t,y is the probability of imprinting via vaccination in season t for birth cohort y. Similarly, for group-specific imprinting, In models including vaccine imprinting, the imprinting probabilities for infection differ from the infection-only model. That is, we use the imprinting probabilities from Figure 5-Supplement 3 and not the probabilities from Figure 2. We assume that the protection conferred by imprinting via vaccination cannot exceed protection conferred by initial infection and therefore restrict x to lie between 0 and 1.
Model likelihood
Let ns,t,y,v be the number of PCR-confirmed influenza cases of dominant subtype s in influenza season t among people born in year y with vaccination status v. The total number of PCR-confirmed cases of dominant subtype s in season t is For models fitted to a restricted set of ages, we limited the cases for each season to the birth cohorts that were guaranteed to meet the age requirements in that season.
We aim to infer ps,t,y,v, the predicted fraction of all PCR-confirmed influenza cases of subtype s in influenza season t among people born in year y with vaccination status v.
For a specific model M, we consider all possible model components j described above (demography, age, vaccination, and imprinting). Then, where Mj indicates whether model M contains component j (e.g., for HA subtype imprinting, j = 1 − hsms,t,y).
The likelihood for season t is given by the multinomial likelihood, where ymax,t is the maximum birth year possible for a specific season t.
The full model likelihood for all observed seasons is We fitted the model to case data using the L-BFGS-B algorithm implemented in the R package optimx. We estimated 95% confidence intervals for parameters of the best-fitting model by evaluating likelihood profiles at 15 evenly spaced points and interpolating the entire profile using a smoothing spline.
Sensitivity analyses
Sensitivity to age groups
To test whether our models were sensitive to our choice of age groups, we fit revised versions of all our models with different age groups:
0-4 years, 5-17 years, 18-49 years, 50-64 years, and ≥.65 years
0-4 years, 5-17 years, 18-64 years, and ≥.65 years
These models with alternate age groupings were fitted to case data to determine whether our findings on the strength of protection from initial H1N1 and H3N2 infection significantly changed from our fits using the higher-resolution age grouping described above (Appendix 1 Table 3).
Sensitivity to sampling effort
Sampling effort was not even across seasons, and analysis of the number of influenza cases per sampling day suggested that a significant number of cases may have been missed at the beginning or end of a specific seasons (Figure 5-Supplement 1). As our analysis of relative risk indicates, different age groups are more susceptible during different points in the influenza season, and therefore missing data from the beginning or end of a season could introduce bias in the observed age distribution of cases.
To adjust for this, we simulated cases for seasons which did not have sufficient sampling of the start or end of the epidemic period. We considered a season sufficiently sampled if
the number of cases per sampling day in the first week of the enrollment period was <1 and
the number of cases per sampling day in the last week of the enrollment period was <1.
To extrapolate the start of a season, we linearly regressed the number of cases of the dominant subtype per sampling day for each week of the first half of the season and identified the week of the season where the number of cases per sampling day fell below 1 (t0). For each week from t0 to the first week of the enrollment period, we used the regression of cases per sampling day to calculate the number of cases we expected to see in each week. Summing these yields the total number of unsampled cases at the beginning of the season. We used a similar approach to extrapolate the number of unsampled cases at the end of a season by instead regressing cases per sampling day for each week of the latter half of the season. We did not extrapolate cases for the 2010-2011 season for this analysis since the observed number of cases per sampling day did not follow a typical epidemic curve.
We stochastically assigned a birth year and vaccination status to these cases according to a multinomial distribution. The success probabilities of this distribution were set using the age distribution of cases of the dominant subtype from the first two weeks of the enrollment period (if extrapolating the beginning of a season) or the last two weeks of the enrollment period (if extrapolating the end of a season). Specifically, we calculated the distribution of observed cases in the first or last two weeks of the enrollment period among nine age groups (described above in “Age-specific factors”) with their associated vaccination status. We then assumed that cases were uniformly distributed among all birth years contained in an age group. This yielded a set of probabilities describing the probability of infection given birth year and vaccination status in a specific season.
We sampled from these multinomial distributions 1000 times to obtain augmented datasets that combined observed and extrapolated cases. For each replicate simulation, we calculated the age distribution of cases for the entire season as well as the relative risk of each age group in the first versus the latter half of the season (Figure 1-Supplement 2B). We also fitted the best-fitting model to 100 of these datasets (excluding the 2010-2011 season) and recorded the estimated imprinting strength for both H1N1 and H3N2 for each fit (Figure 5-Supplement 2).
Calculating excess cases
We defined excess cases for a given birth cohort or age group as the number of observed cases for that birth cohort or age group minus the number of predicted cases for that age group. Predictions were obtained by multiplying the multinomial probabilities produced by the model by the total number of cases of the dominant subtype in each season. A 95% prediction interval was obtained by simulating 100 datasets using the multinomial probabilities from a specific model (Figure 6-Supplement 2, Figure 7).
To test whether recent infection might be confounding our estimates, we calculated the correlation between excess cases in each birth cohort in each season with excess cases of the same birth cohort in the next season with the same dominant subtype (Figure 5-Supplement 5).
Data Availability
Data for this study are available at www.github.com/cobeylab/FluAImprinting
Code and data availability
The code and data used to perform the analyses for this project are available at https://github.com/cobeylab/FluAImprinting.
Ethics
Human subjects: Study procedures for the vaccine effectiveness study was approved by the IRB at the Marshfield Clinic Research Institute. Informed consent was obtained from all participants at the time of enrollment into the vaccine effectiveness study. This analysis was subsequently approved by the IRB with a waiver of informed consent. The analysis of data was approved by the University of Chicago IRB.
Supplementary tables and figures
Acknowledgments
We thank Jennifer King and Carla Rottscheit for their assistance in providing the data for this study and Rohan Dandavati for compiling historical data on subtype frequencies and ILI. We thank Marcos Vieira and Kangchon Kim for their assistance in calculating imprinting probabilities. We also thank the study participants for their time. This work was completed with computational resources provided by the University of Chicago’s Research Computing Center. Funding for this project was provided by the National Institutes of Health (NIH), Department of Health and Human Services, under grant DP2AI117921 (to SC) and CEIRS Contract No. HHSN272201400005C (to SC). HQM receives research support from Seqirus unrelated to this work. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.