Abstract
Current models for flu-like epidemics insufficiently explain multi-cycle seasonality. Meteorological factors alone, including associated behavior, do not predict seasonality, given substantial climate differences between countries that are subject to flu-like epidemics or COVID-19. Pollen is documented to be allergenic, plays a role in immuno-activation, and seems to create a bio-aerosol lowering the reproduction number of flu-like viruses. Therefore, we hypothesize that pollen may explain the seasonality of flu-like epidemics including COVID-19 in conjunction with meteorological variables.
We tested Pollen-Flu Seasonality Theory for 2016-2020 flu-like seasons, including COVID-19, in The Netherlands with its 17 million inhabitants. We combined changes in flu-like incidence per 100K/Dutch citizens (code: ILI) with pollen concentrations and meteorological data. Finally, a predictive model is tested using pollen and meteorological threshold values inversely correlated to flu-like incidence.
We found a highly significant inverse correlation of r(224)= -0.41 (p < 0.001) between pollen and changes in flu-like incidence corrected for incubation period. The correlation was stronger after taking into account incubation time. We found that our predictive model has the highest inverse correlation with changes in flu-like incidence of r(222) = -0.48 (p < 0.001) when average thresholds of 610 total pollen grains/m3, 120 allergenic pollen grains/m3, and a solar radiation of 510 J/cm2 are passed. The passing of at least the pollen thresholds precedes the beginning and end of flu-like seasons. Solar radiation is a co-inhibitor of flu-like incidence, temperature makes no difference, and higher relative humidity is even associated with increases in flu-like incidence.
We conclude that pollen is a predictor of the inverse seasonality of flu-like epidemics including COVID-19, and that solar radiation is a co-inhibitor. The observed seasonality of COVID-19 during Spring suggests that COVID-19 may revive in The Netherlands after week 33, preceded by the relative absence of pollen, and that it follows pollen-flu seasonality patterns.
1. Introduction
Current models for flu-like epidemics insufficiently explain multi-cycle seasonality. Meteorological factors alone do not fully explain the seasonality of flu-like epidemics (Tamerius et al., 2011) or COVID-19 (Yao et al, 2020). Pollen is documented to be allergenic (Klemens et al, 2007; Rosenwasser, 2011; Howarth, 2000), plays a role in immuno-activation (Brandelius et al, 2020), and there are some indications that pollen might be antiviral (Wachsman et al, 2000; Ghanem et al., 2015) and anti-influenza (Chen, 2016). Interestingly, “other allergic diseases” are absent as a co-morbid condition of COVID-19 according to Zhang et al. (2020), although more confirmation is needed before conclusions can be drawn. One more indication for a pollen effect might be from a controlled experiment indicating that flowering (and foliage) plants in hospital rooms lead to positive physiological responses and a reduced recovery time for post-surgery patients (Park & Mattson, 2008). On the other hand, it is reported that pollen might suppress certain immune parameters in non-allergic subjects and is found to be correlated to rhinovirus-positive cases (Gilles et al, 2020), but not to other flu-like virus positive cases (Nivel.nl, 2020).
Recently, we identified pollen bio-aerosol as a discrete seasonal factor in inhibiting flu-like epidemics for the period 2016 to 2019 in The Netherlands (Hoogeveen, 2020). In this epidemiological study, we found strong inverse correlations between allergenic pollen concentrations and hay fever on the one hand, and flu-like incidence on the other hand. The study was based on the persistent observation that pollen and flu season predictably alternate with each other in moderate climate zones, and the absence of sufficient meteorological explanations (Tamerius et al., 2011). We further observed that the passing of pollen threshold values of around 100 allergenic pollen grains/m3 reliably marks the onset and decline of moderate flu-like epidemic lifecycles, and thus might be used as a predictor. Such a concentration of allergenic pollen makes sense as in an overview of real-life studies, clinical pollen threshold values are observed between 1 and 400 grains/m3, whereby the first symptoms are typically observed in the range of 1 – 50 grains/m3 (De Weger et al, 2013), depending on country, period, and vegetation of study, and probably the susceptibility of subjects to allergens.
The seasonality of respiratory viral infections has already been recognized for thousands of years in temperate regions (Moriyama et al., 2020). More in detail, virologists observed that the cold, and flu-like epidemics (e.g., influenza and corona caused) “go away in May” in the Northern Hemisphere, while emerging in the Southern Hemisphere with its opposite seasonality, to re-emerge in the Northern Hemisphere during its next Autumn and Winter in a slightly mutated form. Furthermore, all new flu-like pandemics since 1889 typically emerged in the Northern Hemisphere at the tail-end of respective flu-seasons (Fox et al., 2017), whereby the current COVID-19 pandemic is clearly no exception. The emergence of COVID-19 and other pandemics at the tail-end of flu season makes sense. It takes time for a spontaneous new cross-over virus with a sufficiently high reproduction number (Ro) – for SARS-CoV-2 it is estimated to be initially around 3 (Liu et al., 2020) - to develop from patient 0 to a full-fledged pandemic during flu season in the Northern Hemisphere. Chances for the Northern Hemisphere with its larger populations are higher to be the initial breeding ground for a new flu-like pandemic than the Southern Hemisphere. Fox et al. showed further that most flu-like pandemics are multi-wave, whereby the initial wave at the tail-end of flu season is typically short-lived. This gives rise to the suspicion that COVID-19 is subject to such multi-wave seasonality as well (Kissler et al., 2020) and has a short wave at the tail end of the 2019/2020 flu-like season in the Northern Hemisphere in the temperate climate zone.
Numerous studies try to explain flu-like seasonality from meteorological factors such as sunlight including UV radiation (Schuit et al., 2020), temperature and humidity (Chong et al., 2020; Shaman et al, 2011). However, Postnikov (2016) concluded that ambient temperature is not a good predictor for influenza seasonality in The Netherlands, and inconsistent correlations are found for the relation between COVID-19 and temperature as well (Toseupu et al, 2020; Xie & Zhu, 2020; Ma et al 2020; Qi et al, 2020). Findings about the relation between humidity and influenza (Soebiyanto et al., 2014), and between humidity and COVID-19 (Ahmadi et al, 2020; Ma et al, 2020; Qi et al, 2020), are equally inconsistent. Although UV light is detrimental for the flu-like virus aerosol under laboratory conditions, associated with immuno-activation (Abhimanyu & Coussens, 2017; Tan & Ruegiger, 2020), and circadian rhythms regulating lung immunity (Nosal et al., 2020), the onset of flu season, in mid-August in The Netherlands, coincides with an annual peak in hot, sunny days and is still in the middle of the Summer season. According to Yao et al. (2020), neither high UV nor high temperature is a good predictor for the decrease of COVID-19 infections either. The contradictory findings related to COVID-19, understandably based on the analysis of a limited part of the year and disease cycle, might be partly due to sub-seasonal bias and unstandardized data collection methods. With sub-seasonal bias, we mean that if only a part of a season or cycle is analyzed, overly specialized conclusions can be drawn which cannot be generalized to the whole season or cycle.
Nevertheless, these meteorological variables are known factors in flowering, and pollen maturation and dispersion. Meteorological variables such as increased solar radiation and temperature – among others the absence of frost - are not only triggering flowering and pollen maturation, but also affect the pollen bio-aerosol formation: dry and warm conditions stimulate pollen to be airborne. Rain, to the opposite, makes pollen less airborne, and cools the bio-aerosol down. Very high humidity levels (RH 98%) are even detrimental for pollen (Guarnieri, 2006). The RH 98% effect on pollen, could thus provide an alternative explanation of why flu-like incidence in tropical countries is higher during rainy season, and reduced during the rest of the year.
We hypothesize that there is an inverse effect of pollen bio-aerosol on flu-like incidence including COVID-19 (see Figure 1), whereby pollen are known to be triggered and influenced by meteorological variables, which can then jointly explain the seasonality of flu-like incidence. The indirect explanation of the pollen effect is based on the fact that pollen bio-aerosol and UV light exposure lead to immuno-activation, and sometimes allergic symptoms, which might reduce the effectiveness of flu-like viruses. The direct pollen effect is explained by the spread of pollen bio-aerosol under sunny and dry conditions, which might absorb viral bio-aerosol, and might as well become an agent that leads to an increased exposure of flu-like viruses to virus-degrading solar radiation including the UV spectrum, and, possibly, to anti-viral properties of pollen.
To further understand the impact of pollen as an environmental factor influencing the life cycle of flu-like epidemics, the objective of this study is to determine the correlations of pollen and meteorological variables with (changes in) flu-like incidence, develop and test a discrete predictive model that combines pollen and meteorological co-inhibitors, and infer whether COVID-19 in the tail-end of the 2019/2020 flu-like season is able to defy flu-like seasonality or not. Our main hypothesis therefore is that pollen are the missing link, jointly explaining with certain meteorological variables, flu-like seasonality including COVID-19, and that a compound threshold based factor - combining detected flu-inhibitors - is a good unified predictor of such seasonality.
2. Methods
To study the relation between pollen and flu-like incidence in The Netherlands, we used the public datasets of Elkerliek Hospital (Elkerliek.nl) about the weekly allergenic, low-allergenic and total pollen concentrations in The Netherlands in grains/m3, whereby for 42 types of pollen particles the numbers are counted and averaged per day per 1 m3 of air. The common Burkard spore trap is used through which a controlled amount of air is ingested. The applied classification and analysis method conforms to the standard by EAACI (European Academy of Allergology and Clinical Immunology) and the EAN (European Allergy Network). Allergenic pollen includes nine types of particles that are classified as moderate (Corylus, Alnus, Rumex, Plantago and Cedrus Libani), strong (Betula and Artemisia), or very strong allergenic (Poaceae and Ambrosia). We included low-allergenic pollen concentrations in addition to the allergenic ones, as it should not matter for the bio-aerosol filter function (see Figure 1) whether pollen is allergenic or not. Low-allergenic pollen includes the other 33 particles that are classified as non-allergenic to low-allergenic (Cupressaceae, Ulmus, Populus, Fraxinus, Salix, Carpinus, Hippophae, Fagus, Quercus, Aesculus, Juglans, Acer, Platanus, Pinus, Ilex, Sambucus, Tilia, Ligustrum, Juncaceae, Cyperaceae, Ericaceae, Rosaceae, Asteraceae, Ranunculaceae, Apiaceae, Brassicaceae, Urtica, Chenopodiaceae, Fabaceae, Humulus, Filipendula, and Indet). Total pollen concentration is the sum of the average allergenic and low-allergenic pollen concentrations. One advantage of the total pollen metric is that there are hardly 0 values (only 3 out of 266), and we do not need to limit ourselves to just parts of the seasonal cycle which could import sub-seasonal bias in our research. We assume that long distance pollen transport is also accounted for, as foreign pollen will be counted as well, by the pollen measuring station that works all year round.
Further, we use the data from the Dutch State Institute for Public Health (RIVM.nl) gathered by Nivel (Nivel.nl) about weekly flu-like incidence (WHO code “ILI” - Influenza Like Illnesses) reports in primary medical care, per 100,000 citizens in The Netherlands. Primary medical care is the day-to-day, first line healthcare given by local health care practices to their registered clients as typical for The Netherlands with a population of currently 17.4 million. The reports relate to a positive RIVM laboratory test for ILI after a medical practitioner diagnosed ILI after a consult, whether that leads to hospitalization or not. The ILI metric is according to a standardized WHO method as ILI data are gathered and compared globally. ILI is defined by WHO as a combination of a measured fever of ≥ 38 °C, and cough, with an onset within the last 10 days. The flu-like incidence metric is a weekly average based on a representative group of 40 primary care units, and calculated using the number of influenza-like reports per primary care unit divided by the number of patients registered at that unit, averaged for all primary care units, and next extrapolated to the complete population. The datasets run from week 1 of 2016 till week 18 of 2020 (n = 226 data points) to include the recent COVID-19 pandemic in the tail-end of the 2019/2020 flu-like season. To underpin the relative importance of COVID-19: SARS-CoV-2 has been detected in the Netherlands since week 9 of 2020. According to figures of Nivel.nl (2020; see Figure 2), from week 13 on, SARS-CoV-2 is the outcome of the (vast) majority of positive tests for patients at primary care with flu-like complaints, whereby at week 18 even 100% of positive tests indicate SARS-CoV-2 (other tested viruses are five Influenza A and B subtypes, RSV, Rhinovirus and Enterovirus). A small bump in rhinovirus positive cases is seen after week 18 in 2020, so it does not influence our ILI statistics till week 18, at least not for 2020.
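The incidence calculation described above (reports divided by registered patients per unit, averaged over units, scaled to 100,000 citizens) can be sketched as follows. This is a minimal illustration, not Nivel's actual procedure: the function name and the example numbers are hypothetical, and any weighting or correction steps Nivel may apply are not modelled.

```python
def ili_incidence_per_100k(units):
    """Average the per-unit ILI rate and scale it to 100,000 citizens.

    `units` is a list of (ili_reports, registered_patients) tuples,
    one tuple per primary care unit for a single week.
    """
    rates = [reports / patients for reports, patients in units if patients > 0]
    if not rates:
        raise ValueError("no primary care units with registered patients")
    return 100_000 * sum(rates) / len(rates)


# Hypothetical week with three units (illustrative numbers, not Nivel data).
example_week = [(2, 4000), (1, 2500), (3, 5000)]
print(round(ili_incidence_per_100k(example_week), 1))
```

Note that averaging per-unit rates gives every unit equal weight regardless of its size, which is how the text describes the metric.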
Further, we included meteorological datasets from the Royal Dutch Meteorological Institute (KNMI.nl), including average relative humidity/day, average temperature/day and global solar radiation in J/cm2 per day as an indicator of UV radiation, from its centrally located De Bilt weather station. Next, we calculated the weekly averages for the same periods as used in the other datasets. De Bilt is traditionally chosen as it provides an approximation of modal meteorological parameters in The Netherlands, which is a small country. Furthermore, all major population centers in The Netherlands are within a radius of only 60 kilometers from De Bilt, which covers around 70% of the Dutch population. So, we assume in this study that the measurements of De Bilt are sufficiently representative for the meteorological conditions that the Dutch population experiences on average.
To test allergenic versus low-allergenic pollen assumptions, against hay fever and pre-covid-19 flu-like incidence, we make use of the hay fever index. The hay fever index is defined as turnover for hay fever medication as reported by all Dutch pharmacies to the Dutch Central Bureau of Statistics (CBS.nl) based on respective ATC codes (especially R01A/R01AC). We use a dataset from week 1 of 2016 till week 10 of 2019 (n=166 data points), as no further data has been made available. For the interpretation of findings, we assume for The Netherlands a prevalence of allergic rhinitis that is more or less similar to that in Western Europe being around 23% and frequently undiagnosed (Bauchau & Durham, 2004). Further, it can be noted that the prevalence of allergic diseases in general in the Netherlands is around 52% (Van de Ven et al, 2006).
Datasets are complete, except that there are three weekly pollen concentration measurements missing (1.3%) because of a malfunctioning monitoring station, namely week 26 of 2016, week 21 of 2017 and week 22 of 2019. These missing values appear to be missing completely at random. We imputed missing values to avoid bias and maintain power. We used a four weeks surrounding average to estimate the three missing data points to avoid breaking lines in visuals. We checked that these missing data have no material impact on the results by comparing these averages with the data of previous years for similar periods, and by observing whether removal from statistical tests has any effect on outcomes and conclusions.
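The imputation step can be illustrated with a short sketch. It assumes that "four weeks surrounding average" means the mean of the two weeks before and the two weeks after the missing week; the text does not spell out the window, so this reading, the function name and the example values are assumptions.

```python
def impute_surrounding_mean(series, window=2):
    """Fill None entries with the mean of up to `window` observed values on each side."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            neighbours = series[max(0, i - window):i] + series[i + 1:i + 1 + window]
            observed = [v for v in neighbours if v is not None]
            if observed:  # leave the gap if no neighbouring week was observed
                filled[i] = sum(observed) / len(observed)
    return filled


# Illustrative weekly pollen concentrations (grains/m3) with one missing week.
weekly_pollen = [120.0, 340.0, None, 510.0, 280.0]
print(impute_surrounding_mean(weekly_pollen))  # -> [120.0, 340.0, 312.5, 510.0, 280.0]
```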
Regarding the incidence of flu-like symptoms, we calculated the weekly change compared to the previous week (ΔILI=ILIt – ILIt-1) to get an indication of the flu-like epidemic life cycle progression, whereby a decline is interpreted as Ro<1 and an increase as Ro>1 (Ro is the reproduction number of flu-like viruses). Further, to cover, in one time-series metric, for changes in flu-like incidence as well as for an incubation period of up to two weeks, we calculated a three weeks moving average (3WMA) of changes in flu-like incidence, of which two weeks are forward looking:
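The display equation is not preserved in this copy of the text. Reconstructed from the verbal definition above (current week plus two forward-looking weeks, with t indexing weeks), it would read:

```latex
\Delta ILI_{3WMA}(t) \;=\; \frac{\Delta ILI_{t} + \Delta ILI_{t+1} + \Delta ILI_{t+2}}{3},
\qquad \Delta ILI_{t} = ILI_{t} - ILI_{t-1}
```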
Whenever we use the term incubation time, we also mean to include reporting delay (estimated to be around 4.5 days). We don’t assume delay effects for meteorological variables or pollen concentrations, so don’t calculate moving averages for other time series.
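As a worked illustration of the two derived series, the sketch below computes the weekly change and its forward-looking three weeks moving average. It assumes the window covers weeks t, t+1 and t+2, as the description suggests; the function names and the example incidence values are hypothetical.

```python
def weekly_change(ili):
    """Delta ILI_t = ILI_t - ILI_(t-1); the first week has no predecessor, so it is None."""
    return [None] + [ili[t] - ili[t - 1] for t in range(1, len(ili))]


def forward_3wma(delta):
    """Mean of the change over weeks t, t+1 and t+2; None where the window is incomplete."""
    out = []
    for t in range(len(delta)):
        window = delta[t:t + 3]
        if len(window) < 3 or any(v is None for v in window):
            out.append(None)
        else:
            out.append(sum(window) / 3)
    return out


# Illustrative weekly incidence per 100K citizens (not RIVM/Nivel data).
ili = [60, 50, 45, 30, 35, 40]
delta = weekly_change(ili)      # [None, -10, -5, -15, 5, 5]
print(forward_3wma(delta))
```

The first week and the last two weeks of a series have no complete window, so the moving-average series is shorter than the incidence series.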
Compared to our previous study (Hoogeveen, 2020) there is an overlap in datasets of less than 10%, given the extension in time, the addition of meteorological datasets and non-allergenic pollen, and the introduction of newly calculated variables such as total pollen concentration, ΔILI3WMA, the compound predictor and the log10 transformations on pollen, ILI and the hay fever index.
We formulated the following statistical null hypotheses for falsification:
H10: there are no inverse correlations for total pollen concentrations with flu-like incidence (corrected for incubation period).
H20: there are no inverse correlations between pollen and changes in flu-like incidence (ΔILI or corrected for incubation time: ΔILI3WMA).
H30: there is no predictive significance of a discrete model’s compound value, based on thresholds for pollen and meteorological co-inhibitors, related to changes in flu-like incidence (ΔILI3WMA).
To understand the role of meteorological variables, to check if - in our datasets - meteorological variables show their well-established effects on pollen as assumed, and to select co-inhibitors:
H40: meteorological variables – solar radiation, temperature and relative humidity – have no effect on pollen and/or flu-like incidence change (ΔILI3WMA).
Low-allergenic pollen are known to have sometimes a little allergenic effect. To understand how to interpret adding none-to-low-allergenic pollen to the total pollen metric, we want to verify their effects on the hay fever index:
H50: low-allergenic pollen has no effect on hay fever and (changes in) flu-like incidence.
Note that except H5, all hypotheses are related to potential causality: the temporal sequentiality (temporality) of the respective independent variables, and flu-like incidence corrected for incubation period. Whenever we refer to temporality, we mean to indicate that the datasets behave as if there is causality, in the understanding that statistics alone cannot prove causality in uncontrolled settings.
Statistical analyses
Variables are presented with their mean (M) and standard deviation (SD).
We calculated correlation coefficients to test the hypotheses to assess the strength and direction of relationships. We use the full datasets, to avoid sub-seasonal bias, and by extending the number of years, the distortions by incidental and uncontrolled events are supposed to be minimized.
Next, linear regression (F-test) on identified inhibitors and interactions is used descriptively to determine whether the relation can (statistically) be described as linear, and to determine the equation using estimates and intercept values, and produce probability, significance level, F-value, and the Multiple R squared correlation to understand the predictive power of the respective inhibitor. Standard deviations and errors, and degrees of freedom (DF) are used as input for calculating the 95% probability interval. We report in the text the outcome of statistical tests in APA style, adapted to journal requirements. For relations that appear non-linear - logarithmic or exponential – we use the log10 function to transform the data if that makes the relation appear linear, before re-applying linear regression. We use the log10 transformed datasets also for the calculation of correlation coefficients to correct for skewness.
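The correlation step on log10 transformed data can be sketched as follows. How zero pollen counts were handled before the transformation is not stated, so dropping them is an assumption made here (the lower degrees of freedom reported for the log10 regression would be consistent with it). The function names and data are illustrative, and the reported degrees of freedom follow the r(df) convention with df = n - 2.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log10_correlation(pollen, response):
    """Correlate log10(pollen) with a response series; returns (r, degrees of freedom).

    Assumption: weeks with zero pollen or a missing response are dropped,
    because log10(0) is undefined.
    """
    pairs = [(math.log10(p), r) for p, r in zip(pollen, response)
             if p > 0 and r is not None]
    xs, ys = zip(*pairs)
    return pearson_r(xs, ys), len(pairs) - 2


# Illustrative data: pollen in grains/m3 versus change in flu-like incidence.
pollen = [10, 100, 1000, 0, 10000]
change = [4.0, 2.0, 0.0, 1.0, -2.0]
r, df = log10_correlation(pollen, change)
print(f"r({df}) = {r:.2f}")
```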
Finally, we created a simple, discrete model resulting in one compound value using selected flu-like inhibitors, to determine the optimal average threshold values for these inhibitors which have the highest joint correlation with changes in flu-like incidence (ΔILI3WMA). We apply linear regression (F-test) to understand the predictive power of the compound value, and determine the linear equation when significant. By constructing one compound independent variable, we cover for collinearity or interaction effects between joined co-inhibitors. In our analysis, we base the compound value on three selected thresholds. For example, when one threshold value is passed this leads to a compound value = 1, and when all three threshold values are passed this leads to a compound value = 3. Therefore, the compound value takes integer values in the range [0, 3].
The compound value equation can be expressed as follows, whereby iv = the respective independent variable that acts as inhibitor of flu-like incidence and k relates to the respective calculated threshold value. For each respective threshold passed (iv > k), +1 is added to the compound value (CV):
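The display equation is not preserved in this copy of the text. A reconstruction from the definition above, using an indicator function that equals 1 when a threshold is passed and 0 otherwise, and with n the number of selected co-inhibitors (n = 3 in this study), would be:

```latex
CV \;=\; \sum_{i=1}^{n} \mathbf{1}\left[\, iv_i > k_i \,\right],
\qquad CV \in \{0, 1, \dots, n\}
```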
For three selected co-inhibitors, this takes in Excel the form of CV = IF(IV1>K1;1;0) + IF(IV2>K2;1;0) + IF(IV3>K3;1;0), whereby we use a threshold value for solar radiation (kr), and both pollen threshold values for allergenic (kap) and total pollen (kp) for K1, K2 and K3 respectively, as will be shown in Section 3.
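For readers who prefer code over a spreadsheet formula, an equivalent of the compound value can be sketched as below. The default thresholds are the values reported in the Results (510 J/cm2, 120 and 610 grains/m3); the function and parameter names are illustrative and not part of the original analysis.

```python
def compound_value(solar_radiation, allergenic_pollen, total_pollen,
                   k_r=510.0, k_ap=120.0, k_p=610.0):
    """Count how many of the three inhibitor thresholds are passed (0 to 3)."""
    return (int(solar_radiation > k_r)
            + int(allergenic_pollen > k_ap)
            + int(total_pollen > k_p))


# Illustrative weeks: (solar radiation J/cm2, allergenic grains/m3, total grains/m3).
print(compound_value(300, 50, 200))     # -> 0, no threshold passed
print(compound_value(900, 150, 400))    # -> 2, solar radiation and allergenic pollen
print(compound_value(1500, 800, 2000))  # -> 3, all thresholds passed
```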
It is outside of the scope of this research to verify the underlying datasets of Elkerliek Ziekenhuis, RIVM/Nivel, CBS, and KNMI by examining the validity and reliability of data collection methods as these institutes have well-established and internationally standardized protocols for data collection and verification.
All regression analyses are done using the statistical package R version 3.5.
3. Results
The M (and SD) for total pollen concentrations are M= 732 grains/m3 (SD = 1368), for allergenic pollen M = 349 grains/m3 (SD = 987 grains/m3), for low-allergenic pollen M = 383 grains/m3 (SD = 626 grains/m3) and for flu-like incidence 47 incidence/100K citizens per week (SD = 40.2 incidence/100K citizens). For the log10 transformed data the M (and SD) are as follows: total pollen concentrations M = 2.17 (SD = 0.98), allergenic pollen M = 1.84 (SD = 0.86), low-allergenic pollen M = 1.85 (SD = 1.03), and flu-like incidence M = 1.54 (SD = 0.35). For the hay fever index M = 101 (SD = 115.7), and for its log10 transformed data M = 1.81 (SD = 0.39). For the correlation coefficients below, we use the log10 transformed data for respective variables, as it corrects for skewness.
When further inspecting the datasets regarding pollen concentrations and flu-like incidence reported by primary medical care in The Netherlands, it is clear that there are continuous pollen bursts (Figure 3), whereby only a few of these pollen bursts are classified as more allergenic (Figure 6). These bursts of pollen, allergenic or low-allergenic, typically coincide with and precede a decline of flu-like incidence.
The correlation for total pollen and flu-like incidence is highly significant when taking into account incubation time: r(222) = -0.40, p < 0.001. We can thus reject the null hypothesis H10 in favor of the alternative hypothesis that there is a negative correlation between total pollen and flu-like incidence, including the first cycle of the COVID-19 pandemic, when taking into account incubation time.
Further, we can reject H50 in favor of our assumption that it makes sense to include low-allergenic pollen concentrations in our study as well: low-allergenic pollen are inversely correlated to flu-like incidence (r(221) = -0.37, p < .00001), especially when corrected for 2 weeks incubation time (r(219) = -0.53, p < .00001).
That the correlations become stronger when taking into account incubation time implies temporality. Furthermore, we can also observe from Figure 3 that flu-like incidence starts to decline after the first pollen bursts, and that flu-like incidence starts to increase sharply after pollen concentrations become very low or close to zero. This is a qualitative indication of temporality. Further, we can notice that the first COVID-19 cycle behaves according to pollen-flu seasonality, or at least does not defy it.
When testing the impact on ΔILI, the weekly changes in medical flu-like incidence (M = -0.25 per 100K/citizens, SD = 15.4 per 100K/citizens), the extended dataset till 2020, including COVID-19, shows a highly significant inverse correlation with total pollen (r(222) = -0.26, p = 0.000089). Therefore, we can falsify the null-hypothesis (H20) that there are no inverse correlations between the weekly pollen concentrations and weekly changes in flu-like incidence (ΔILI), including the period covering the first cycle of the COVID-19 pandemic. This inverse correlation thus provides further support for the alternative hypothesis that the presence of an elevated level of pollen has an inhibiting effect on flu-like incidence, and starts to immediately influence the direction and course of the epidemic life cycle. Also during the COVID-19 dominated period of the last 9 weeks, it appears that flu-like incidence behaves according to the expected pollen-flu seasonality. This strengthens the idea that COVID-19 itself might be seasonal as well, like all other flu-like pandemics since the end of the 19th century. Also when looking at other data from RIVM.nl about COVID-19 hospitalizations, we cannot conclude that COVID-19 breaks through the seasonal barrier. For example, new COVID-19 hospitalizations went down from a peak of 611 on March 27 to just 33 on May 3, the last day of week 18.
Using the three weeks moving average (ΔILI3WMA) of changes in flu-like incidence (M = -0.26 per 100K/citizens, SD = 8.9 per 100K/citizens), the correlation coefficients become stronger and are again highly significant for total pollen concentration (r(223) = -0.41, p < 0.00001). We can thus also reject the null-hypothesis (H20) that there is no inverse relation between pollen and ΔILI3WMA, the 3 weeks moving average of changes in flu-like incidence including incubation time. As this correlation (see also Figure 4) is stronger than if not corrected for incubation period, it is a further indication of temporality, and as it is stronger now with the 2019/2020 flu-like season included, it appears to support the idea that COVID-19 is subject to pollen induced flu-seasonality as well.
Linear regression analysis shows that there is a highly significant inhibitory effect of pollen on flu-like incidence change (ΔILI3WMA) of F(1, 222) = 37.1, p < 0.001 (see Table 1, line 1), as a further basis for using total pollen concentration as a predictor. A Log10 transformation of pollen to compensate for visual non-linearity leads to a similar outcome: F(1, 219) = 43.87, p < 0.001 (see Table 1, line 4). At least visually, it gives a good fit (see Figure 4).
In line with the correlation between pollen and flu-like incidence, the correlation between total pollen concentration and hay fever is stronger (r(162)= 0.76, p < 0.00001) than those for allergenic and low-allergenic pollen individually. This confirms that we can best use total pollen concentration as predictor. Univariate regression analyses show that there is a highly significant positive effect of total pollen on hay fever incidence, which in turn has a highly significant inhibitory effect on flu-like incidence (see Table 2).
Low-allergenic pollen has as well a highly significant effect on hay fever: r(160)= 0.77, p < 0.00001. We can thus reject the null-hypothesis H50 in favor of the alternative hypothesis that also low-allergenic pollen has a positive effect on hay fever. This might imply that pollen classified as none-to-low-allergenic might still be responsible for certain allergic effects, and not just the more allergenic pollen. Therefore, trying to use low-allergenic pollen to discriminate effects outside the allergenic path regarding the immune system might be challenging.
The nature of the relation between hay fever and flu-like incidence might be statistically described as linear, but could be better described as logarithmic (Figure 5). In the context of this study, we interpret it as a further indication that it could be described as a thresholds based switching pattern as well, consistent with the thresholds-based approach we take in our compound value as calculated below.
The expected effects of relative humidity (r(223) = -0.86, p < 0.0001), temperature (r(223) = 0.41, p < 0.0001) and solar radiation (r(223) = 0.67, p < 0.0001) on total pollen are found. So, there is indeed more pollen with sunny, warmer and dry weather. We can reject the null-hypothesis that meteorological variables (H40) have no effect on pollen, for solar radiation (M = 1047 J/cm2, SD = 709 J/cm2), temperature (M = 10.8 °C, SD = 5.8 °C), and relative humidity (M = 79%, SD = 8.3%), whereby relative humidity is reducing the amount of aerosol pollen.
Counter to findings in other studies, relative humidity is positively associated with changes in flu-like incidence (ΔILI3WMA) in The Netherlands (r(224) = 0.34, p < 0.00001). The Dutch flu season is cold and humid, and on rainy days the effects of pollen and solar radiation are reduced. Although temperature strongly correlates with flu-like incidence (r(226) = -0.82, p < 0.0001), it has a negligible effect on ΔILI, the weekly changes in flu-like incidence (r(224) = -0.02, n.s.), also when corrected for incubation time. Therefore, it seems unlikely that temperature has a direct effect on aerosol flu-like viruses and the life cycle of a flu-like epidemic. In line with this, temperature is also not a good marker for the onset or the end of flu season as the end of flu season (Ro<1) can coincide with an average temperature of close to 0 °C and the start of flu season (Ro>1) can coincide with temperatures as high as 17 °C in The Netherlands.
Of the meteorological variables, only for solar radiation there is a highly significant inverse correlation with changes in flu-like incidence (ΔILI3WMA): (r(224) = -0.25, p = 0.000156).
Thus, for solar radiation and relative humidity, the null-hypothesis (H40) that meteorological variables have no effect on the flu-like epidemic lifecycle can also be rejected. However, of these two, only solar radiation is a flu-like inhibitor, in line with its positive effect on pollen concentration and its association with immuno-activation and the UV effect on viruses.
A univariate linear regression also shows a highly significant negative relation between solar radiation and flu-like incidence change (ΔILI3WMA): F(1, 222) = 14.43, p < 0.001 (see Table 1, line 2). As the correlation is weak (Multiple R-squared = .06), we interpret solar radiation as a co-inhibitor in relation to pollen whose stand-alone effect is too weak to explain flu-like seasonality.
Taking into account all these findings, we developed a discrete, compound model in which we take the changes in flu-like incidence (ΔILI3WMA), a threshold value for solar radiation (kr), and both pollen threshold values for allergenic (kap) and total pollen (kp). We found that the compound model (M = 1.4 thresholds passed, SD = 1.1 thresholds passed) has the highest inverse correlation (r(222) = -0.48, p < 0.001) for the following threshold values: kr: 510 J/cm2, kap: 120 allergenic pollen grains/m3, and kp: 610 total pollen grains/m3. In line with the previous outcomes, inclusion of relative humidity, low-allergenic pollen or temperature did not improve the correlation strength of this model. As these variables also did not show significant interaction effects with pollen, such interactions are not meaningful to consider for the model either.
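The text does not describe how the optimal thresholds were searched for. One plausible approach, shown here purely as an assumption, is an exhaustive grid search that keeps the threshold combination whose compound value has the most negative correlation with ΔILI3WMA. All names, candidate grids and data below are synthetic and illustrative.

```python
import itertools
import math


def pearson_r(x, y):
    """Pearson correlation; returns None when either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def best_thresholds(solar, allergenic, total, delta_ili_3wma,
                    grid_r, grid_ap, grid_p):
    """Return (r, k_r, k_ap, k_p) for the most negative correlation on the grid."""
    best = None
    for k_r, k_ap, k_p in itertools.product(grid_r, grid_ap, grid_p):
        # Compound value per week: number of thresholds passed (0 to 3).
        cv = [int(s > k_r) + int(a > k_ap) + int(p > k_p)
              for s, a, p in zip(solar, allergenic, total)]
        r = pearson_r(cv, delta_ili_3wma)
        if r is not None and (best is None or r < best[0]):
            best = (r, k_r, k_ap, k_p)
    return best


# Synthetic eight-week example (not the study data).
solar = [200, 300, 600, 900, 1200, 800, 400, 250]
allergenic = [0, 10, 50, 200, 400, 150, 20, 5]
total = [20, 80, 300, 900, 1500, 700, 100, 30]
delta = [8, 6, 1, -5, -9, -4, 3, 7]
print(best_thresholds(solar, allergenic, total, delta,
                      grid_r=[250, 500, 1000], grid_ap=[30, 120], grid_p=[200, 600]))
```

Because the thresholds are tuned on the same data that the correlation is then reported for, a search of this kind tends to overstate predictive strength unless it is validated on held-out seasons.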
In each of the observed years, the now (re)defined pollen thresholds are passed in week 10 (± 5 weeks), depending on the meteorological conditions controlling the pollen calendar, which also coincides with reaching flu-like peaks, and again in week 33 (± 2 weeks), marking the start of the new flu-like season.
There is a highly significant inverse relation between our compound, threshold-based predictor value and flu-like incidence change (ΔILI3WMA): F(1, 222) = 65.59, p < 0.001, with a Multiple R-squared of 0.2281 (see Table 1, line 3). This confirms the usefulness of a discrete model based on pollen and solar radiation thresholds as a predictor of switches in flu-like seasonality, whereby the effect of pollen is stronger than that of solar radiation. As a consequence, we can reject the null-hypothesis (H30) that this compound pollen/solar radiation value has no predictive significance for flu-like seasonality.
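As a consistency check on the reported statistics: in a linear regression with a single predictor, the F statistic, the coefficient of determination and the Pearson correlation are tied together by R² = r² and F(1, df) = R² · df / (1 − R²). The sketch below applies these standard identities to the values reported above. The helper names are hypothetical, and small deviations are expected because the inputs are rounded.

```python
import math


def f_from_r_squared(r_squared, df_resid):
    """F statistic of a one-predictor regression, given R-squared and residual df."""
    return r_squared * df_resid / (1.0 - r_squared)


def r_squared_from_f(f_stat, df_resid):
    """Inverse of the identity above: R-squared from F(1, df_resid)."""
    return f_stat / (f_stat + df_resid)


# Compound threshold model: R-squared = 0.2281 with 222 residual degrees of freedom.
f_compound = f_from_r_squared(0.2281, 222)   # roughly 65.6, reported: 65.59
r_compound = -math.sqrt(0.2281)              # roughly -0.48, sign taken from the text

# Solar radiation alone: F(1, 222) = 14.43.
r2_solar = r_squared_from_f(14.43, 222)      # roughly 0.06, as reported
```

The reported F values, R-squared values and the correlation of -0.48 are thus mutually consistent to within rounding.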
4. Discussion
We first discuss the possible implications of the results for our theoretic model and alternative explanations. Next, we discuss our methods: alternatives or ways to improve them.
Theoretic model
We found highly significant inverse relations between pollen and solar radiation on the one hand, and (changes in) flu-like incidence on the other hand: a higher pollen count or an increase in solar radiation is related to a decline in flu-like incidence. The inverse correlation with pollen becomes stronger when including the 2019/2020 period, which is increasingly COVID-19 dominated during the last 9 weeks. This might be cautiously interpreted as an early indication that COVID-19 is also subject to pollen-flu seasonality, like all previous pandemics since the end of the 19th century. However, social distancing is likely to have contributed to flattening both the flu-like epidemic and COVID-19 pandemic curves at the tail-end of the 2019/2020 flu-like season: the Dutch government has applied hygienic measures since March 9, 2020, and a mild form of lockdown with social distancing from March 11 on. Such behavioral policies will need to be included in the theoretic model, in addition to pollen and meteorological variables, to understand the relative importance of social distancing versus seasonality.
As non-to-low-allergenic pollen, which we call low-allergenic pollen, also has a positive effect on the hay fever index, we could not use it as an independent variable to discriminate between the immuno-activation explanation and the (“anti-viral”) bio-aerosol filter explanation in our model (see Figure 1).
The highly significant inverse correlation between hay fever and flu-like incidence can be interpreted in a number of ways, which are not mutually exclusive. A) Allergic rhinitis symptoms might make it more difficult for flu-like viruses to find their way to the lung cells that are vulnerable to them, such as ACE-2 receptor positive cells in the case of SARS-CoV-2 (Wan et al., 2020). B) The anti-histamine hay fever medication possibly suppresses flu-like symptoms as well. Further, given the flower experiment mentioned in the introduction, it might make sense to look beyond allergies, and interpret immuno-activation in the broadest sense to include any auto-repair function of the human body.
The only meteorological variable that has a co-inhibitive effect on changes in flu-like incidence, solar radiation, has a stimulating effect on aerosol pollen formation, and is responsible for melatonin-induced immuno-activation. Relative humidity reduces pollen aerosol formation, and correlates positively with flu-like incidence. We did not specifically look at precipitation, but it might make sense to explicitly consider this independent variable as it reduces pollen dissemination.
In our study we showed that temperature, except for influencing pollen, has no predictive value for changes in flu-like incidence, and therefore its inverse correlation with flu-like incidence might be interpreted in a number of ways: a) as spurious: the common causal factor is solar radiation, or b) as a stressor that has immediate effects on the functioning of the immune system of already infected persons. If we talk about the influence of meteorological variables, we assume associated behavioral aspects being covered, which are sometimes summarized as seasonal behavior, but this independent variable might have a cultural dimension that needs to be understood better.
We showed that a compound value, based on threshold values for pollen and solar radiation, results in a stronger correlation with the flu-like lifecycle than the individual inhibitors. This model could form an empirical basis for understanding flu-like seasonality, its Ro and reliably predicting the start and end of each flu-like cycle. As behavior, in the form of hygiene and social distancing, is also widely seen as an inhibitor, it might be of value to include this factor in our compound value as well, which probably leads to an even stronger predictor for the evolution of the reproduction number Ro of flu-like epidemics, although this might be beyond explaining the seasonality effect itself.
As long as, for COVID-19, the level of herd immunity (Fine et al., 2011) is still below required thresholds for ending pandemics (Plans-Rubio, 2012), it might make sense to include indications of herd immunity levels in the theoretic model as well.
Methods
In general, statistical research cannot prove causal relationships in uncontrolled environments, even if datasets seem to behave as if there is causality. Such statistics, however, can provide indications and identify reliable predictors, help filter out bad ideas, and be the inspiration for testable hypotheses that can be verified in laboratory and other fully controlled experiments. Regarding statistical methods, other approaches for time series analysis and hypothesis testing could be applied, such as non-linear tests instead of our log10 transformations on logarithmic data or our discrete switching model. In the end, however, replication studies in other countries and controlled experiments are the best way to confront and verify our observations and hypotheses.
Regarding meteorological data, it might be useful to examine whether including more weather stations can help to better approximate the weather conditions the Dutch population is experiencing on average. It might be useful to distinguish patterns per province, and it might be of importance for understanding wind as the main vector for dispersal of pollen in The Netherlands with its maritime and temperate climate. It might also be good to look for the effects of climate change on pollen maturation (Frei & Gassner, 2008), as the climate in The Netherlands is affected.
Regarding the pollen dataset, it would be of interest if particle counts included more pollen types than currently covered, to get a further understanding of the magnitude and impact of airborne pollen material, classified as allergenic or not. Such an expansion of pollen counting methodology would probably require an update of the EAN protocols.
Regarding the hay fever index, there are probably lag effects, but the data collection method for this finance-based dataset did not include an assessment of validity and reliability, and thus there is no estimate of such a lag effect. It might be good to include alternative datasets, such as search engine based trend analysis, to be able to generate a complete dataset for the whole period of study, which can be validated separately.
The ILI dataset is now based on a sample of 40 representative local primary care units. It might be an improvement if the sample were based on patients visiting any primary care unit in The Netherlands to reduce the need for extrapolations, which would likely improve the reliability of the dataset. Further, if it is confirmed that rhinovirus positive cases are positively correlated to pollen concentrations, it might make sense to remove these from the ILI metrics, as it seems that this effect cannot be generalized to other ILI viruses. The use of ΔILI3WMA, which includes incubation time, seems to be more elegant than multiple tests with ΔILI. At the same time, it could be argued that this metric should rather be a 2-week forward-looking moving average, as it is unlikely that any effects can be noticed in the first week, given the average reporting delays and incubation period.
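The difference between a 3-week and a 2-week forward-looking moving average can be sketched as follows. This assumes that ΔILI3WMA is the mean of the week-on-week ILI changes over a window that starts at the current change and looks forward; the exact window alignment used in the study is defined in the methods and is not reproduced here. The function names are hypothetical and the input numbers are invented for illustration.

```python
def weekly_changes(ili):
    """Week-on-week differences of an ILI incidence series."""
    return [later - earlier for earlier, later in zip(ili, ili[1:])]


def forward_moving_average(values, window):
    """Mean of values[t : t + window] for every t where the full window fits."""
    return [
        sum(values[t:t + window]) / window
        for t in range(len(values) - window + 1)
    ]


# Invented ILI incidences per 100K for five consecutive weeks.
ili = [10, 12, 15, 14, 10]
delta_ili = weekly_changes(ili)                       # [2, 3, -1, -4]
delta_ili_3wma = forward_moving_average(delta_ili, 3)
delta_ili_2wma = forward_moving_average(delta_ili, 2)  # [2.5, 1.0, -2.5]
```

A shorter window keeps more data points and reacts faster to a turning point, at the cost of less smoothing; dropping the first week from the window, as suggested above, would be a further variation on the same helper.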
We could more explicitly include behavioral variables (Gozzi et al., 2020), such as social distancing, lockdowns and hygiene, in the current method, for example by rating lockdown regimes on a Likert-type scale [1, 5], from no lockdown (1) to a complete lockdown (5). Although seasonal behavior might be implicitly covered by the meteorological variables, it might still make sense to model it more explicitly, as there might be cultural patterns – such as holidays or seasonal celebrations – that need to be checked for.
Although pollution is not seen as an inhibitor of flu-like incidence (Coccia, 2020), it still might interact with pollen. A more complete theoretic model, controlling for (interactions with) pollution, can give more insight into how to interpret the findings of this or similar studies.
It will require further research to test the findings, threshold values and predictive model for flu-like seasonality in other countries with different climates.
5. Conclusion
We conclude that pollen and solar radiation both have highly significant inverse correlations with changes in flu-like incidence. The inverse correlation of pollen with (changes in) flu-like incidence becomes stronger when incubation time is included. A compound variable – based on the thresholds for total pollen, allergenic pollen and solar radiation – shows the strongest correlation with changes in flu-like incidence, and appears to be useful as predictor for switches in flu-like seasonality.
COVID-19 dominates the tail-end of the flu-like season in The Netherlands, and appears to be subject to flu-like seasonality as well, like other pandemics before it. This gives rise to the hypothesis that COVID-19 might be multicycle as well, and might thus resurface (Ro>1) from week 33 on, together with other flu-like viruses, when pollen season is over in the Northern Hemisphere, and gain in force during Autumn and Winter when solar radiation is also reduced.
Controlled experiments are needed to confirm the interaction between pollen and viral bio-aerosol, and whether immuno-activation by pollen is indeed a causal factor in reducing the spread of flu-like viruses.
Data Availability
All datasets are based on public datasets in The Netherlands to which we have referred in the paper, and our datasets are available for review.
Acknowledgements
Thanks to Sowjanya Putrevu, data scientist at Icecat, for her voluntary support with executing statistical tests.