Abstract
Objectives We aimed to predict the number of hand, foot, and mouth disease (HFMD) infections before and during the COVID-19 pandemic using Internet search data.
Methods We obtained actual HFMD cases from the National Institute of Infectious Diseases and Internet search data using Google Trends between 2004 and 2021 in Japan. We calculated the Pearson correlation coefficients between actual HFMD cases and the search topic “HFMD” from 2004 to 2021. We conducted a cross-correlation analysis between the actual HFMD cases and 43 HFMD-related search terms before and during the pandemic. We identified the most significant predictors of HFMD infection using stepwise multiple linear regression.
Results We found that actual HFMD cases and Internet search volume peaked around July in most years, except for 2020 and 2021. The search topic “HFMD” presented a strong correlation with actual HFMD cases, but the correlation was weaker in 2004, 2008, and 2020. Stepwise multiple linear regression showed that the search terms “infect,” “daycare,” “vomit,” “HFMD,” “eczema,” “pain,” and “high fever” were the most significant predictors before the pandemic, while “infect,” “enterovirus,” “herpangina,” “kindergarten,” “myocarditis,” “HFMD,” “contact infection,” “blister,” “high fever,” “dermatology,” and “plantar” were the most significant predictors during the pandemic.
Conclusions The predictors for HFMD infections before and during the COVID-19 pandemic were different. The awareness of HFMD infection in Japan may have improved during the COVID-19 pandemic. Continuous monitoring is important to promote public health and prevent resurgence. Public interest reflected in information-seeking behavior can be helpful for public health surveillance.
Introduction
Hand, foot, and mouth disease (HFMD) is an infectious disease that results in a blistering rash on the mouth, hands, and feet. Most infected individuals recover from HFMD within a few days. Various comorbidities, including myocarditis, neurogenic pulmonary edema, acute flaccid paralysis, and central nervous system complications, such as meningitis, cerebellar ataxia, and encephalitis, can also occur (Huang et al., 2018; Wang et al., 2017). HFMD has been reported in many countries, particularly in the Asia-Pacific region (Puenpa et al., 2019). Corresponding infectious consequences have been reported over the past decade, especially in Japan (Sumi et al., 2017), China (Zhuang et al., 2015; Zheng et al., 2017), and Singapore (Chew et al., 2015). During the summer of 2011, Japan had the largest epidemic of HFMD on record, with 347,362 cases reported (Japan IDWR, 2009). Coxsackievirus A6 (CV-A6) infection was responsible for most cases, with co-circulation of coxsackievirus A16 (CV-A16) and enterovirus A71 (EV-A71) (Fujimoto et al., 2012). EV-A71 has been sporadically detected from October 2014 onward. It became the predominant serotype in 2018, with approximately 70,000 reported cases, following an increased spread from the end of 2017 (Kabele et al., 2022). Since June 2019, a severe outbreak of HFMD has occurred in multiple regions of Japan, attracting public attention again (IDWR, 2019). As enteroviruses can spread rapidly by droplet and fomite transmission among children in daycare and kindergartens, predicting HFMD outbreaks is vital to public health (Sun et al., 2018).
Rapid recognition and reporting of HFMD infection is essential, and several studies have constructed models for predicting HFMD infection (Rui et al., 2021; Yu et al., 2021; Zhang et al., 2021; Gao et al., 2021). Rui et al. explored epidemiological characteristics and calculated the early warning signals of HFMD using a logistic differential equation (LDE) model in seven regions of China (Rui et al., 2021). Yu et al. forecasted the number of HFMD cases with wavelet-based hybrid models in Zhengzhou, China (Yu et al., 2021). Zhang et al. proposed a landscape dynamic network marker (L-DNM) to detect pre-outbreak signals of HFMD in Tokyo, Hokkaido, and Osaka, Japan (Zhang et al., 2021). Gao et al. used monthly HFMD infection cases and meteorological data to construct a weather-based early warning model with a generalized additive model across China (Gao et al., 2021). The above studies used a range of data, including monthly or weekly HFMD infectious cases (Rui et al., 2021; Yu et al., 2021; Gao et al., 2021), dynamic information from city networks, horizontal high-dimensional data, records of clinic visits (Zhang et al., 2021), and meteorological data (Gao et al., 2021). However, no studies have used Internet search data to predict HFMD infection. Traditional surveillance and reporting systems lag an outbreak by one to two weeks because of the reporting and verification process. In Japan, the National Institute of Infectious Diseases (NIID) has monitored the outbreak of various infectious diseases and issued weekly reports since 1999. However, this traditional type of surveillance uses data from several weeks prior (IDWR Surveillance Data Table 2022 week, 2022). In contrast, Internet search data show the information that the public is searching for in a more real-time manner, which may be valuable for infection surveillance.
The science of distribution and determinants of information in an electronic medium, specifically the Internet, or in a population, to inform public health and policy is defined as “infodemiology” (Eysenbach, 2009). Google Trends is frequently used in infodemiology research to gauge public interest (Mavragani et al., 2018). Google Trends reflects public information-seeking behavior and allows users to analyze Internet search data for specific search terms in any country or region over a selected period (Nuti et al., 2014; Mavragani and Ochoa, 2019). Studies have shown that online query trends correlate with real-life epidemiologic phenomena such as the flu (Pervaiz et al., 2012), sinusitis (Sharma et al., 2020), lifestyle-related disease (Memon et al., 2020), asthma (Mavragani et al., 2018), and pruritus (Tizek et al., 2019). Researchers have also investigated public interest and information-seeking behavior in chronic obstructive pulmonary disease (COPD) (Boehm et al., 2019), cancer (Schootman et al., 2015; Phillips et al., 2018), bariatric surgery (Linkov et al., 2014), kidney stone surgery (Dreher et al., 2018), and suicide (Taira et al., 2021). During the COVID-19 pandemic, similar studies using Internet search data were conducted to predict COVID-19 infectious cases (Husnayain et al., 2021; Higgins et al., 2020), explore public attitudes toward vaccination (Pullan and Dey, 2021; Diaz et al., 2021; An et al., 2021), identify symptoms caused by pandemics (Han et al., 2022; Kardeş et al., 2022; Knipe et al., 2021), and assess affected medical services (Cohen et al., 2021; Akpan et al., 2021; Adelhoefer et al., 2021). The above studies indicate that Google Trends could assist in gaining a better understanding and analysis of health information-seeking behavior. Information from Google Trends could potentially be used to supplement the current infection reports that have a lag time.
This study aimed to predict HFMD infection using Internet search data in Japan before and during the COVID-19 pandemic.
Methods
Data
We obtained actual HFMD cases from the weekly reports issued by the NIID, which include new infectious cases and sentinel cases by prefecture and have been updated from 1999 to the present (IDWR Surveillance Data Table 2022 week, 2022). For Internet search data, we downloaded the relative search volume (RSV) of the “HFMD” search topic using Google Trends from January 1, 2004, to December 31, 2021. The normalized RSV data represent the search interest relative to the highest point for a given region and time. Normalized RSV ranges from 0 to 100, where 0 means that there were insufficient data for a term and 100 marks peak popularity. For the long-term analysis, we selected the search topic “HFMD” rather than the search term “HFMD” to capture more comprehensive search information, and we limited the period to 2004 to 2021 because Google Trends provides data only from 2004 onward. The weekly RSV of the search topic “HFMD” from 2004 to 2021 was gathered for further analysis. The search term “HFMD” was used separately as one of the HFMD-related search terms described next. To identify significant predictors of HFMD infection, we selected 43 search terms, including “HFMD,” that reflected HFMD-related symptoms and risk factors. We downloaded the weekly RSV of these 43 search terms in two periods: before and during the pandemic. As Google Trends can provide data for up to five years on a weekly basis, we selected the five years before the outbreak (the first case of COVID-19 in Japan was confirmed on January 16, 2020) as period 1 (from 2015 to 2019) and the two years after the outbreak as period 2 (2020 and 2021). We set the geographic location to Japan and the category to health to limit irrelevant results. The word “child” has three common expressions in Japanese, so we divided it into three search terms: “child (kanji),” “child (kana),” and “child (mixed).”
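The 0 to 100 scaling described above can be illustrated with a minimal Python sketch. It assumes a simple "divide by the peak" rule; Google's exact normalization procedure is not described in this paper, and the function name `normalize_rsv` and the input values are hypothetical.

```python
def normalize_rsv(raw_interest):
    """Scale a series so that its highest point becomes 100.

    Simplifying assumption: only the peak-relative scaling is modeled here,
    not any additional adjustment that Google Trends may apply.
    """
    peak = max(raw_interest)
    if peak <= 0:
        # No measurable interest anywhere in the window.
        return [0 for _ in raw_interest]
    return [round(100 * value / peak) for value in raw_interest]


# Hypothetical raw weekly search interest (not study data).
weekly_interest = [2, 6, 20, 40, 10]
print(normalize_rsv(weekly_interest))
```

Because the scaling is relative to the peak of the selected window, RSV values downloaded for different periods or regions are not directly comparable in absolute terms.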
Statistical Analysis
SPSS for Mac (version 26.0, IBM Corp.) was used for statistical analysis. First, we calculated the Pearson correlation coefficient between the actual HFMD cases and the RSV of the search topic “HFMD” for each year from 2004 to 2021, rather than for the whole period, because of the periodicity of HFMD infection.
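The per-year correlation can be sketched in Python as follows. This is an illustration only, not the SPSS procedure used in the study; the helper names `pearson_r` and `yearly_correlations` and the sample records are hypothetical.

```python
import math
from collections import defaultdict


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def yearly_correlations(records):
    """Group weekly (year, cases, rsv) records by year and correlate each year."""
    by_year = defaultdict(lambda: ([], []))
    for year, cases, rsv in records:
        by_year[year][0].append(cases)
        by_year[year][1].append(rsv)
    return {year: pearson_r(c, r) for year, (c, r) in sorted(by_year.items())}


# Hypothetical weekly records: (year, reported cases, RSV). Not study data.
weekly_records = [
    (2018, 10, 5), (2018, 20, 10), (2018, 30, 15),
    (2019, 1, 3), (2019, 2, 2), (2019, 3, 1),
]
for year, r in yearly_correlations(weekly_records).items():
    print(year, round(r, 3))
```

Computing one coefficient per year keeps a strongly seasonal series from being summarized by a single pooled value.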
Second, we conducted a cross-correlation analysis between the actual HFMD cases and the RSV of related search terms. Cross-correlation measures the similarity between two series as a function of the displacement of one relative to the other, and we used it to objectively estimate the time lag between cases of HFMD infection and related search terms (Bourke, 1996). We set the maximum lag to ±20 weeks because of the periodicity of HFMD infection, which yielded 41 cross-correlation coefficients (lags of −20 to +20 weeks, including lag 0) for each HFMD-related search term before and during the COVID-19 pandemic. Next, for each term, we selected the coefficient with the greatest absolute value and reported its signed value and corresponding lag. Finally, we compared these coefficients between the periods before and during the COVID-19 pandemic. We assumed that a negative, zero, or positive lag at the coefficient with the greatest absolute value indicated that searches for the term occurred earlier than, coincided with, or occurred later than the actual HFMD cases, respectively.
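A minimal Python sketch of this lag search is given below. It uses the standard sample cross-correlation estimator and assumes the sign convention that, at lag k, the search value in week t + k is paired with the cases in week t, so a negative lag means the searches came first. The function names and the toy series are hypothetical and are not the SPSS routine used in the study.

```python
import math


def cross_correlation(cases, search, max_lag=20):
    """Return {lag: coefficient} for every lag from -max_lag to +max_lag inclusive.

    Assumed convention: at lag k, search[t + k] is paired with cases[t],
    so a negative lag means the search series leads the case series.
    """
    n = len(cases)
    if n != len(search):
        raise ValueError("series must have the same length")
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean_c = sum(cases) / n
    mean_s = sum(search) / n
    dev_c = [c - mean_c for c in cases]
    dev_s = [s - mean_s for s in search]
    scale = math.sqrt(sum(d * d for d in dev_c) * sum(d * d for d in dev_s))
    if scale == 0:
        raise ValueError("cross-correlation is undefined for a constant series")
    ccf = {}
    for lag in range(-max_lag, max_lag + 1):
        total = sum(dev_c[t] * dev_s[t + lag] for t in range(n) if 0 <= t + lag < n)
        ccf[lag] = total / scale
    return ccf


def peak_lag(ccf):
    """Return the (lag, coefficient) pair with the greatest absolute coefficient."""
    return max(ccf.items(), key=lambda item: abs(item[1]))


# Toy series (not study data): the search bump occurs two weeks before the case bump.
cases = [0] * 14 + [1, 3, 1] + [0] * 13
search = cases[2:] + [0, 0]
lag, coefficient = peak_lag(cross_correlation(cases, search, max_lag=20))
print(lag, round(coefficient, 3))
```

In this toy example the strongest coefficient falls at a negative lag, which under the stated convention corresponds to a search term that presents earlier than the reported cases.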
Third, we conducted multiple linear regression to identify the most important Internet search terms for predicting HFMD infection before and during the COVID-19 pandemic. We included the HFMD-related search terms as independent variables and the actual HFMD cases as the dependent variable. Several common methods are available for selecting independent variables, including forward selection, backward elimination, and stepwise regression (Tranmer et al., 2020). Stepwise regression alternates between forward and backward steps, adding and removing variables that meet the criteria for entry or removal until a stable set of variables is attained. We used the stepwise procedure to select the most significant search terms and to reduce collinearity among them. Collinearity is correlation between independent variables in a regression model. When the independent variables in the same model are correlated, their individual contributions to predicting the dependent variable cannot be separated reliably. The variance inflation factor measures the degree of collinearity, and a value below ten is typically considered acceptable in the medical field. We excluded terms with a variance inflation factor of > 10. A two-tailed P-value of <.05 was considered statistically significant.
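The collinearity check can be illustrated with a short Python sketch of the variance inflation factor, computed as 1 / (1 − R²) where R² comes from regressing one predictor on all the others. This is not the SPSS stepwise procedure itself; the helper names and the RSV values below are hypothetical, and the term labels are used only as examples.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (perfect collinearity)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(y, columns):
    """R^2 of an ordinary least squares fit of y on the columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(predictors):
    """Map each predictor name to VIF = 1 / (1 - R^2 against the other predictors)."""
    vifs = {}
    for name, column in predictors.items():
        others = [col for other, col in predictors.items() if other != name]
        r2 = r_squared(column, others)
        vifs[name] = float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return vifs


# Hypothetical weekly RSV values (not study data); "kindergarten" nearly doubles "daycare".
rsv = {
    "daycare": [1, 2, 3, 4, 5, 6],
    "kindergarten": [2.1, 3.9, 6.2, 7.8, 10.1, 11.9],
    "pain": [1, 3, 2, 2, 3, 1],
}
for term, vif in variance_inflation_factors(rsv).items():
    print(term, round(vif, 1), "exclude" if vif > 10 else "keep")
```

In this toy data the two nearly proportional terms receive very large factors, while the unrelated term stays close to 1, which is the pattern the > 10 exclusion threshold is meant to catch.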
To assess the performance of a linear regression model, the coefficient of determination (R²) or the adjusted R², which indicates how much of the variation in the response is explained by the model, is often used (Akossou et al., 2013). The higher the R², the better the model fits the data. Because R² never decreases when more independent variables are added, we used the adjusted R² for model evaluation to avoid overfitting.
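For reference, the usual adjustment can be written as a small Python function. The formula is the standard one, 1 − (1 − R²)(n − 1)/(n − p − 1); the function name and the example inputs are hypothetical and do not reproduce the study's models.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 for a model with an intercept and n_predictors predictors."""
    residual_df = n_obs - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / residual_df


# Hypothetical values: the same raw R^2 is penalized more with more predictors.
print(round(adjusted_r2(0.90, n_obs=101, n_predictors=5), 4))
print(round(adjusted_r2(0.90, n_obs=101, n_predictors=10), 4))
```

The penalty grows with the number of predictors relative to the number of observations, so a model that adds uninformative search terms does not look better on this measure.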
Results
Basic description of the actual HFMD cases and search topic “HFMD”
In Figure 1, we present the actual HFMD cases and RSV from 2004 to 2021. Visual inspection of the figure indicated that both the actual HFMD cases and the RSV from Google Trends peaked around July in most years, except for 2020 and 2021. The number of HFMD infections surged after 2011, peaking every two years before 2020. The RSV coincided with this trend. In 2020, no periodic peak of infection was observed, whereas, in 2021, the peak was delayed to November.
As shown in Figure 2, we calculated the correlation between the actual HFMD cases and RSV each year from 2004 to 2021. The mean (standard error) of these correlations was 0.820 (0.052). Most of the correlation values were greater than 0.7, except for 0.562, 0.238, and 0.338 in 2004, 2008, and 2020, respectively.
Temporal correlation between the actual HFMD cases and Internet search terms before and during the pandemic
We performed a cross-correlation analysis to determine the temporal relationship between the actual HFMD cases and HFMD-related search terms. Cross-correlation results before and during the pandemic are presented in Table 1. For each term, the cross-correlation coefficient with the greatest absolute value was selected, and its signed value and corresponding lag are reported.
In period 1, search terms including “infect,” “coxsackie,” “nursery,” “mouth,” “child (mixed),” “cerebellar ataxia,” “acute flaccid paralysis,” “infection control,” “palm,” “hand-foot,” “excrement,” “eczema,” “pain,” “group infection,” “instep,” “fecal-oral transmission,” and “rash” presented one or more weeks earlier than actual HFMD cases. Search terms including “herpangina,” “summer infection,” “HFMD,” and “blisters” coincided with the actual HFMD cases. Search terms including “virus,” “entero,” “enterovirus,” “child (kana),” “infants,” “daycare,” “vomit,” “child (kanji),” “pediatrics,” “kindergarten,” “myocarditis,” “handwashing,” “contact infection,” “high fever,” “meningitis,” “droplet infection,” “headache,” “plantar,” “encephalitis,” “neurogenic pulmonary edema,” “dermatology,” and “fever” presented one or more weeks later than actual HFMD cases.
In period 2, search terms including “infect,” “coxsackie,” “nursery,” “child (mixed),” “acute flaccid paralysis,” “palm,” “excrement,” “eczema,” “pain,” “instep,” “fecal-oral transmission,” “rash,” “herpangina,” “blister,” “virus,” “entero,” “enterovirus,” “child (kana),” “daycare,” “child (kanji),” “pediatrics,” “myocarditis,” “handwashing,” “high fever,” “meningitis,” “headache,” “plantar,” “encephalitis,” “neurogenic pulmonary edema,” “dermatology,” and “fever” presented one or more weeks earlier than actual HFMD cases. Search terms including “mouth,” “hand-foot,” and “HFMD” coincided with the actual HFMD cases. The search terms “cerebellar ataxia,” “infection control,” “group infection,” “summer infection,” “infants,” “vomit,” “kindergarten,” “contact infection,” and “droplet infection” presented one or more weeks later than the actual HFMD cases.
Compared with period 1, the temporal correlation between the actual HFMD cases and Internet search terms changed in period 2. The search terms “herpangina,” “summer infection,” “HFMD,” and “blisters” coincided with actual HFMD cases in period 1. However, in period 2, the search terms “HFMD,” “mouth,” and “hand-foot” coincided, while “herpangina” and “blister” presented earlier and “summer infection” presented later. In period 1, 17 (37.8%) search terms presented earlier than the HFMD cases, and 22 (48.9%) search terms presented later. In period 2, 31 (68.9%) search terms presented earlier than the HFMD cases, and only nine (20%) search terms presented later. Furthermore, we compared the coefficients with the greatest absolute value in the periods before and during the COVID-19 pandemic. Twenty-six (67.4%) search terms were searched for earlier in period 2 than in period 1, which might indicate higher public awareness of HFMD infections as well as COVID-19.
Predictors for actual HFMD cases before and during the pandemic
We identified the Internet search terms that were the most significant predictors of HFMD infection using multiple linear regression with a stepwise procedure. As shown in Table 2, the search terms “infect,” “daycare,” “vomit,” “HFMD,” “eczema,” “pain,” and “high fever” were significant predictors in period 1 and accounted for 96.1% (adjusted R²) of the variation in HFMD infection. The search terms “infect,” “enterovirus,” “herpangina,” “kindergarten,” “myocarditis,” “HFMD,” “contact infection,” “blister,” “high fever,” “dermatology,” and “plantar” were significant predictors in period 2 and accounted for 90.6% (adjusted R²) of the variation in HFMD infection (Table 3). More Internet search terms were identified as predictors of HFMD infection in period 2, especially “enterovirus” and “contact infection,” which might indicate that public attention was affected by the COVID-19 pandemic.
Discussion
This study has presented the trends in actual HFMD cases and RSV of the search topic “HFMD” from 2004 to 2021. We found that the actual HFMD cases and RSV peaked around July in most years, except in 2020 and 2021, and surged after 2011 with peaks every two years before 2020. The search topic “HFMD” presented a strong correlation with actual HFMD cases, but the correlation was weak in 2004, 2008, and 2020. We conducted a cross-correlation analysis between the actual HFMD cases and related search terms before and during the pandemic. This indicated that the public might have improved awareness of HFMD infection during the pandemic. We used multiple linear regression to identify the Internet search terms that were the most significant predictors of actual HFMD cases before and during the pandemic. We identified the search terms “infect,” “daycare,” “vomit,” “HFMD,” “eczema,” “pain,” and “high fever” as the most significant predictors before the pandemic, and the search terms “infect,” “enterovirus,” “herpangina,” “kindergarten,” “myocarditis,” “HFMD,” “contact infection,” “blister,” “high fever,” “dermatology,” and “plantar” as the most significant predictors during the pandemic. Our research indicated that the Japanese public was more aware of HFMD infection during the COVID-19 pandemic. To our knowledge, this study is the first to predict HFMD cases using related Internet search terms and identify different predictors before and during the COVID-19 pandemic. This study also shows that Internet search data could supplement public health surveillance and help authorities respond to potential outbreaks rapidly.
From 2004 to 2021, the RSV of “HFMD” coincided with the actual HFMD cases in most years, except for 2004, 2008, and 2020. The data showed similar trends, with peaks occurring around July. However, in 2021, the peak was delayed until November. In 2004, 2008, and 2020, Google Trends search data did not match the actual HFMD cases. The search topic “HFMD” still presented a peak around July 2004, but the number of cases was low, which could result from the lower penetration of the Google search engine at the time (ComScore on top search engines for December 2004, 2005). In 2008, the search topic “HFMD” peaked during May 2-10 and July 20-26. The peak during July 20-26 was consistent with previous years in Japan, whereas the large outbreak in China drove the peak during May 2-10, which could be why the search topic “HFMD” presented a weak correlation with the actual HFMD cases. The latest epidemic occurred in China in 2008. More than 600,000 HFMD cases and 126 deaths in children were reported from March 2008 to June 2009 (Zhang et al., 2010), and Japan reported that this infection came from China (National Institute of Public Health, 2008). In 2020, the Japanese government implemented several measures to control COVID-19. Subsequently, the spread of HFMD did not occur as in previous years, and the search topic “HFMD” presented a relatively small peak in July. This may explain why the search topic “HFMD” did not correlate strongly with the actual HFMD cases in the first year of the pandemic. Despite a small peak in the search topic “HFMD,” the RSV was much lower than in previous years. Many researchers have shown that Internet search data from Google Trends represent public interest in a specific topic. However, Internet search data should be used cautiously as a surveillance system because large events can easily interfere with them.
Obtaining fine-grained data could help develop surveillance systems that can effectively exclude biased or irrelevant information to predict infection outbreaks.
Our results indicate that the global pandemic might have enhanced public awareness of HFMD in addition to COVID-19. First, the cross-correlation results showed that, during the pandemic, 31 search terms presented earlier than the actual HFMD cases and only nine presented later. Of the HFMD-related search terms, 67.4% showed an earlier lag than before the pandemic. Second, the significant predictors of HFMD infection during the pandemic covered more comprehensive information. Before the pandemic, the predictive search terms mainly concerned vulnerable settings and symptoms of HFMD infection. The additional predictors identified during the pandemic covered the disease-causing virus, infection pathways, susceptible body parts, complications, and the clinical departments to visit. Influenced by the COVID-19 pandemic, the public might have sought more comprehensive information on HFMD through Internet searches. According to previous studies, the prevalence of respiratory infectious diseases, such as influenza, varicella, herpes zoster, rubella, and measles, was reduced during the COVID-19 pandemic (Sakamoto et al., 2020; Wu et al., 2020; Wu et al., 2020; Kies et al., 2021; Veiga et al., 2021; Li et al., 2021; Wan et al., 2021). This might have been due to adherence to non-pharmaceutical interventions and lower non-polio enterovirus activity during the COVID-19 pandemic compared with 2014-2019 (Kuo et al., 2021). Switzerland had an unprecedented complete absence of pediatric enteroviral meningitis in 2020 (Stoffel et al., 2021). In Japan, admissions for community-acquired pneumonia (Yan et al., 2022) and influenza (Hirose et al., 2021) were reduced during the COVID-19 pandemic. COVID-19 preventive actions and better personal hygiene are beneficial for preventing the spread of disease. In the upcoming season, the prevalence of common diseases may rise as the public gradually complies less with infection control measures (Yan et al., 2022).
Our results show that there was a peak in HFMD infection and public interest in November 2021. Continuous monitoring of HFMD is required, and public information-seeking behavior may be useful for public health surveillance.
Limitations
This study had several limitations. First, our findings are limited to those who used Google to search for health-related information, which may not represent the entire community. Second, the specific HFMD-related search terms we selected did not represent all search terms in public use. Third, search interest analysis is complementary to infection prediction and cannot replace traditional research methods because of its hypersensitivity to large events. Fourth, we used search data from 2015 to 2019 to represent the period before the COVID-19 pandemic due to restrictions of Google Trends, which may not represent the entire period. Finally, our study included only Japan; therefore, our findings might not apply to other countries.
Conclusion
We identified different predictors for HFMD infections before and during the COVID-19 pandemic. Our results indicate that the public may have had enhanced awareness of HFMD infection and paid more attention to viruses during the COVID-19 pandemic. It is critical to continuously monitor resurgent common infections as the public gradually reduces compliance with infection control measures. Public information-seeking behavior using Internet search query data may be useful for public health surveillance.
Data Availability
All data produced in the present study are available upon reasonable request to the authors.
Declarations
Funding
This work was supported by the Japan Science and Technology for pioneering research initiated by the next generation (SPRING; grant number JPMJSP2110).
Ethics Statement
No ethical approval was required.
Availability of Data and Materials
We used publicly available data published by Google Trends and the National Institute of Infectious Diseases in Japan. All data generated or analyzed during this study are included in this published article (and its supplementary information files).
Competing Interests
The authors declare that they have no competing interests.
Authors’ Contributions
QN, JYL, MNT, and TA contributed to the study conception and design. QN, AB, KH, MO, and AK collected data. QN and JYL participated in data analysis and interpretation and drafted the manuscript. All authors contributed to the manuscript revision and approved the final version of the manuscript.
Acknowledgments
Not applicable.
Abbreviations
- HFMD: hand, foot, and mouth disease
- EV-A71: enterovirus A71
- CV-A6: coxsackievirus A6
- CV-A16: coxsackievirus A16
- NIID: National Institute of Infectious Diseases
- LDE: logistic differential equation
- L-DNM: landscape dynamic network marker
- COPD: chronic obstructive pulmonary disease
- RSV: relative search volume